Warning: Please do not post any new comments on this page.
This is a discussion archive; the comments it contains were likely posted both before and after the date on which it was created.

Proofreading, OCR cleanup script...

Would it be possible for the relevant portion of this script to recognise ¬ as a line-continuation marker and collapse the text in which it appears appropriately? ShakespeareFan00 (talk) 11:46, 1 July 2019 (UTC)

Similar to what was done here https://en.wikisource.org/w/index.php?title=Page%3AThe_Heimskringla%3B_or%2C_Chronicle_of_the_Kings_of_Norway_Vol_1.djvu%2F195&type=revision&diff=9409445&oldid=9409444 ?

ShakespeareFan00 (talk) 11:50, 1 July 2019 (UTC)

I agree, I've had texts where that would be very useful. Maybe that character is frequently substituted for a hyphen by some common OCR algorithm? -Pete (talk) 02:04, 2 July 2019 (UTC)
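A minimal sketch of the kind of cleanup being requested, assuming that "¬" appears only at the end of a line and marks a word split across the line break. The function name is hypothetical and this is not the existing cleanup script, only an illustration of the rule.

```python
import re

# Assumption: "¬" at the end of a line marks a word continued on the next line.
_CONTINUATION = re.compile(r"¬[ \t]*\n[ \t]*")


def join_continued_lines(text: str) -> str:
    """Drop each trailing "¬" and the line break after it, rejoining the split word."""
    return _CONTINUATION.sub("", text)


sample = "the king's mes¬\nsengers rode north"
print(join_continued_lines(sample))
```

A "¬" that is not followed by a line break is left untouched, so stray occurrences inside a line would still need a manual check.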


Pages at rear of work need re-aligning as the source file was updated. ShakespeareFan00 (talk) 19:04, 2 July 2019 (UTC)

If you ask for help, please have the courtesy to give people time to fix things before jumping back on it. It is very confusing otherwise. Or if you are in a hurry, just fix it yourself.— Mpaa (talk) 23:21, 2 July 2019 (UTC)
I had waited... The bot hadn't apparently moved the last few pages, so I did them manually. Next time I should perhaps have left it longer? ShakespeareFan00 (talk) 00:07, 3 July 2019 (UTC)

Biographical Enquiries (Railway Junction Diagrams on Commons).

A long shot, but I thought I'd ask here, as the Wikisource community is typically good at finding really, really obscure biographical sources and details.

A while back someone uploaded a set of Railway Junction Diagrams to Commons (I'm not editing on Commons until 2021 currently.), but in doing some research at Archive.org on something else, additional information arose due to https://archive.org/details/1906RailwayClearingHouseMapEnglandAndWales/page/n5

That base map seems to have been drawn by a John Airey (?-?), who in some other sources is also credited with the Junction diagrams, with the engraving by J & W Emslie.

A lot of use of Google later and I found:

  • Emslie, John Philip (1839-1913 )
  • Emslie, W.R (?-?) (Brother to above?)

here - https://books.google.co.uk/books?id=05C02RhJZCkC&dq=W.R.+Emslie+engravers&source=gbs_navlinks_s being in "Benezit Dictionary of British Graphic Artists and Illustrators.", 2001, Oxford University Press.

I've not found any further details on W.R. Emslie, although context suggests by 1914, he and J.P had a company or partnership that did engravings.

I've also not yet found any dates for Airey in online sources.

If the 1914 diagrams (or any prior to 1923) can be more conclusively linked with Airey, and shown to be Public domain it would be very useful information for someone to add at Commons, (someone who might also be interested in uploading the map concerned). I seem to also recall that a later issue of the map (in 1948) (still in copyright, sadly) had an index of stations. ShakespeareFan00 (talk) 16:38, 1 July 2019 (UTC)

What is your question? What is it that you are wanting? — billinghurst sDrewth 23:27, 2 July 2019 (UTC)
If it is who is John Airey, looks to have been born in Stainton or Kendal, Westmorland c.1836, and looks to have died 1 Dec. 1924; buried 2 Dec 1924 (somewhat confirmed). Appears in census as clerk, or railway clerk, 1911 census says pensioner, clerk from RCH. Directories for London indicate that he appears to also have been running a map selling business from Euston station. — billinghurst sDrewth 00:00, 3 July 2019 (UTC)
Thanks, that's exactly the information I needed. ShakespeareFan00 (talk) 08:55, 3 July 2019 (UTC)

Classed tables...

I recently created {{table class}} and {{table class/import}} and an associated stylesheet....

However, in order to avoid name conflicts, I'd like to know if there are formatting classes that are defined elsewhere, ( Most likely in Mediawiki namespace), so that I can make a note of those names and NOT use them when implementing future table-class styles..

The reason for needing a separate {{table class/import}} is that it's not currently possible to have template styles directly within a table, owing to MediaWiki limitations.

I used the two templates here -

ShakespeareFan00 (talk) 00:43, 4 July 2019 (UTC)

Why not just use class= in the table itself? —Beleg Tâl (talk) 03:02, 4 July 2019 (UTC)
The {{table class/import}} template is needed so the tweaked styles are "visible" to subsequent class calls (And it could be genericised further), but I am prepared to consider that {{tc}} may not be needed after all, other than as a 'placeholder' for the Stylesheet that sits below it.
The intent was that additional short codes can be implemented as CSS classes in the stylesheet, rather than needing to add them as part of the massive switch statement in {{ts}}. Although {{ts}} could in turn be converted to use classes (the separation of the short codes from the template logic would be a good thing).

My other reason for having a {{tc}} is as a shorthand, and so that {{table class/doc}} can eventually document the new styles implemented. If there's a better solution, I am open-minded. ShakespeareFan00 (talk) 08:51, 4 July 2019 (UTC)

In my opinion, if we are creating a central library of class styles, we should put them in a normal place like MediaWiki:Gadget-Site.css or something, and document them there. If we are creating work-specific class styles, we should put them in a work-specific template for each work. I don't think a template is the right location for a central library. Maybe you can set up {{tc/i}} similar to {{authority/link}} as a base structure upon which work-specific templates can be built? —Beleg Tâl (talk) 12:31, 4 July 2019 (UTC)
Speaking of which, I think MediaWiki:Gadget-Site.css is the answer to your original question, namely where are most of the prefab formatting classes defined. —Beleg Tâl (talk) 12:32, 4 July 2019 (UTC)
And others..
In respect of generalised classes:
  • The more that can be put in a core stylesheet the better, PROVIDED it can still be cleanly updated. One of the issues with using numerous {{ts}} calls is that when something changes, a lot of calls have to be updated, or the massive switch statement in {{table style/parse}} modified when a new code is desired. Using TC instead means the call and the stylesheet change, but the template logic doesn't need to.
  • What about documentation ?

ShakespeareFan00 (talk) 13:40, 4 July 2019 (UTC)

(Aside1) - in MediaWiki:Gadget-enws-tweaks.css ClearFix seems to be defined twice? Unless I am not seeing a minor difference? ShakespeareFan00 (talk) 13:40, 4 July 2019 (UTC)
(Aside2) - Is there a tracking category for which templates are using TemplateStyles?
(Aside3) - I can adjust {{tc/i}}, I can genericise it, and add a tracking category? Would there be a better name to make it fully generic, so it can be used on ANY element as opposed to just tables?

ShakespeareFan00 (talk) 13:40, 4 July 2019 (UTC)

If the classes are defined centrally in MediaWiki namespace, no importer template is needed. If the classes are defined per work, it occurs to me that the TemplateStyles tag itself is fully generic and can be used on any element. In either case, neither {{table class}} nor {{table class/import}} is actually needed. As for documentation, maybe Help:Table is a good place to start? —Beleg Tâl (talk) 15:00, 4 July 2019 (UTC)
If you want to move the CSS classes I defined into an appropriate MediaWiki location feel free, provided the documentation is also relocated. :) ShakespeareFan00 (talk) 16:04, 4 July 2019 (UTC)
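One way to approach the original name-conflict question is to compare the class names used in two stylesheets. The sketch below is an illustration only: the function names and the sample CSS (including the `__tc_border` class) are hypothetical, and the parsing is deliberately naive, so it does not handle nested at-rules such as `@media` blocks.

```python
import re


def class_names(css):
    """Collect the class names that appear in the selectors of a stylesheet."""
    css = re.sub(r"/\*.*?\*/", " ", css, flags=re.S)  # drop comments
    selectors = re.sub(r"\{[^}]*\}", " ", css)         # drop declaration blocks
    return set(re.findall(r"\.(-?[A-Za-z_][\w-]*)", selectors))


def conflicting_classes(site_css, template_css):
    """Return the class names defined in both stylesheets."""
    return class_names(site_css) & class_names(template_css)


# Hypothetical sample input, not copied from any real stylesheet.
site_css = ".clearfix { clear: both; } table.wikitable td { padding: 0.5em; }"
template_css = ".__tc_border { border: 1px solid; } .wikitable { margin: 0; }"
print(conflicting_classes(site_css, template_css))
```

In practice the first argument would be the text of a site-wide stylesheet such as MediaWiki:Gadget-Site.css and the second the TemplateStyles sheet being drafted.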

Sidenotes... The cause of the problem?

In page namespace I sometimes have overlapping sidenotes, a known long term issue.

This seems to be due to the way the sidenotes are implemented here https://en.wikisource.org/wiki/MediaWiki:Gadget-Site.css

The culprit is the "position: absolute", together with the sidenotes being 'span' based. An attempt was made to provide a badly conceived fix here, Template:Right sidenote/sandbox.css, but it was never fully implemented as I still don't think it would work properly.

Can someone experienced review this and come up with a solution?

ShakespeareFan00 (talk) 16:13, 4 July 2019 (UTC)

The 'fix' only partially works here - https://en.wikisource.org/wiki/Page:Ruffhead_-_The_Statutes_at_Large_-_vol_9.djvu/105, ignore the issue with the Drop initial, as that's an insoluble problem at the moment. ShakespeareFan00 (talk) 17:18, 4 July 2019 (UTC)
I got rid of the drop initial; there is no drop initial in the scan. —Beleg Tâl (talk) 20:12, 4 July 2019 (UTC)
Another side issue seems to be that it isn't possible to get a consistent view of how something MIGHT be rendered, as the way it's currently implemented means Page:'s, Mainspace pages, and User space tests will all give different renderings due to CSS class interactions, as will previewing content when what SHOULD nominally be the same rendering isn't because of various class interactions. It would be nice (and I have said this REPEATEDLY) to have consistency. It seems this is too much to expect from a volunteer project, given the time people have available, and it's perhaps time to ask the WMF to hire some coders to find a long-term solution to issues like this. ShakespeareFan00 (talk) 18:52, 4 July 2019 (UTC)
There are downsides to all approaches.
  • Positioning the sidenote absolutely allows the note to align with the margin. However, absolutely placed elements do not interact with each other at all and therefore can overlap.
  • Positioning the sidenote relatively requires the creation of a "pretend" margin to place the sidenote in. This pretend margin is actually a block of text that is positioned where we would expect the margin to be. However, if the actual margins are too large or too small, the sidenote will not align properly (and might overflow the margin). Also it will interact with all other relatively positioned elements (such as dropinitials) which can require a good handle on CSS to be able to resolve.
  • Experimenting with rendering is only half the battle. Even if it is the same in all namespaces, what about other layouts? other themes? how about mobile? how about epubs and pdfs? We ideally want to be able to say "this is a sidenote" and then let the system place it in a location that is appropriate for the situation.
To sum up: robust sidenotes that work in all situations are a bit more complicated than CSS can really handle, and a bit of overlap here and there is probably the best compromise achievable without having a professional design something (maybe a new mediawiki extension?) —Beleg Tâl (talk) 20:12, 4 July 2019 (UTC)
Page:The_Laws_of_the_Stannaries_of_Cornwall.djvu/34 was something that would work consistently...but it's overkill..

ShakespeareFan00 (talk) 22:32, 4 July 2019 (UTC)

Wikisource:News (en): July 2019 Edition

Well, the source given in the textinfo doesn't actually match the text given in terms of the first few pages, so where did it come from?

A properly sourced scan would be better. ShakespeareFan00 (talk) 11:12, 5 July 2019 (UTC)

If you read further on the talk page, or the metadata in the text itself, you can see that it was proofread against a physical copy of the 1952 edition. Some of the content (the foreword and backplate for example) are probably copyvio. I would support replacing it with the 1916 edition that they linked to. —Beleg Tâl (talk) 12:07, 5 July 2019 (UTC)
Which metadata? The one at Archive.org? because I am only seeing the IP's comments on the talk page here. Listed at WS:CV until someone can give an unambiguous answer. ShakespeareFan00 (talk) 17:28, 5 July 2019 (UTC)
I assume BT is talking about the frontmatter in the text itself, which says "COPYRIGHT MCMLII BY FLEMING H. REVELL COMPANY", i.e. it's © 1952 (this frontmatter including the foreword is not in the IA scan, which is from 1916). What the IP seems to have done is copied the OCR text from IA of the 1916 edition, and then "proofread" it from a copy they have of the 1952 edition. —Nizolan (talk) 18:01, 5 July 2019 (UTC)
Yes, that is the metadata that I was referring to —Beleg Tâl (talk) 22:12, 5 July 2019 (UTC)
And there are better scans than the Google ones at Internet Archive. ShakespeareFan00 (talk) 17:38, 5 July 2019 (UTC)
I just had a look on the Stanford database and the copyright was renewed so this does appear to be a copyvio. —Nizolan (talk) 15:14, 5 July 2019 (UTC)

Which Benjamin Bell?

Index:A Treatise on the Diseases of the Bones.djvu

This lists a Benjamin Bell as the author, but the dates for the grandfather or grandson of that name (both notable in the surgical field) don't fit in with the nominal publication date given. Is there a third Benjamin I've not found information on, or is this a new edition of an earlier published work (possibly with updates)? ShakespeareFan00 (talk) 08:13, 5 July 2019 (UTC)

It is dedicated to w:William Adam of Blair Adam styled Lord Chief Commissioner of the Jury Court, which post he held 1815–1839. The foreword is dated 1st October 1828. Which means it is unlikely to be the grandfather or a reprinting of one of his older works.
The grandson lived 1810–1883 according to the ODNB, was also a surgeon and wrote a biography of his grandfather. HathiTrust has the author as "fl. 1823-1828" on no obvious authority. If 1810 is correct, the grandson would have published this when 18, which seems improbable on its own. The foreword also refers to various professors etc. as peers, or at least without the deference one would expect from an 18-year-old, no matter who his grandfather was. The author also claims in the foreword that they have been studying "osseus tissue" for several years. I therefore think we can effectively exclude the grandson too.
It seems unlikely that two brothers would both be named "Benjamin Bell", so it's unlikely to be an elder brother of the grandson. But given the subject matter and so forth, it does seem fairly likely to be a descendant of the grandfather. The most likely identity then is a son of the grandfather, and father or uncle to the grandson. My call would be to label this as a new author who flourished ca. 1828, and is almost certainly a son of the grandfather, and most likely the father or uncle of the grandson.
Note that there is no relation to Charles Bell (of Bell's Palsy fame). These are separate families from the Edinburgh region who both produced famous surgeons and medical writers, and who clashed over appointments and the internal politics of the Royal Society of Edinburgh and the Edinburgh College of Surgeons. But Charles Bell had a practice in London, and the name Benjamin is inherited through several generations of the other family (the grandfather, the grandson, the great-grandson, and there's even a great-great-grandson called Benjamin Bell). --Xover (talk) 11:13, 5 July 2019 (UTC)

The author is credited as a "Fellow of the Royal College of Surgeons of Edinburgh, and London", so certainly not the 18-year-old. Three people called "Benjamin Bell" are listed at [5], but there is some ambiguity in the writing [my comments in square brackets; let's call the original Benjamin I].

His [Benjamin I] descendants were also to achieve distinction in surgery. His elder son George Bell (1777 - 1832) and his [whose?] son Benjamin Bell were both Fellows of the College and surgeons in Edinburgh. His [whose?] younger son Joseph was also a Fellow of the College and his [whose] son Benjamin Bell was President of the College form [sic] 1863 - 1865. His son, Joseph Bell, great-grandson of that first Benjamin Bell, also became President of the College and famously the model for the character of Sherlock Holmes.

This source lists Benjamin I as having three sons: George (1777-1832), Robert (1782-1861), William (1783-1849), all FRSE, but no Joseph and no Benjamin.

So, possibly:

  • Benjamin (1749–1806)
    • George (1777 - 1832)
      • Benjamin II
        • Joseph
        • Benjamin III
          • Joseph II (1837-1911)

But this would make Joseph II the great-great grandson. It seems the answer may lie in this paywalled paper [6]. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:37, 6 July 2019 (UTC)

Removing non-content pages from a scan?

A user recently requested assistance with a work that ended up necessitating recreating the DjVu file from the original .jp2 files. The scan was a Wellcome-sponsored IA scan, so it was high quality and made with much care and attention to detail. In particular, the work has many plates, and each plate is protected by a rice-paper cover page, which the scanners lovingly captured: 17, 18.

Such pages are not part of the page number sequence, but make it hard to create the pagelist; and they contain no text to proofread or illustration to reproduce.

I ended up including them in the generated DjVu to be conservative (and to reduce complexity and finish faster, since someone else was waiting for the result), but the more I think about it the less I like it. So… do we have any existing guidance on this? What are others' practice when this comes up (for those that do create DjVus from individual images)? Or simply have an opinion?

I think, absent guidance to the contrary, next time I would opt to remove such pages. --Xover (talk) 07:42, 7 July 2019 (UTC)

yeah, i would be less concerned about gray "without text" pages, which can be skipped in transclusion, than the works missing images. i agree, it is a drag, creating the index though. and jp2 is a problem; we need some support. images are a key differentiator in our system, and the fact most scans do it clumsily means we will have some cleanup to do. Slowking4Rama's revenge 20:27, 7 July 2019 (UTC)


Proofread bar

Is there any template that can produce a colored bar based on the proofreading data of indices?! I want to use it inline. --Yousef (talk) 13:30, 9 July 2019 (UTC)

Not automatically as far as I know but there is {{PageStatus}}, used e.g. at Wikisource:WikiProject Chinese, which however needs to be updated manually. —Nizolan (talk) 14:22, 9 July 2019 (UTC)
yeah, for DNB it was done manually in a table Wikisource:WikiProject_DNB/Progress, but for 10th anniversary, there were charts generated from tables, Wikisource:Tenth Anniversary Contest/Summary. - Slowking4Rama's revenge 00:26, 10 July 2019 (UTC)

Create a 'small-poem' class

A number of works on Wikisource utilise a nested combination of {{block center}} and {{smaller block}} wrapped around a POEM tag.

(to give one example) Page:A_biographical_dictionary_of_eminent_Scotsmen,_vol_8.djvu/249

My proposal is that instead of making templates calls, the relevant (static) CSS code should be made a "__smallpoem" class, which could be imported via TemplateStyles or globally.

There are also many instances of nested /s /e pairs of the aforementioned templates. The second part of this proposal is a {{smallpoem/s}} consisting of a single opening DIV tag classed using the aforementioned "__smallpoem" style class. {{smallpoem/e}} would be a closing </DIV> tag.

That means 1 template call instead of 2, and in some instances just a POEM tag needed. More efficient? Thoughts? ShakespeareFan00 (talk) 10:14, 12 July 2019 (UTC)

My initial thought is that this proposal is merely an example of this, so I am hesitant to support it. Current solutions work well, and the template call maximum rarely gets hit by centered small poems. Does this really benefit anyone in any meaningful way? —Beleg Tâl (talk) 15:22, 12 July 2019 (UTC)
Also I really dislike the POEM extension (partially because it replaces the </p><p> that ought to exist between stanzas with a <br /><br />) so that might factor into my feelings on the matter. —Beleg Tâl (talk) 15:24, 12 July 2019 (UTC)
  Oppose Not supported. There is no practical definition of what a small poem is. We can define the extremes (one line vs an epic spread across multiple volumes), but we cannot point to a particular length and say "there is the boundary point." Beeswaxcandle (talk) 20:00, 12 July 2019 (UTC)
Small as in terms of {{smaller}} formatting, not in terms of length, but your oppose is noted. ShakespeareFan00 (talk) 20:33, 12 July 2019 (UTC)

Linterror hunting...

Does anyone here have a way of producing a report that lists an Index and all the Page:s in it that have a specified Linter concern? ShakespeareFan00 (talk)

I've been doing A LOT of cleanup manually, and would like to focus efforts on improving entire works per session rather than random improvements on a page per page basis?

Thanks in advance...

Oh and I am about to have a 'burnout' so may need to take some time out soon...

ShakespeareFan00 (talk) 20:51, 12 July 2019 (UTC)

They already answered your question: Wikisource:Scriptorium/Help#Enough_is_enough:_WHERE_is_the_mistake?_i.e_where_the's_flipped_formatting?.— Mpaa (talk) 22:16, 12 July 2019 (UTC)
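As a rough illustration of the per-work report asked about above, the sketch below groups flagged Page: titles under their Index. It assumes the list of flagged titles has already been gathered by some other means (for example, copied from Special:LintErrors); the function name and the sample titles are hypothetical.

```python
from collections import defaultdict


def group_pages_by_index(page_titles):
    """Map each Index: title to the sorted page numbers flagged within it."""
    report = defaultdict(list)
    for title in page_titles:
        if not title.startswith("Page:"):
            continue  # only Page: namespace titles belong to an Index
        name, sep, number = title[len("Page:"):].rpartition("/")
        if not sep or not number.isdigit():
            continue  # not of the form "Page:<file>/<number>"
        report["Index:" + name].append(int(number))
    return {index: sorted(pages) for index, pages in report.items()}


# Hypothetical titles standing in for the output of a Linter query.
flagged = [
    "Page:Example A.djvu/12",
    "Page:Example A.djvu/3",
    "Page:Example B.pdf/7",
    "Template:Foo",
]
for index, pages in group_pages_by_index(flagged).items():
    print(index, pages)
```

Sorting the result by the number of flagged pages would then show which works are worth cleaning up in a single session.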

Message Box module..

Here Template:From_Commons is calling the Message box module, but Lint hint generated a warning about an interrupted SPAN (i.e. a paired "Missing end tag" and "Stripped tag" warning). My guess is that the Message Box module is expecting a SPAN-based text parameter, whereas here it is getting a block-based list. Can someone with the appropriate Lua skills examine the module and indicate if this is indeed what has caused the concern here?

ShakespeareFan00 (talk) 13:27, 13 July 2019 (UTC)

Also Help:Beginner's guide to Index: files ShakespeareFan00 (talk) 13:32, 13 July 2019 (UTC)

reader.library.cornell.edu

Does anyone have a tool that will fetch the work (or all the images individually) from pages like:

http://reader.library.cornell.edu/docviewer/digital?id=sea:111

please? (I've fetched that one manually, but that won't be feasible for larger works.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:35, 13 July 2019 (UTC)

It seems that the books in this online collection are stored in the form of sets of jpg images displayed using the BookReader script. The source code for BookReader is here. I have been hunting through Cornell's Digital Collections site and it really seems that they don't make their books available in any form except through BookReader, although you can purchase a POD copy(!) No doubt your best bet would be to e-mail them. Levana Taylor (talk) 21:14, 13 July 2019 (UTC)
They are stored as jpeg here: http://hydraprod.library.cornell.edu/fedora/objects/seapage:111_1/datastreams/digitalImage/content, http://hydraprod.library.cornell.edu/fedora/objects/seapage:111_2/datastreams/digitalImage/content etc. It should be possible to write a script to loop over and download the pages... MarkLSteadman (talk) 22:28, 13 July 2019 (UTC)
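The loop suggested above could be sketched as follows. The URL pattern is inferred only from the two example URLs quoted in this thread, and the stopping rule (treat the first failed request as the end of the work) is an assumption, as are the function names and output file names.

```python
import time
import urllib.error
import urllib.request

# Pattern inferred from the two page URLs quoted above (an assumption).
BASE = ("http://hydraprod.library.cornell.edu/fedora/objects/"
        "seapage:{work}_{page}/datastreams/digitalImage/content")


def page_url(work, page):
    """Build the image URL for one page of one work."""
    return BASE.format(work=work, page=page)


def download_work(work, max_pages=1000, delay=1.0):
    """Fetch pages 1, 2, ... until a request fails; return the number saved."""
    saved = 0
    for page in range(1, max_pages + 1):
        try:
            with urllib.request.urlopen(page_url(work, page)) as response:
                data = response.read()
        except urllib.error.URLError:
            break  # assumed to mean we have run past the last page
        with open(f"sea_{work}_{page:04d}.jpg", "wb") as out:
            out.write(data)
        saved += 1
        time.sleep(delay)  # pause between requests to go easy on the server
    return saved
```

Calling `download_work(111)` would then attempt to fetch the work linked in the original question, one image per page.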

OCR gadget issue?

Is the OCR gadget failing for anyone else, or just me? It's not been working for ~24 hours - I've tried on different source documents more than once. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:51, 13 July 2019 (UTC)

Seems it works OK to me. --Jan Kameníček (talk) 19:02, 13 July 2019 (UTC)
it’s working for me. there is also the google OCR gadget which sometimes gives better results. (in beta at bottom) - Slowking4Rama's revenge 00:09, 14 July 2019 (UTC)
Thank you, both. It's still not working for me; the editing area just goes grey. This is quite recent, as it worked previously. I've added the Google OCR gadget, and while that works, it adds words in the wrong location; see, for example, the initial edit on Page:Notes of a journey across the Isthmus of Krà.pdf/8. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:29, 14 July 2019 (UTC)
ok, for that work, i have the same problem. i suspect it is a problem with the text layer, and the open OCR software. maybe a ticket, but support is minimal. and yeah the OCR slices phrases. i.e. [12] Slowking4Rama's revenge 12:20, 15 July 2019 (UTC)
"open OCR software"? Are we at cross purposes? I gave an example of an issue caused by the Google OCR gadget. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:31, 15 July 2019 (UTC)
you said the OCR did not work - that is the open version. you said you added the google OCR gadget. people have been known to cut and paste their OCR from a google doc as well. Slowking4Rama's revenge 02:26, 16 July 2019 (UTC)

The Lint error stuff

If this is a tangible issue could we have it stated as such in policy/guidelines and advice written up on how to avoid it? I don't want to waste people's time by having them follow me around correcting the errors, but beyond some basic stuff like avoiding nesting a div in a span I don't have much of a clue about the best practices, and I imagine many other editors won't either. In some cases the fixes seem to be contradictory to the advice that is currently available: for example in diff I used that workaround specifically because the documentation at {{gap}} states that it should not be used to format indents. —Nizolan (talk) 12:18, 14 July 2019 (UTC)

It helps if the person fixing can understand your code. (bright red embarrassed face) ShakespeareFan00 (talk) 14:26, 14 July 2019 (UTC)
@ShakespeareFan00: Basically in that specific case I had to add {{divify}} because otherwise it would add <p></p> tags and unwanted space, and divify had to be nested inside the margin template to work. (I know it's overcomplicated, iirc in later projects I switched to a {{block center|...|width=95%}} hack to get an indented rh.) But anyway, I just don't want to create unnecessary work for you + others, so at least a list of common mistakes might be useful. —Nizolan (talk) 14:40, 14 July 2019 (UTC)
We say not to use {{gap}} for indents because we don't preserve paragraph indents. In this case {{rh|{{smaller|VOL. II.}}|{{smaller|D D}}}} would have been more than sufficient. Since it's not a paragraph indent (and since it's not being transcluded anyway), using {{gap}} would be acceptable here if the exact positioning of the mark is important to you. —Beleg Tâl (talk) 15:46, 14 July 2019 (UTC)
@Beleg Tâl: That's fine for this specific case, but it doesn't clear up the need for guidance on general issues, and it's also different from the documentation at {{gap}} (which I've now changed to reflect what seems to be common usage). The latter says that all usage to "produce a formatting preference" is deprecated, and specifically gives the comparable example of using it in {{right}}. It seems that there are multiple conflicting expectations here, on top of the Lint guidelines which are not written down anywhere at WS afaict. —Nizolan (talk) 16:58, 14 July 2019 (UTC)
While I'm complaining about unwritten expectations, I also want to note that this isn't the only place where people seem to have different ideas of what house style constitutes at the formatting level. So far I have seen signature marks being added and removed by different editors, claiming to follow guideline in both cases. I've seen the same thing with the nesting of {{lang}} and italics. Unless I am missing somewhere where these things are laid out explicitly, it might be a good idea with these small nuts-and-bolts problems to either work out some general guidance which people can stick to, or, if they are a matter of editor preference, to say so explicitly. —Nizolan (talk) 17:03, 14 July 2019 (UTC)
it’s great you want to document norms, i would suggest expanding Help:Beginner's guide to Wikisource with maybe some consensus building on discussion there. and think about a welcome wagon / teahouse (this is a renowned head down place, where the adversive admins only show up once in a while) Slowking4Rama's revenge 13:17, 15 July 2019 (UTC)
IMHO there is no best practice, just the awareness that templates cannot be arbitrarily combined to obtain the desired effect as they might produce incorrect HTML. So, if someone gets an error, they should go and look at the template implementation and the HTML error and figure out a different solution. "Keep it simple" is a general valid rule, as well as keeping a right balance in trying to faithfully replicate the original. — Mpaa (talk) 19:57, 15 July 2019 (UTC)


In the course of checking the page list, this turned out to be an entire volume of a periodical, of which the current title of the Index is only a single issue. I think the file and Index should be retitled and the relevant pages relocated? ShakespeareFan00 (talk) 09:09, 14 July 2019 (UTC)

Moving pages and fixing transcluded pages is a PITA. Make notes on the file that it is a single file, and we can work around it. Even make a redirect at the best name and direct it to this file. This mistake can only occur once, later files will not make the same mistake. — billinghurst sDrewth 02:34, 24 July 2019 (UTC)

Isthmus of Krà

I've nearly completed work on Notes of a journey across the Isthmus of Krà. The original does not have a table of contents; would it be acceptable to add one, for the convenience of readers? If so, how should the fact that it is a new addition be indicated? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:52, 18 July 2019 (UTC)

there should be more help on this, but see also Template:Auxiliary Table of Contents -- Slowking4Rama's revenge 10:37, 18 July 2019 (UTC)
@Slowking4: Perfect, thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:49, 18 July 2019 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:01, 23 July 2019 (UTC)

tosection issue

I can't see why Quarterly journal of the Geological Society of London/9/179 is including the start of the following section, when I have used |tosection=End Jukes in my markup. Can someone debug it, please? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:53, 22 July 2019 (UTC)

@Pigsonthewing: See diff:9466667 and diff:9466668. |tosection=X doesn't mean "up until the section named X", it means "on the page given by |to=, include only section X". So on any page that contains content from more than one sequence to be transcluded (two+ chapters, two+ poems, two+ encyclopedia entries, etc.) each sequence has to be wrapped in <section> tags, for which the "##" syntax is just a shortcut that a javascript gadget transforms into start and end tags (see the diffs). --Xover (talk) 19:12, 22 July 2019 (UTC)
Noted, thanks Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:00, 23 July 2019 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:00, 23 July 2019 (UTC)
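The semantics Xover describes (on the final page, include only the section named by |tosection=) can be illustrated with a small sketch. This is not the transclusion extension's actual code; the function name and the "Next" section label are hypothetical, and only the `<section begin=... />` / `<section end=... />` marker syntax is taken from the discussion.

```python
import re


def extract_section(page_text, name):
    """Return only the text between the begin and end markers of one labelled section."""
    label = re.escape(name)
    pattern = re.compile(
        r'<section\s+begin="?' + label + r'"?\s*/>'
        r'(.*?)'
        r'<section\s+end="?' + label + r'"?\s*/>',
        re.S,
    )
    return "".join(pattern.findall(page_text))


# A page holding the end of one paper and the start of the next.
page = (
    '<section begin="End Jukes" />tail of the Jukes paper<section end="End Jukes" />'
    '<section begin="Next" />start of the following paper<section end="Next" />'
)
print(extract_section(page, "End Jukes"))
```

Because selection works by label rather than by position, text that is not wrapped in the named section is simply never picked up, which is why every sequence on a shared page needs its own markers.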

Undo move

I mistakenly moved:

Page:Quarterly journal of the Geological Society of London, volume 9.djvu/310

to:

Page:Quarterly Journal of the Geological Society of London, volume 9.djvu/310.

Please can someone reverse that?

(What I intended to move, and later did, was Quarterly Journal of the Geological Society of London/9/179.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:41, 22 July 2019 (UTC)

  Done billinghurst sDrewth 21:08, 22 July 2019 (UTC)
Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:00, 23 July 2019 (UTC)
This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:00, 23 July 2019 (UTC)

Future of the OCR gadget and its maintenance

The OCR button has recently been out of service quite often, and it is again at the moment. From what Xover wrote at Scriptorium/Help I understood that the main problem is that its maintenance depends on only one person, who has been inactive at Wikisource for quite a long time, so he is difficult to reach and his interventions are not very systematic. As a result contributors have to rely either on the original OCR layers of DjVu or PDF files, which are often very poor, or on the Google OCR gadget, whose results are also considerably worse than those of the original OCR gadget, and this significantly slows the work down. The most annoying problems with the Google gadget are that it often moves parts of lines to a different place in the text, so the contributor has to find the correct place and move them back, and that it usually does not recognize ends of paragraphs.

Therefore I think that some systematic solution needs to be found but I have no clue what. The tool is very important for Wikisource and when it breaks down, it should be repaired as quickly as possible, but Phabricator community has got extremely long solving times, if they solve the problem at all. Does anybody have any idea what to do? --Jan Kameníček (talk) 08:40, 23 July 2019 (UTC)

I'm asking if someone who has access to the Hebrew Wikisource can check this against the original there, as in the preamble notes there is a sentence that to me looks like it has a letter missing in the translation here. ShakespeareFan00 (talk) 20:39, 17 July 2019 (UTC)

I can access the Hebrew Wikisource, but I can't read the Hebrew. Our text is checked against this version though, not against the Hebrew. Here is the requested passage:
הואיל ולמען תת תוקף להוראות סעיף 22 מספר הברית של חבר הלאומים הסכימו מעצמות ההסכמה הראשיות למסור לידי הממונה, שייבחר ע״י אותן מעצמות, את הנהלת ארץ ישראל שהייתה שייכת קודם לכן לממלכת תורכיה, בתוך אותם הגבולות אשר ייקבעו על ידן;

והואיל ומעצמות ההסכמה הראשיות הסכימו גם לכך שהממונה יהיה אחראי להגשמת ההצהרה שניתנה מלכתכילה ביום 2 בנובמבר, 1917 ע״י ממשלת הוד מלכותו ונתקבלה ע״י המעצמות הנ״ל, לטובת ייסוד בית לאומי בארץ ישראל לעם היהודי, בתנאי ברור שלא ייעשה כל דבר העלול לפגוע בזכויות האזרחיות והדתיות של העדות שאינן יהודיות, הקיימות בארץ ישראל, או בזכויותיהם ובמעמדם המדיני של היהודים בכל ארץ אחרת;

והואיל ומעצמות ההסכמה הראשיות בחרו בהוד מלכותו להיות הממונה על ארץ ישראל;

והואיל ויש להוד מלכותו סכמות וכח שיפוט בארץ ישראל בתוקף חוזה, שעבוד, מתנה, מנהג, הסכמה־שבשתיקה ובתוקף אמצעים חוקיים אחרים;

לפיכך נאות הוד מלכותו לצוות, בתוקף הסמכויות המסורות לו לתכלית זו בחוק השיפוט בארצות נכר, 1890, או בכל חוק אחר, ובעצת מועצתו הפרטית, ובזה מצווים לאמור:–
Beleg Tâl (talk) 01:20, 18 July 2019 (UTC)
Thanks, I don't read Hebrew either, but I'll see what Google Translate makes of it.
The sentence I actually had concerns about is in small text below the above (noting an amendment), and as of my enquiry read as follows on the English Wikisource page: "In Accordance with Articles 14 and 15 of the Law and Administration Ordinance every function formerly vested in the King of Britain or in the High Commissioner shall now be vested in the Cabinet of Israel and the term "Palestine (E"I)", whenever appearing in any law, shall no be read as "Israel".
The "no" should I think be read as "now"? The rest of the preamble matches up. I didn't want to change it without someone checking. ShakespeareFan00 (talk) 08:11, 18 July 2019 (UTC)
That paragraph does not appear in the source edition so it should be removed entirely. —Beleg Tâl (talk) 12:59, 18 July 2019 (UTC)
@ShakespeareFan00: Where is the original Hebrew? Also, the text is currently left aligned - it should be right aligned. — Ineuw (talk) 06:32, 26 July 2019 (UTC)
The Hebrew Wikisource he:דברי_המלך_במועצה_על_ארץ-ישראל which gives - https://main.knesset.gov.il/Activity/Legislation/Laws/Pages/LawPrimary.aspx?lawitemid=2000035 as the source for the 'in-force' version. ShakespeareFan00 (talk) 08:39, 26 July 2019 (UTC)

Several images used in this index (on file pages 20, 30, 31 and 49) were deleted from commons a few months ago. Posting here per the instructions at Category:Pages with missing files. * Pppery * it has begun... 15:50, 24 July 2019 (UTC)

Images were nominated for deletion by User:ShakespeareFan00 due to being fair use which is not allowed on Wikisource.
@ShakespeareFan00: looks like you forgot to clean up the transcription after you had the images deleted. —Beleg Tâl (talk) 17:01, 24 July 2019 (UTC)
keep and allow fair use here. Slowking4Rama's revenge 03:43, 29 July 2019 (UTC)

Typo words, LintErrors and other mistakes...

A while back I'd made a very incomplete list of some typo words I'd found whilst cleaning up OCR, User:ShakespeareFan00/Typo words

I was wondering if there were other contributors with related lists, which could be combined?

I would also strongly suggest that someone compile a Help:Formatting howlers list, which would also give the consensus solutions to some of the more commonly encountered Lint concerns that arise out of Special:LintErrors.

I've also sometimes found situations which Lint doesn't detect but which are still misformatting 'howlers', such as to give an example:

Mis-formatted '''The inner portion is supposed to be in italic.'' Not in '''bold.''''
Correct formatting {{'}}''The inner portion is supposed to be in italic.'' Not in '''bold'''{{'}}

Are there others? ShakespeareFan00 (talk) 11:06, 26 July 2019 (UTC)

I know Distributed proofreaders have a very comprehensive list. I would add all of the ligatures as 2 separate letters. I try to add idiosyncratic scannos to the Index page discussion list so that these could be checked by bot before transclusion e.g. persistent R as K, also names that recur misscanned. Zoeannl (talk) 09:23, 28 July 2019 (UTC)
My rule with ligatures is that ae and oe get transcribed directly if present in the original; fi, fl, etc. I type as two separate letters. ShakespeareFan00 (talk) 18:07, 28 July 2019 (UTC)
Choice about whether or not to preserve a ligature can be affected by the language of the text. It's not something for which I would advocate automated replacement. --EncycloPetey (talk) 17:02, 29 July 2019 (UTC)

Finalizing the Mueller Report

The Report is close to finished; it needs only a bit more validation.

Note also that if you are adding links, there is now an epub version with many external references (but not, I think, the court cases) turned into hyperlinks. (on DPLA) Sj (talk) 12:30, 31 July 2019 (UTC)

A semi-sequel to the Mueller Report

The Senate dropped this today, part one of five: File:Report of the Select Committee on Intelligence United States Senate on Russian Active Measures Campaigns and Interference in the 2016 U.S. Election Volume 1.pdf. There was a concerted effort here to transcribe the Mueller Report and while I don't know that this is as timely and high-traffic, many users who were interested in that would be interested in this as well. —Justin (koavf)TCM 19:34, 25 July 2019 (UTC)

Little hack for even/odd page headers

Hello to everyone. I am a sysop at Spanish Wikisource, and came here to present a little hack that we devised there, which so far works. It's Template EPI (Encabezado Par/Impar, Even/Odd Header), a template for use in the Index: namespace form, to help with the variable running headers on odd/even pages. Right now it has to be used like this: {{ {{{|subst:}}}EPI|{{{pagenum}}}|Even header|Odd header}} , because of the way the software handles the "pagenum" parameter (it's only accessible during the loading of the page), but so far so good. I hope you take a look at it, and see if you can make any improvements. --Ninovolador (talk) 16:55, 29 July 2019 (UTC)

Can you provide some examples of works where the template has been used successfully? One of the potential problems I can see is that the page header for a page can be a {{RunningHeader}} with three or four parameters of its own, and that becomes tricky to code inside another template, especially when that template is subst'ed. --EncycloPetey (talk) 17:00, 29 July 2019 (UTC)
@Ninovolador: Thanks for the tip: I'm really glad to see such cooperation across the different language Wikisources! As it happens I threw together something similar a few weeks ago (must be in the zeitgeist) here: {{rvh}}. To the degree there's anything useful in it all, do, of course, feel free to grab the bits you want. --Xover (talk) 17:02, 29 July 2019 (UTC)
@EncycloPetey: right now I am using it in es:Índice:Las tinieblas y otros cuentos.djvu. To be fair, it is a fairly simple running header, and I am only using it there for testing reasons.
@Xover: Wow! your example is much simpler! Mine is more versatile, but your code is much cleaner. --Ninovolador (talk) 17:15, 29 July 2019 (UTC)
@Ninovolador: Well, as EncycloPetey points out, running headers can contain all sorts of complicated surprises, so {{rvh}} gains its simplicity by just not trying to handle anything complicated. So limited, but seems to work well for those simple cases. I should also point out that I got the idea from ShakespeareFan00 who, I believe, originally got the concept from Beleg Tâl (iow, I can take minimal credit for this). I've been meaning to look into whether it would be practical to make something more fancy that could be a general solution (like {{rh}} with built-in support for recto—verso alternation), but haven't gotten around to it yet. I imagine the big stumbling block would be chapter titles, as these change frequently and can't be automated (maybe if we throw some javascript into the mix?). --Xover (talk) 17:29, 29 July 2019 (UTC)
A handy tool I stole and modified from Phe years ago is User:Inductiveload/Running header.js. This adds a sidebar menu item to add a running header, and it takes the contents from 2 pages previously and updates the number (also handles roman numerals). It will go 4 pages back if it can't find anything 2 pages back (e.g. there's a chapter heading on that page). You still have to manually change the first page of a new chapter when the chapter name is in the RH. Inductiveloadtalk/contribs 17:22, 29 July 2019 (UTC)
Nothing to add on the technical side but @Inductiveload, @Xover, @Ninovolador: These tools are all hugely helpful, thanks. —Nizolan (talk) 19:34, 29 July 2019 (UTC)
This is all great, I echo Nizolan's thanks! Seems like it should all be documented somewhere central. I find that WS:HEADER is a redirect, and I'm not sure if the page it redirects to is the best place. I'm happy to try to write a short overview with links, but I'd appreciate guidance on where the best location is for that. Alternately, I could start it in user space and then ask around... -Pete (talk) 20:28, 29 July 2019 (UTC)
All this looks great. Ninovolador's template seems best to me, as it substitutes the text. Can the EPI template be set up here as well? --Jan Kameníček (talk) 16:17, 3 August 2019 (UTC)
Why not just use {{subst:rvh}}? —Beleg Tâl (talk) 20:37, 3 August 2019 (UTC)
@Beleg Tâl: Because there still remains redundant code text after substitution, see [23], while after substitution of the template EPI the code is clear, see [24] --Jan Kameníček (talk) 23:37, 3 August 2019 (UTC)
@Jan.Kamenicek: Hmm. To the degree one can use a word such as "design" for this without blushing, the choice to not subst the content of this template was by design. It is also not designed to be subst'ed at invocation.
However, looking at the page you give as example I'm a little confused. You have subst'ed it on the Index: page, leading to the template internal code showing up on the header of every page of the work. The intended usage is to put {{rvh|{{{pagenum}}}|achapter|awork}} in the Header field of the Index: page, which should produce {{rvh|42|achapter|awork}} in the header section of each page in the Page: namespace.
Is your goal to have only the resulting {{rh}} invocation show up on the page in the Page: namespace? I.e. {{rh|42|achapter|}} on even pages and {{rh||awork|43}} on odd? If so that was not the design goal of this template, but I'm sure we can create a parallel one specifically for this use case. There are so many moving parts involved here (subst: is relatively arcane to begin with, coupled with ProofreadPage sticking its nose in both providing variables and doing some subst'ing of its own, etc.) that I can't really promise anything in that regard, but at the very least I can't immediately see any reason why that wouldn't be doable; and most likely it'll simply be a matter of duplicating the approach Ninovolador took in es:Plantilla:EPI. --Xover (talk) 06:44, 4 August 2019 (UTC)
@Xover: Exactly, it would seem better to me if only {{rh|42|achapter|}} on even pages and {{rh||awork|43}} on odd pages remained. However, it is not a big issue, so I did not want to bother you with reprogramming your template and thought it would be easier just to set up and use the EPI template here, and leave yours for contributors not familiar with substituting. Anyway, if you decide to change it, it would be good if the substitution were just an option, so that the template stayed friendly to less experienced users as well. --Jan Kameníček (talk) 07:25, 4 August 2019 (UTC)
@Jan.Kamenicek: I'll look into it. It should be possible to make {{rvh}} behave differently depending on whether it is substituted or not, but as mentioned… complicating factors. PS. Never hesitate to ask me for technical stuff like that. I'm happy to help when I can; I just make no promises regarding response times or ability to actually help. :) --Xover (talk) 08:00, 4 August 2019 (UTC)
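
A minimal sketch of the even/odd switch discussed in this thread. It assumes a hypothetical template, called Template:EvenOddHeader here, and is not the actual code of {{rvh}} or es:Plantilla:EPI. It uses the #ifexpr parser function, which reports an error on non-numeric input, so roman-numbered front matter would need separate handling, and it makes no attempt to support substitution.

```wikitext
<!-- Hypothetical Template:EvenOddHeader (illustrative only)
     {{{1}}} = page number
     {{{2}}} = title shown on even (verso) pages
     {{{3}}} = title shown on odd (recto) pages -->
{{#ifexpr: {{{1}}} mod 2 = 0
 | {{rh|{{{1}}}|{{{2}}}|}}
 | {{rh||{{{3}}}|{{{1}}}}}
}}
```

Placed in the Header field of the Index: page as {{EvenOddHeader|{{{pagenum}}}|achapter|awork}}, this would be expected to render like {{rh|42|achapter|}} on page 42 and {{rh||awork|43}} on page 43, matching the behaviour described above.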

This needs to be split up as it contains a number of issues not just March 1922 as claimed, or a scan found to replace it. What seems to be here is an OCR copy dump :( . ShakespeareFan00 (talk) 14:27, 22 July 2019 (UTC)

agreed —Beleg Tâl (talk) 20:16, 22 July 2019 (UTC)
https://archive.org/details/BetterEyesightByWIlliamHoratioBates/page/n266 is a scan of the complete "Better Eyesight" magazine; however, it has been compiled into what seems to be an ebook and has 2 pages per page. Due to the inclusion of the advertisements at the front and back of each magazine, it is a bit hard to determine where one month's issue ends and another begins. --Einstein95 (talk) 04:23, 13 August 2019 (UTC)

A minor Charinsert modification suggested

I suggest modifying our MediaWiki:Gadget-charinsert-core.js by moving the 'User' row to be the first row (which is the default row). I modified a copy in my user space (CharInsert), and it works fine.

The reason is that the browser cookies no longer store the user's preferred selection as the default row. If there is a replacement mechanism, it's not working. I must re-select the 'User' row on every page being edited. This being the 23rd row, it defies logic. Ineuw (talk) 18:08, 14 July 2019 (UTC)

Tentative   Support though it would be better to have it remember selection per user (I use "insert" far more often than "user") —Beleg Tâl (talk) 15:20, 15 July 2019 (UTC)
@Ineuw: MediaWiki:Gadget-charinsert-core.js uses the mediawiki.storage API, which is a wrapper around HTML5 Web Storage. This works similarly, but not identically, to traditional cookies. The biggest difference in this case is that there is no good way to inspect what's in this storage in any of the big browsers that I've found (though I might be able to cook you up a custom script to inspect the charinsert cookie if you want it for debugging).
However, I just checked using the custom charinsert list from your vector.js and it seems to work just fine in the latest Safari, Chrome, Firefox, and Vivaldi. Is it possible that you've jacked up a security setting anywhere at some point? You don't have some kind of extra-paranoid antivirus running?
In any case, I don't have any strong opinion on the proposed change beyond it seemingly being prompted by what looks like a local issue; and that it will mean everyone without a preference set (including all non-logged-in users) will be presented with an empty list and a drop down labelled "User". I use charinsert so rarely that I don't really have a good grasp of the needs of those who do. -- Xover (talk) 15:36, 15 July 2019 (UTC)
@Xover: Thanks for the offer but I am really satisfied using the modified script of my namespace. The problem was that the new wrapper worked for awhile, but stopped working before the current wmf software update. Most likely because of the prior update. So, this solution is only good for those who are logged in and mostly use the "User" row. PS: I scanned for viruses an hour ago, but will re-boot and do a boot time scan as well. Ineuw (talk) 01:15, 16 July 2019 (UTC)
@Ineuw: It's also still working fine for me. It remembers whatever section I last chose. I believe that Xover was suggesting that you might be experiencing interference from a security or anti-virus program, not that you have a virus. After checking your security settings (to make sure they aren't restricting the use of Web Storage by Javascript), you might also want to clear out your local storage for Wikisource from the browser's Javascript console: localStorage.clear();. Hope that helps. Kaldari (talk) 01:12, 15 August 2019 (UTC)
Thanks for the warning. I've been focusing on what I might have done with the various antivirus etc. settings. — Ineuw (talk) 01:26, 15 August 2019 (UTC)
If it is failing, it is on a user basis, or a browser basis (which seems disproven). I am seeing no issues with Firefox with it remembering between pages or sessions. Probably worth testing on another browser, or another computer, to see if you can replicate. — billinghurst sDrewth 01:35, 15 August 2019 (UTC)
This section was archived on a request by: — billinghurst sDrewth 02:10, 15 August 2019 (UTC)

Bullets paragraph separator

Does anyone have any objection if I create a new template with bullets from a copy of {{***}}?

-- — Ineuw (talk) 15:53, 31 July 2019 (UTC)

Shouldn't be a problem, but why not use {{***|char=•}} ? —Beleg Tâl (talk) 17:40, 31 July 2019 (UTC)
In order to form an opinion on that I would need to understand what the new template's purpose was, and why the existing template doesn't do the work. I am generally of the opinion that we have too many obscure and poorly documented templates that are hard to maintain—and that this is something we should seek to improve, or at least avoid making worse—but I can't say whether that would be at all relevant to this specific case. *shrug* Is it possible to adapt an existing template to do what you want? --Xover (talk) 17:51, 31 July 2019 (UTC)
Thanks for the reminders. I was daft for not re-checking the documentation. P.S: This won't be my last "ingenious" proposal, so please bear with me and continue to point out the errors of my ways. — Ineuw (talk) 22:59, 3 August 2019 (UTC)
  Not done existing capability — billinghurst sDrewth 02:11, 15 August 2019 (UTC)
This section was archived on a request by: — billinghurst sDrewth 02:11, 15 August 2019 (UTC)

Smallrefs and tables over 2 pages

{{Smallrefs}} are not displayed correctly if there is a table spanning two pages (they are displayed above the table instead of at the bottom), see Page:Poet Lore, volume 31, 1920.pdf/489. Can it be corrected, please? --Jan Kameníček (talk) 20:49, 18 July 2019 (UTC)

  Done Whenever you have table syntax on the top line of the text editing area, you need to use {{nop}} so that the table syntax will stay on its own line instead of getting moved to flow after the contents of the previous page or section. —Beleg Tâl (talk) 21:11, 18 July 2019 (UTC)
I see, thanks for the help! --Jan Kameníček (talk) 21:17, 18 July 2019 (UTC)
Better that you use {{nopt}} as it is span-based which works better in tables. "nop" is div-based and has issues. — billinghurst sDrewth 02:14, 15 August 2019 (UTC)
Per some advice I was given a while back, I do multi-page tables like this...

On the first page: (Body):

{|
!Header line
|-
|Line 1

(Footer):

<!-- this goes in the footer -->
|}
{{smallrefs}}<!-- This MUST be on its own line. -->

On the intermediate pages:

(Header)

{|
|-
!Header

(Body):

<!-- Continuation of table -->
|-
|Line 1 on page X

(Footer):

<!-- this goes in the footer -->
|}
{{smallrefs}}<!-- This MUST be on its own line. -->

On the last page:

(Header):

{|
|-
!Header

(Body):

<!-- Completion of table -->
|-
|Line 1 on Page N
|}
The comments (and the subsequent line-feed) HAVE to be included for it to work. I've used this because it also allows some customisation of the comment used, such as to note carried-over figures to aid other proofreaders and validators.
{{nopt}} or the approach above will also need to be considered if you have a multi-page table split into sections which are not coincident with the page boundaries; getting such tables right is straightforward, if tiresome, when building the transcluded versions. ShakespeareFan00 (talk) 06:14, 15 August 2019 (UTC)
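
As a sketch of the {{nopt}} alternative mentioned above, assuming the table is opened in the page header as in the example and that the body of a continuation page starts with a row separator, the body of an intermediate page would begin like this:

```wikitext
{{nopt}}
|-
|Line 1 on page X
```

The {{nopt}} on the first line is there to keep the row separator on its own line when the pages are transcluded, in place of the leading comment used in the approach above.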

The pages of this should be moved over to the consolidated version of the file under Index:H.R. Rep. No. 94-1476 (1976).djvu, the destination pages being Page:H.R. Rep. No. 94-1476 (1976).djvu/1 to Page:H.R. Rep. No. 94-1476 (1976).djvu/368, respectively. All the pages of this have been validated, so a simple mass move/rename would be sufficient (ideally done with a bot). I already moved the Erratum. ShakespeareFan00 (talk) 16:35, 21 July 2019 (UTC)

Why would we want to move these? I don't see the point. The work is presumably okay and has been transcluded, so I am not sure why we would want to move the pages and then have to go and check all the transclusions. Sure, it is not the presentation that we prefer; however, it isn't wrong. — billinghurst sDrewth 02:29, 24 July 2019 (UTC)
Okay then, if you think having the single pages is better, then the Erratum should be moved back, I can't do this directly as it would need an admin to ensure the history stayed intact. The consolidated file should then either have the pages duplicated (a bot task) or the consolidated file should be ditched as a direct duplicate. ShakespeareFan00 (talk) 08:51, 24 July 2019 (UTC)
The mainspace page (Copyright Law Revision (House Report No. 94-1476)) has some very complicated formatting for the three-column table that appears beginning at p. 186 of the source document, relying heavily on mw:Extension:Labeled Section Transclusion to assemble specified sections of each page into the correct column of the table. Although I generally like the idea of having all the pages collected in one file, the cleanup required to make the mainspace page render correctly afterwards would be nontrivial. It works in its current form. The “if it’s not broke, don’t fix it” principle applies. Tarmstro99 13:48, 21 August 2019 (UTC)

It seems that the parameter "original", which should include the name of the page that hosts the original language work with the interwiki link, does not work. --Jan Kameníček (talk) 21:36, 23 July 2019 (UTC)

It works for me, can you give examples? —Beleg Tâl (talk) 02:17, 24 July 2019 (UTC)
@Beleg Tâl: I am sorry, I have missed your reaction.
For example at Translation:Capsules only the translated title "Capsules" appears in the header. The original title "Cápsulas" does not appear anywhere, although it is written in the parameter "original=". --Jan Kameníček (talk) 20:59, 16 August 2019 (UTC)
Ah -- the "original" parameter is used to create the interwikilink to es:Cápsulas, but it doesn't display the value in the header. You can just use the "title" parameter for that, or put it in the header notes. —Beleg Tâl (talk) 21:30, 16 August 2019 (UTC)
I’m in agreement with Jan Kameníček that there ought to be a dedicated place in the header to display the title (and date IMO) of the original work. Levana Taylor (talk) 22:16, 16 August 2019 (UTC)
There is - it's "title" (and "year" for the date) —Beleg Tâl (talk) 22:59, 16 August 2019 (UTC)
"Title" is being used for the translated title, there isn’t a place for the original title. Levana Taylor (talk) 23:05, 16 August 2019 (UTC)
You can definitely put the original title in the title field if you want to. This is commonly done. See for example Translation:Ho, mia kor'. —Beleg Tâl (talk) 23:11, 16 August 2019 (UTC)
I am not a fan of combining two kinds of data in one field, it’s contrary to good design principles. Arrange to have them displayed on the same line if you like, but if they are separated into different parameters, the display can easily be rearranged. Levana Taylor (talk) 23:20, 16 August 2019 (UTC)
┌─────────────┘
It is very confusing for contributors (myself included) if two different principles are applied in headers. For example, the header created by the template {{translations}} treats it in a different (and more predictable) way (compare The Pitman). IMO there is no reason why the approach applied in creating the header by {{translation header}} should be different. --Jan Kameníček (talk) 04:46, 17 August 2019 (UTC)

I would just like to raise again this unresolved issue of the confusingly different approaches taken by the header-forming templates {{translation header}} and {{translations}}. --Jan Kameníček (talk) 08:18, 24 August 2019 (UTC)

If you want to make a proposal for a change to {{translation header}}, go for it. —Beleg Tâl (talk) 13:23, 24 August 2019 (UTC)
I have considered this to be quite a clear proposal :-) Is there anything else that needs to be done to change it? I suppose no voting is required to repair a parameter of a template. --Jan Kameníček (talk) 16:36, 24 August 2019 (UTC)
Well actually, I am not quite sure what you’re proposing. Could you, maybe, do a mockup of what you’d like the translation header to look like? Levana Taylor (talk) 16:56, 24 August 2019 (UTC)

I propose to unify the behaviour of the parameter "original" of the template {{Translation header}} with the same parameter of the template {{Translations}}. That means: The original name of work written in the parameter "original" should also appear in the header in brackets.

Example: In page Translation:Capsules the template

{{translation header ... | original = Cápsulas ... }}

should produce the title Cápsulas (Capsules), similarly as it happens e.g. in The Pitman, where the template

{{translations ... | original = Kovkop ... }}

produces Kovkop (The Pitman). --Jan Kameníček (talk) 22:49, 24 August 2019 (UTC)
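
A rough sketch of the display logic this proposal asks for. It is not the actual code of {{translation header}}: the parameter names title and original are taken from the discussion above, and how the result would be wired into the real header is an assumption left open here.

```wikitext
<!-- Illustrative only: show "Original (Translated)" when |original= is set,
     otherwise fall back to the translated title alone -->
{{#if: {{{original|}}}
 | {{{original}}} ({{{title|}}})
 | {{{title|}}}
}}
```

With title = Capsules and original = Cápsulas, this would display Cápsulas (Capsules); with original left empty, it would display Capsules only.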

Revisiting curly quotes

Per EncycloPetey's suggestion at the style guide talk page, I would like to have the community revisit the idea of allowing curly quotes. Personally, I hate curly quotes and think they are a pox on humanity, but considering how we go to such great lengths to make our source texts faithful to the originals, it does seem like a prominent inconsistency. I'll try to list a few of the advantages and disadvantages that have been discussed...
Advantages

  • Can be more faithful to original text.
  • Consistent with Project Gutenberg (and probably the majority of commercial e-texts).
  • In some cases, may be easier to read, especially when multiple quote characters are in sequence.

Disadvantages

  • Harder to enter.
  • Some browsers may not be able to render (according to EncycloPetey in 2015).
  • They are often used incorrectly.

As this would be an optional style, I wouldn't give much weight to it being difficult to enter. It's certainly easier than most of our TOCs. And given that it's been 5 years since this was last debated, I have serious doubts that there are still issues around rendering. That basically leaves the objection that they are often used incorrectly, which I would say is also true of all the different dash characters we allow (and even expect). Are there other disadvantages that I'm forgetting? I suppose one is that it would cause inconsistent styles across Wikisource. What are other folks' opinions and thoughts about this? Kaldari (talk) 01:41, 2 July 2019 (UTC)

Curly quotes are better typography and are recommended by Unicode. Some quotation styles are not compatible with straight quotes (like „German quotations“). Browsers that can't handle basic web standards like Unicode are going to have far more serious problems on Wikisource than being able to render curly quotes. It is ridiculous that curly quotes are against our MOS. —Beleg Tâl (talk) 01:48, 2 July 2019 (UTC)
Re: "more faithful to original text" No, this is a typography issue, and has nothing to do with faithfulness to a text. A text has curly quotes because of a printer's choices, not any choice made by the author. And just as we do not specify fonts, we shouldn't bother specifying specific styles of punctuation either. Re: "Consistent with Project Gutenberg" this is irrelevant, and is at odds with the previous claim that using curly quotes would be "faithful to original text". If we're worried about the original text, then why should we care what other sites are doing? And if we're worried about what other sites are doing, then we're not caring about the original text. I'd also point out that Project Gutenberg is in no way consistent about the style of quotes they use. Additional disadvantage: We will need an additional series of quotation template to handle situations currently done with templates like {{' "}} which provide for clarity of punctuation. --EncycloPetey (talk) 03:32, 2 July 2019 (UTC)
It seems like we often try pretty hard to match the typography in the source: Page:KJV 1769 Oxford Edition, vol. 1.djvu/10. Should this be discouraged, in your opinion? Kaldari (talk) 04:02, 2 July 2019 (UTC)
RE: "we often", no this was a single editor over-formatting that page. --EncycloPetey (talk) 17:22, 6 July 2019 (UTC)
  •   Comment In the vast bulk of our works curly quotes are not required and should not be encouraged, and their addition doesn't give value to the works, i.e. the disadvantages outweigh the advantages, especially in true communal works. That said, we have always allowed some variation where there is a reasonable explanation of why a deviation from the style guide can be justified. We haven't been absolutist about these matters; we just have valid reasoning for setting a style guide, and generally ask people to follow it and not deviate "just because", or "because I like them better". The test for a deviation has been an open conversation, and a semblance of consensus. — billinghurst sDrewth 05:32, 2 July 2019 (UTC)
In the vast majority of our works, neither casing, accents nor non-monospace fonts are required. We could go all old-school computer printout style, but given that we support Unicode and rich text, it seems reasonable that we write English as it is supposed to be written, with curly quotes. Their addition makes the text easier to read by adding additional cues as to the meaning of a quote, whether it is opening a quote or closing it. --Prosfilaes (talk) 07:14, 2 July 2019 (UTC)
I Would Çonsidér Accürate Reprǒductión Of Casing And Aççénts To Be Requĩred —Beleg Tâl (talk) 12:58, 2 July 2019 (UTC)
I THINK DECADES OF TELEGRAPHIC AND COMPUTER USE ESTABLISH THAT ACCURATE REPRODUCTION OF CASING IS NOT REQUIRED and a lack of accents in English has decades more of use with ASCII and normal keyboards.--Prosfilaes (talk) 09:32, 3 July 2019 (UTC)
I was considering opening up a discussion on this; it definitely is more faithful to the way English is published. Typographical norms should not be scorned for just being typographical norms.--Prosfilaes (talk) 07:14, 2 July 2019 (UTC)
  •   Comment I wouldn't mind them being used but I wouldn't use them personally, because it's another pain for us proofreaders to worry about that isn't worth the hassle. I wouldn't discourage other users from using them though, neither would I mind if a user were to add them to a work I've proofread and used straight quotes on. In my opinion it is a minor typography issue and I really wouldn't care if they're used or not. Jpez (talk) 11:45, 2 July 2019 (UTC)
  •   Comment The relevant current text of the Manual of Style reads:
"Use typewriter quotation marks (straight, not curly)."
I agree with what I think most above are saying, which I think would be most succinctly stated as follows:
"Any given work should be self-consistent in terms of the style of quotation marks and apostrophes used. That is, such marks should either all be curly, or all be straight, within any given work. If the initial transcriber of a work has chosen one style, the other style should not be adopted in that work unless a user intends to update the entire work to the new style choice."
I am very skeptical about this: "unless a user intends to update the entire work to the new style choice". Imagine occasional proofreaders doing small chunks of an encyclopedia. I doubt they will check what others have done and even more make it consistent. Pretty sure we will end up with mixed style except for committed users on single works.— Mpaa (talk) 23:30, 2 July 2019 (UTC)
@Mpaa: That's exactly the kind of thing I'm imagining (and have experienced). Here's how I imagine it working, with such a policy (perhaps worded a bit better) in place:
  • Me: Hi, I see you've validated about 5% of the pages of this work. That's awesome, thanks! I've proofread about 80% of them. I see that you have changed straight quotes to curly. Are you intending to go through the whole document and change them all to curly?
  1. Other editor: Yes, I plan to do that.
Me: Great! I look forward to seeing the final result.
  1. Other editor: No, I was just passing through, only interested in this one chapter of the book. I probably won't do more than a few more.
Me: Ah, I see. In that case, would you mind sticking with the convention I began with? (links to MOS) I'd rather stick with straight quotes than go to the trouble of updating all the pages.
In my view, that's a nice, easy way to resolve this "conflict" (which doesn't even need to be a conflict). Having a manual of style that guides us in this direction would, in my view, be a great advantage, and make it possible to quickly and easily arrive at an acceptable solution. -Pete (talk) 22:04, 3 July 2019 (UTC)
A satisfactory solution to an imaginary and likely situation, but that is not how it plays out in folk-lore. CYGNIS INSIGNIS 11:26, 6 July 2019 (UTC)
Hmm, I don't see anything relevant on that work's discussion pages, what am I missing? This is a type of interaction I've seen work on many wikis over the years. I'm not sure where you'd anticipate it going off the rails...but, having a clearly articulated policy that sets the parameters is a necessary ingredient. -Pete (talk) 22:19, 6 July 2019 (UTC)
One practical concern is that various automated tools implement one style or the other, and may need to be rewritten or eschewed in order to comply with a change in the Manual of Style. -Pete (talk) 20:27, 2 July 2019 (UTC)
One of the reasons why I was doing this was because the OCR seems to spit out curly quotes, and I was tired of fixing them.
I’m guilty of using curly quotes in some works, where I am (or try to be) completely consistent throughout the whole work. That said, I normally only do it in novels (or this comment), where I’m planning on spending a fair bit of time sitting reading the thing — I think proper typography matters more then (and indeed, I’d say curly quotes, along with correctly-sized dashes and various other non-typewriter or -computer conventions are “proper”). For reference, non-prose, works, I think straight quotes are fine. (Maybe the distinction I’m getting at is between works with large amounts of dialog and those without?) I’ve seen people make the argument that they’re not required because we can make automatic replacements later, but that’s not really true: there are various situations in which it’s impossible to programmatically determine which type of quote character should be used. Anyway, I hope I’m not on the wrong side of common opinion here, but I do like to be able to use curly quotes on Wikisource. —Sam Wilson 11:33, 3 July 2019 (UTC)
I agree with this and Jpez's comment above. I don't see a good reason to forbid them categorically, and can see plausible use cases in novels and the like, even though I probably wouldn't use them myself (the projects I work on are generally academic and often require enough heavy lifting in Unicode without having to fiddle with quote marks). —Nizolan (talk) 14:30, 3 July 2019 (UTC)
On a personal level, I like curly quotes & find them easier to read. But from an editor's-usability standpoint it sure does make sense to convert everything to straight quotes, as the only way to avoid inconsistency. It is much easier to convert curly to straight than vice versa; I have a little application to straighten the quotes in all OCR output, but the reverse process would be no easy matter. I actually began entering Once a Week magazine with curly quotes but EncycloPetey pointed out the standards so I went through several hundred pages I'd entered and straightened the quotes. I am now up to 2000 pages and I am most emphatically not going to revisit all of them and curlify them .... We seem to be stuck with straight quotes as a legacy issue. There would have been ways to make entering curly quotes easier if they had been favored from the beginning; and the OCR output from the application that this site uses now gives us them automatically; but although I don't see a problem with allowing them in certain cases (there are, for a parallel example, a few texts displayed with long s's) a person would have to be urged to think carefully before they started down that route, because so much of the existing apparatus favors straight quotes that avoiding inconsistency would be difficult. Levana Taylor (talk) 23:08, 3 July 2019 (UTC)
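(Editorial illustration.) The "easy" direction mentioned above, curly to straight, can be sketched in a few lines of Python. This is not the application referred to in the comment; the function name `straighten_quotes` and the choice of exactly four characters are assumptions made for illustration only.

```python
# Hypothetical helper, not any tool named in this discussion.
# Maps the four common English typographic quotes to typewriter quotes.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones (a lossy, one-way mapping)."""
    return text.translate(CURLY_TO_STRAIGHT)
```

The mapping discards the open/close distinction, which is why the reverse direction cannot be a simple table lookup.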
"It is much easier to convert curly to straight than vice versa" is also an argument for curly quotes; transcribing text is all about recovering information from the images that can't be done automatically.
We should be stuck with nothing as legacy issues. If it's better we should make the change, and earlier is better. I'm not sure that much of the existing apparatus favors straight quotes, but this is a chance to change the existing apparatus.--Prosfilaes (talk) 01:07, 4 July 2019 (UTC)
Curly quotation marks are a legacy of printed blocks of text that required incremental spacing, denoting a beginning or end of a quotation if the squelching and stretching of the line made that ambiguous, and few of those legacies are transcribed here (often [or yet]). CYGNIS INSIGNIS 11:18, 6 July 2019 (UTC)
Quotation marks are a legacy of printing; inline quotation marks date no earlier than the 17th century. Printing pervades how we write English, and an attempt to abandon those legacies would produce something unusable or unwelcomed by most of our audience.--Prosfilaes (talk) 07:15, 7 July 2019 (UTC)
There is nothing difficult in writing curly quotes. On Macs and on all modern mobile devices they are easy to enter using the built-in keyboards. On Windows it's probably not built into the default keyboard, but that's what the Special characters button is for.
Old browsers are not a reason not to use modern technology. If they aren't upgraded today, they will be upgraded in a year or two.
A lot of websites that care about quality of presentation use curly quotes. Wikis in some languages have a gadget that converts straight quotes to elegant quotes automatically. Some sites where text can be edited do it as well, for example Quora, and Wikisource could do it (it must not be forced, though).
I actually find it surprising that there are people on English wiki sites who are against curly quotes, given that the English language has such a long typographic tradition of using rich punctuation, with quotes, dashes of various length, etc. --Amir E. Aharoni (talk) 05:47, 4 July 2019 (UTC)

(unindent) I have an idea; it'd take some programming, though. Suppose it was allowed to enter curly quotes, but the software would display them as straight quotes by default. That way, if only part of a text was entered curly, people would usually never notice because it'd all be displayed straight. However, there'd be a user-controlled setting allowing displaying curly quotes where they exist. Levana Taylor (talk) 02:12, 5 July 2019 (UTC)

I don't see the advantage. There's a lot of criticism of the "just add another user-controlled setting" idea in the UI world. It seems a lot better to offer tools to help make the changes and encourage not doing inconsistent changes.--Prosfilaes (talk) 03:46, 5 July 2019 (UTC)
I am also not sure it's a good idea to correct all curly quotes to typewriter quotes user-side by default. There are some cases in texts I've transcribed myself where curly quotes are necessary independently of the general guideline. An example of this is in transliterations of Semitic languages, where the distinct letters ayin and aleph (the half-rings ʿ and ʾ in modern scientific transcription) are often represented by curly apostrophes, ‘ and ’ respectively. In this case correcting these to typewriter quotes would remove necessary information. —Nizolan (talk) 11:51, 5 July 2019 (UTC)
Some additional advantages and disadvantages have been mentioned:
More Advantages:
  • Easier to convert from curly quotes to straight quotes than vice versa.
  • OCR tools already output curly quotes.
More Disadvantages:
  • Some new templates will be needed.
  • Some tools will need to be updated.
Let me know if I'm overlooking any. Kaldari (talk) 04:53, 6 July 2019 (UTC)
  • @EncycloPetey: I'm curious if any of the arguments above have led you to reconsider your opposition (as you seem to be the main opponent of the idea). Kaldari (talk) 04:54, 6 July 2019 (UTC)
    I may be more vocal, but that doesn't mean I'm the "main opponent", it merely means that my voice is stronger in this discussion. It is normal in the Wikisource community for long-time participants to sit back and read discussions without chiming in, so long as their opinion has been expressed by someone in the discussion. I have done this myself. No one in this discussion has explicitly voiced support or oppose, and it would be premature to interpret anyone's opinion when there has been no call for a vote. You've also biased your interpretation: where some editors have said "I don't care", you have interpreted that as "support", but it is not at all the same thing. This is a "community revisit", and not a vote to change policy. --EncycloPetey (talk) 17:32, 6 July 2019 (UTC)
    For the record, I explicitly   Support changing the policy to allow curly quotes at the editor's discretion, and   Oppose continuing to disallow curly quotes. However, this discussion didn't contain a proposal either way, so it doesn't matter until someone posts a proposal for !voting. —Beleg Tâl (talk) 23:19, 6 July 2019 (UTC)
  • I think Kaldari's summary is helpful, and I'm not sure why we're talking about whether this is a formal vote when nobody has claimed that it is. FWIW I have no objection to the characterization of my position. I like Jan's version below, making "straight" the default and only permitting “curly” where there's some evidence that curly will be used consistently throughout the work. In fact, that seems like a useful formalization of a principle I expect is already in use in some places, but not formally documented or endorsed. Which IMO is one of the best ways to develop policies on a wiki. -Pete (talk) 22:29, 6 July 2019 (UTC)
To make my position clear, I am very much in favor of allowing curly quotes on a case-by-case basis as long as the editor intends to make a good-faith effort to see they're used consistently (the guideline could be, "please don't add curly quotes to a work that's already partly straight quotes unless you're about to change the whole thing.") Though I worry about how to get it done, I think the problems are solvable, so yeah, in favor. Levana Taylor (talk) 03:20, 7 July 2019 (UTC)
@EncycloPetey: I wasn't trying to hold a vote, I was trying to see if maybe there was consensus for a change in the style guide, in which case, there would be no need for a vote. It is clear from your reply, however, that you are still against the idea, and maybe there are other silent voices that are as well. Also, please let me know whose opinion I have misinterpreted, and I will be happy to revise my statement above. Kaldari (talk) 14:43, 8 July 2019 (UTC)
And by "consensus", I meant actual consensus, not wiki-speak consensus. Kaldari (talk) 14:47, 8 July 2019 (UTC)

I have been following the discussion and thinking over the pros and cons and finally came to this opinion: The main disadvantage of allowing curly quotes is a danger of different attitudes of two or more people transcribing one work. For this reason I would explicitly allow curly quotes only if the contributor is able to ensure consistency of their usage throughout the whole work, typically when the contributor transcribes the whole work by himself/herself. When more people cooperate on transcription of a work, straight quotes should be recommended, unless they are all able to make an agreement about curly quotes (of course such agreement is practically impossible with such large works as Encyclopaedia Britannica). --Jan Kameníček (talk) 18:53, 6 July 2019 (UTC)

That's a nice thought, but even with the best will in the world, people start projects and don't finish them. The better thing is for the person who starts a project to document all their style choices on the talk page -- the note at the top of the index indicating that style guidelines exist is a great invention. Quote-style is no different from many other choices in that respect and can be handled the same way. If we shift to curly quotes and they become normal, then there will be no problem with expecting people who sign on late to a future project to use them. It's only the possibility of transitioning to curly quotes in projects that are already begun now that presents difficulties. Levana Taylor (talk) 21:03, 6 July 2019 (UTC)
A policy doesn't have to be perfect to be useful, sometimes "good enough" is good enough. I believe most Wikisource users would answer in good faith if asked, "do you intend to complete this project?" For myself, I think I'd answer "yes" for about half, "no" for about half. If a few projects end up with some inconsistencies because somebody intended to finish it but then got distracted or busy elsewhere, is that too high a price to pay if there are benefits elsewhere? -Pete (talk) 22:33, 6 July 2019 (UTC)
Both sides of this argument are starting to fall into the desperate position of trying to shore up numbers of supporters by appealing to the everybody who is reading this and not making their position clear must by inference be on my side of the argument. So for the sake of this alone I must delurk and declare that I am a closet supporter of the use of curly quotes for reasons I shall not go into here as they have already been adequately covered by others.
On the matter of automated tool use favouring straight quotes I have some sympathy. Creative laziness is always admirable but at heart it is just that: laziness. If some piece of scripted magic could perform reliable verification then why is everybody here at all? Proofreading still has an aspect of bespoke craft and we should take pride in our input.
As for perceived difficulty of entry of characters, aside from resort to Unicode, there are the various pickers available both MediaWiki supported and local. Nobody appears to have yet noted that native HTML such as the <q></q> construct works well under MediaWiki, and all reputable browsers now handle the so-called HTML entity-forms: &ldquo; → “; &rdquo; → ”; &lsquo; → ‘; &rsquo; → ’. Learn them; they are your friends! 114.78.66.82 22:56, 6 July 2019 (UTC)
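(Editorial illustration.) The four entity forms listed above are standard HTML named character references. As a small check, assuming only Python's standard `html` module, they decode to the expected code points; the variable names here are illustrative.

```python
import html

# The four named entities mentioned in the comment above.
ENTITIES = ["&ldquo;", "&rdquo;", "&lsquo;", "&rsquo;"]

# Decode each entity to the character a browser would display.
decoded = {entity: html.unescape(entity) for entity in ENTITIES}

for entity, char in decoded.items():
    # Print the entity, the character, and its code point, e.g. U+201C.
    print(f"{entity} -> {char} (U+{ord(char):04X})")
```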
Suggestion for implementation of quotes: there should be a gadget that finds all straight quotes on a page and converts them to curly while highlighting them so the editor can check the result for correctness (because it wouldn't be perfect). This would go a long way toward easing worries about inconsistency. Not only would it allow quicker, better conversion of works that have been begun using straight quotes, but if someone happens to notice stray straight quotes in a work that's mostly curly, they can rapidly find them all. Levana Taylor (talk) 03:20, 7 July 2019 (UTC)
  Comment I agree with the spirit of @Levana Taylor's suggestion but point out the occasional real-world case of quotations crossing pages — commencing on one page and terminating on a later one — would completely reverse the sense of correct quotation mark appearance. Which makes implementation of such a gadget tortuously impractical — as the analysis must take place at the work/chapter level to enable sensible decision for the gadget to act on the component page level. Only the {{nop}}-inserter gadget attempts this at present and for a vastly simpler case. 114.78.66.82 03:43, 7 July 2019 (UTC)
I don't see the problem. Ending quotes are at the end of words, lines and paragraphs, and starting quotes are at the start of words, lines and paragraphs. Quotes never cross paragraphs in normal English style, so any tool should restart at a new paragraph. On proofed text, it's possible the tool will put the wrong quotes on the first paragraph, but it won't be a problem for the whole page.--Prosfilaes (talk) 07:02, 7 July 2019 (UTC)
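(Editorial illustration.) The position-based rule described in the last comment (opening quotes sit at the start of words, closing quotes at the end) can be sketched as below. This is a minimal sketch, not an existing gadget; the name `curl_quotes` and the set of characters treated as "before an opening quote" are assumptions.

```python
# Characters after which a quote mark is assumed to be an opening quote
# (in addition to whitespace and the very start of the text).
# This set is an assumption for the sketch, not an established rule.
OPENERS_BEFORE = set("([{\u2014\u201c\u2018")


def curl_quotes(text: str) -> str:
    """Convert straight quotes to curly ones by looking at the preceding character."""
    out = []
    for ch in text:
        if ch in "\"'":
            prev = out[-1] if out else None
            opening = prev is None or prev.isspace() or prev in OPENERS_BEFORE
            if ch == '"':
                ch = "\u201c" if opening else "\u201d"
            else:
                ch = "\u2018" if opening else "\u2019"
        out.append(ch)
    return "".join(out)
```

Because the rule only inspects the preceding character, it carries no state across paragraphs or pages. It still guesses wrong on a leading apostrophe: `curl_quotes("'tis")` yields an opening quote where a closing apostrophe is wanted, which is the kind of case that would need the human review suggested above.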


Arbitrary break (curly quotes)

I have… questions… 🤔


  • Are we proposing to allow straight vs. non-straight quote mark style to be at the whim of the first contributor? Of any contributor that has a good-faith intent to update all previously proofread pages? Only when based on some set of criteria related to the work? What, if any, are the constraints on the choice?
  • Is the proposal for the benefit of proofreaders with a preference, or for our readers? That is, is our goal to achieve the best presentation for our readers, or to allow our proofreaders some flexibility or to express their own preference? What are we trying to achieve by making a change in this area?
  • At what level do we care about consistency? The work? The chapter? Individual entries in the DNB and similar? Across works within a series? How would we achieve such consistency in practice? How would we resolve conflicts in preference?
  • What kind of curly quotes (there are on the order of 30 of them) would be allowed, and how would the style be decided? Would the “Anglophone” style be allowed or preferred for a work by a « Francophone » or „Germanic“ author? How about for reproducing an official text of some kind (English translation of a law, say) where the originating country specifies (sometimes by law) a specific quote style?
  • How would we handle the issues currently dealt with by {{" '}}, {{' "}}, et al (there are 5 of them just for straight quotes; each extra style of quote would generate at least 5 more)? How do we detect and correct instances where accent marks are (accidentally) substituted for single quote marks? How about the inevitable Windows CP1252 character set issues?

I don't as yet have a firm opinion on the issue of curly quotes except that they do create a lot of complexity and that that complexity must be addressed if we are to adopt them. I do hold the opinion that good typography aids readability; that good typography creates visually appealing works, and that visual appeal is a desirable trait; that our goal should be the benefit of our readers over our own contributors; and that our readers are a diverse group with many different needs. I am also by inclination prone to prefer more diplomatic reproduction of works (I've driven certain community members to distraction by insisting on using {{lgst}})—which inclines me to want to reproduce a work's quotation style, and against any style that differs from the one used in the work (including substituting straight for curly)—but experience has taught me that there are good reasons to moderate that impulse (see, Billinghurst, I do listen and learn!).

And, ultimately my main concern is maintainability and manageability over the totality of the project, over a decade or two, and in the face of practical realities like the occasional conflict between contributors, the perennial slow changing of the guard (who now remembers why we made every decision shaping the project?), and the necessity of either automating or having the manpower for certain kinds of necessary cleanup or guidance for new contributors.

I like fancy quotes (and other typographic affordances), but they sound like they'd be really hard to do right. --Xover (talk) 07:18, 7 July 2019 (UTC)

Curly quotes should match the scanned text. That's simple.
We should deal with {{" '}} with fire, and then dump the ashes into a heart of a live volcano. I mean, if that's your bag, then whatever, but it seems weird about arguing for curly quotes against claims that the typography doesn't matter and that consistency is important, but have to deal with an idiosyncratic set of templates that surely have no consistency in use and tackle an issue of micro-typography that can and should be handled by modern text layout systems in web browsers; TrueType fonts have supported kerning pairs of characters for 25 years.
I would hope that modern systems won't dump CP1252 characters into the browser. We should have a bot checking the pages for inappropriate Unicode characters (Private Use, unmapped, etc.) and include 0080-009F in there.--Prosfilaes (talk) 07:56, 7 July 2019 (UTC)
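(Editorial illustration.) A check of the kind suggested above could look roughly like this. It is a sketch only, assuming a bot that receives page text as a Python string; the function name `find_suspect_characters` and the wording of the reasons are hypothetical, and no existing Wikisource bot is implied.

```python
import unicodedata


def find_suspect_characters(text: str):
    """Return (index, code point, reason) for characters a bot might flag."""
    findings = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        category = unicodedata.category(ch)
        if 0x80 <= cp <= 0x9F:
            # The C1 control range 0080-009F; often a sign of CP1252 bytes
            # that were decoded as if they were Latin-1.
            reason = "C1 control (possible CP1252 mis-decoding)"
        elif category == "Co":
            reason = "private use"
        elif category == "Cn":
            reason = "unassigned"
        else:
            continue
        findings.append((index, f"U+{cp:04X}", reason))
    return findings
```

Which code points count as unassigned depends on the Unicode version bundled with the Python interpreter, so results for the "unassigned" case may differ between installations.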
I have not observed that the kerning issues with adjacent quote marks (or with "rn" and "m", for that matter) have been rendered moot by modern text layout systems. I have no particular affinity for those templates, but I do care about the issue they are attempting to solve. I also do not think making templates to deal with this issue that are consistent (what are the problems with the existing ones?) should be impossible. --Xover (talk) 08:06, 7 July 2019 (UTC)
Modern text layout systems are well-capable of dealing with kerning. If they don't, well, that's still stepping into their territory. There's no way of making such templates consistently used unless we make a big fuss about them, which I strongly object to. I haven't seen them in any text I've worked on.--Prosfilaes (talk) 08:36, 7 July 2019 (UTC)
That they have advanced support for kerning (which they do, even on Linux) does not mean they can intuit the need for such automatically: we need to make use of such facilities for anything to happen. --Xover (talk) 08:52, 7 July 2019 (UTC)
In a TrueType font, there is a table listing pairs of characters and the space that needs to be added or removed between them. If "' needs extra space, then that table should list the amount of extra space it needs and the typesetting program should adjust the distance between the two glyphs appropriately. See w:Kerning.--Prosfilaes (talk) 12:31, 7 July 2019 (UTC)
Yes, that is indeed how the TrueType specification handles it; and OpenType has even more advanced features for this. However, so far as I know, no web browser on any operating system does this automatically, and the CSS features for explicitly enabling it are only partly supported (and would require some sort of markup on our side in any case). And even with that support it would require an OpenType font which has the appropriate kerning setting for these pairs, and that was available for us to use, which mythical beast may exist but I couldn't name one off the top of my head. I agree that it would be very nice if we didn't have to worry about this, but, again so far as I know, that is not actually the world we live in. If you know otherwise I'd be happy to get rid of those templates even if we only use straight quotes: they're nobody's idea of a perfect solution, they're just the best we've got available.--Xover (talk) 15:22, 7 July 2019 (UTC)
I have never used these templates; "'this'" or “‘this’” have always been completely adequate. If/when browser support for kerning becomes more common, then the appearance will be improved slightly, but in the meantime there is no need for manually padding the punctuation imho. —Beleg Tâl (talk) 20:11, 7 July 2019 (UTC)
I'd say pretty much this is the definition of a problem we don't have to deal with. We send "' down the line, and the other side renders it. If that rendering of a common pair of characters is unsatisfactory, the spirit of HTML and text transcription is that we don't know their fonts, their system. If systems don't do this automatically, and fonts don't set kerning for these pairs, then obviously it's not considered a big problem.--Prosfilaes (talk) 21:42, 7 July 2019 (UTC)
That's a fair stance (both of you). I don't agree with it—it goes to readability so a similar argument applies to this as to using typographers quote marks (or distinguishing between plain, en-, or em-dashes)—but it is absolutely an issue it is reasonable to consider falling within the limits of "good enough". Thus the question, above, of whether what we are trying to achieve in this discussion is flexibility for our contributors or a better reading experience for our readers. What the goal is affects the calculus of how much effort to put into stuff like getting typography and layout correct vs. getting it "good enough" (wherever you draw that line in general) and letting the users' browsers deal with it. --Xover (talk) 04:31, 8 July 2019 (UTC)
I think the difference is here that users' browsers can't add curly quotes properly; it is up to human intelligence to add them. On the other hand, we can't add space properly, given that we don't know what fonts are being used. In the long run, properly recording the characters that are there will help all usages of the text, where manually kerning characters will help only certain users, and make usages that don't reflect current web browsers more complex.--Prosfilaes (talk) 11:24, 8 July 2019 (UTC)
I use the {{" '}} group in my projects after seeing them in use elsewhere since it seems like a neat solution to the problem, and the templates can easily be made inoperative whenever browsers do get round to displaying them better. Given that I regularly have to add manual padding when typesetting documents in InDesign using professional typefaces I am somewhat sceptical about how effectively those TrueType and OpenType specifications are being used by fonts at the moment. As Xover said above fonts using the full scope of these settings properly are a bit of a mythical beast.
For the record, in Arial on Windows 10 and the current version of Chrome I can't distinguish "' from '" at all (or for that matter '''). In curly quotes, though, it seems much easier to distinguish “‘ and ‘“, so it's possible that the templates are simply unneeded in that case. —Nizolan (talk) 15:15, 8 July 2019 (UTC)
Back to topic of curly quotes, I checked the French, Italian, and German Wikisources and they all seem to use curly quotes by default. (It seems Spanish has their own style of quotation marks and doesn't use apostrophes.) Kaldari (talk) 14:35, 9 July 2019 (UTC)
@Kaldari: It looks to me like there's enough consensus in the discussion above to justify a more formal proposal. Do you have thoughts of putting something together? -Pete (talk) 20:57, 24 July 2019 (UTC)
@Peteforsyth: Unfortunately, I'm still somewhat of a newbie on Wikisource, so I'm not familiar with community practices here. How do these things work? Do you call a vote? Write an RFC? Honestly, I would love it if a more experienced Wikisource editor took over the process from here. Kaldari (talk) 22:10, 24 July 2019 (UTC)
@Kaldari: French uses the same style quotes as Spanish, namely « », so I'm not sure how you determined that they use curly quotes. Italian Wikisource is inconsistent, and they currently have about six active contributors, but Italian also has a different set of quotes, namely “ „ , than English does, so they (along with Spanish and French) are not a good basis for comparison. --EncycloPetey (talk) 01:49, 26 July 2019 (UTC)
@EncycloPetey: True, but the French, German, and Italian Wikisources seem to use the curly apostrophe pretty consistently. My point is, our use of straight quotes seems to be the exception, not the rule. Kaldari (talk) 17:44, 26 July 2019 (UTC)
@Kaldari: I still don't know how you came to that conclusion, because I looked at the same sites and didn't reach that conclusion. You should also consider that the apostrophe in languages like French and Czech can be coded differently. For example, there is a single Unicode character available for French ľ that is typically used, and French style is to prefer that over using separate characters. That being the case, it is inappropriate to make comparison because English uses no such character. --EncycloPetey (talk) 23:36, 26 July 2019 (UTC)
When comparing other projects, Czech Wikisource and also Czech Wikipedia use curly quotes, though a „different type“. "Straight quotes" are just tolerated but not recommended. --Jan Kameníček (talk) 18:43, 26 July 2019 (UTC)
  • As the consensus appears to be against the sole restriction on typewriter quotes (",") against “curly” quotes (“,”), I have edited the appropriate section of the style guide to reflect this discussion. If any one wishes to revise the wording, that would be more appropriate. TE(æ)A,ea. (talk) 21:18, 25 July 2019 (UTC).
    I disagree that such a conclusion has been reached that there should be change to the style guide, and have undone the change. — billinghurst sDrewth 22:10, 25 July 2019 (UTC)
    • Why do you disagree? There is no one who reasonably supports the current guideline; there is only a difference of opinion on the proper wording and implementation. TE(æ)A,ea. (talk) 23:33, 25 July 2019 (UTC).
      Did you read my sentence up to the comma? Could it be any more specific? PS. Don't flip me any metaphorical bird, I have done my time here, and earned my rights the hard way, and supported them with work the whole way through my many years here. — billinghurst sDrewth 04:42, 26 July 2019 (UTC)
    • Did you read mine? The reasoning you provided earlier aligns with my change to the style guide, that there should be consistency within a work, in preference to authoritarian commands without reasoning. TE(æ)A,ea. (talk) 12:39, 26 July 2019 (UTC).
      "no one reasonably supports" is a way to wave your hand and dismiss any counterarguments, and suggests there is not a consensus. The last post before you made your comment was a move to write an explicit proposal and vote. It is therefore inappropriate to circumvent that process by offhandedly dismissing the whole thing on a whim. --EncycloPetey (talk) 01:49, 26 July 2019 (UTC)

Oh my, seems there's still some tension around here. I'd suggest there's no benefit to implementing a change before everybody involved has had the chance to review it and comment on it. I'm happy to prepare a more specific proposal, based on the gargantuan input in this section and that above, as I suggested earlier; until then, I'm not sure there's any pressing need to make changes to the style guide. That said, I appreciate @TE(æ)A,ea.: making the effort, and it might be useful to consult Special:Diff/9483059 in building the proposal. If anybody wants to propose alternative wording, feel free to do so here or on my talk page, and I'll do my best to incorporate it. -Pete (talk) 00:10, 27 July 2019 (UTC)

The proposal should have three options:
  1. Keep the style guide as is (straight quotes only).
  2. Write style guidelines that consider curly quotes to be the normal, standard usage.
  3. Allow curly quotes but only under certain circumstances or with certain restrictions, or do a gradual implementation of option 2, or a test implementation of it, or..... This option is a catchall meaning "not 1 or 2" and if we choose it we'll be starting another round of discussions leading to another proposal. Levana Taylor (talk) 00:32, 27 July 2019 (UTC)
@Peteforsyth: I would very much appreciate a proposed guideline that addresses directly (answers) the questions above, as opposed to a more general "Yes (figure out details later)" vs. "No" approach. On this issue I find myself leaning towards being conservative (perhaps overly so) and preserving the status quo when not all consequences are clear. I therefore think it would be best to ask this question in the form of a fully-formed assertion: "this is what the guideline should be, do you agree or disagree?". If a significant number disagree we revise the proposed guideline to address the concerns raised and try again. PS. I'm happy to help out here, so never hesitate to ask or ping me, but I don't think I'll actually be of any help as I am too uncertain and conflicted on this issue. --Xover (talk) 10:49, 27 July 2019 (UTC)

Break (curly quotes) for discussing changes to guideline

  1. Use typewriter quotation marks ("straight," not “curly”). Status quo, unchanged from original.
  2. Use the same quotation marks (either "straight" or “curly”) in a work, but not both. This, possibly with rewording, was my suggestion; this proposal favors standardization within a work, but not a universal idea.
  3. Use the same quotation marks as the work presents; i. e., if the work shows "straight" quotation marks, use those, or if it shows “curly” quotation marks, use those instead. This would likely be the most simple proposal, as there would be no need to change what the original source shows.
  4. It is allowed to use either straight quotation marks or the same kind of quotation marks as the original work presents, with the following condition: Other than straight quotations marks are permitted only if it can be ensured that they will be used consistently in the whole work. If such consistency cannot be ensured (e. g. because of a large number of contributors to the work or because of disagreement of some of them), straight quotation marks are recommended.
  5. The use of curly quotes is encouraged, and considered to be the default style in all texts, with or without a source. In cases where the source text uses a different typography (guillemets « », typewriter quotes, or whatever) that style can be used (encouraged, but not required). They are the standard form of “ ” ‘ ’ and " '. The use of typewriter quotes in texts where the original doesn't have them should be considered deprecated, to be phased out.

If any one else has an idea, write it out above; however, if you wish to reword an above proposal, please mark that as indicated above. I support either no. 2 or no. 3. TE(æ)A,ea. (talk) 12:33, 27 July 2019 (UTC).

I have added one more point. --Jan Kameníček (talk) 13:13, 27 July 2019 (UTC)
  • I have changed no. 4 to a sub-section of no. 2, as it appears to be quite similar to no. 2. TE(æ)A,ea. (talk) 21:08, 27 July 2019 (UTC).
    @TE(æ)A,ea.: Seems to me that numbering it as "1." again is quite confusing for discussions. What is more, it is very different from no. 2 because no. 2 lets the contributors choose only between two kinds of quotes, and also lets them choose the curly quotes even if they are not used in the original. My suggestion a) lets them choose any kind of quotes, but b) only if they are used in the original and c) not in complex works. --Jan Kameníček (talk) 06:12, 28 July 2019 (UTC)
    I think it's simpler to leave it as Jan wrote it to ensure that each proposal has its own number, and I've reverted it to that accordingly. —Nizolan (talk) 12:53, 28 July 2019 (UTC)
    No. 4 revised after discussion below, to make it more clear. --Jan Kameníček (talk) 05:11, 30 July 2019 (UTC)
I support #2, [edit: #4 per Jan's rewording below,] or, failing that, retaining the status quo per #1: be consistent but do not enforce curly quotes. I strongly disagree with #3 being the "simplest proposal"; given that the vast majority of texts here are typeset using curly quotes this is effectively the same as enforcing their usage and would sum to a potentially enormous amount of extra work. Many OCR programmes, such as Google's, do not automatically add curly quotes so this would need to be done manually. I don't see the point of #4. Nizolan (talk) 17:54, 27 July 2019 (UTC)
To explain the point: I believe that contributors should be allowed to use any quotes that are presented in a work (e. g. curly), but they should not be forced to it, and should be allowed to use straight quotes only, if they prefer so. (this is ensured by no. 2, but not by no. 3). On the other hand I do not consider it good to allow to use curly quotes if there are straight quotes in the original (this is ensured by no. 3, but not by no. 2). Therefore I suggested the new point, which ensures this + recommends how to deal with complex works where contributors are not able to reach consensus which quotes to use. Number 2 gives the contributors the right of choice, but does not deal with the situation when the contributors have different opinions. Thinking about it again, it can happen not only with large works, but with any work where 2 or more people contribute, so I am rewording the point for "... If two or more people contribute to one work and are not able to reach consensus what kind of quotation marks to use, straight quotation marks are recommended." --Jan Kameníček (talk) 15:31, 28 July 2019 (UTC)
@Jan.Kamenicek: I see your point regarding curly quotes where typewriter quotes are used in the original. I would say as currently worded #4 is problematic for a number of reasons, though. As worded, it seems to forbid using straight quotes unless they are present in the original, as per #3. I'm not sure that works that are internally inconsistent are enough of a problem to need special mention, but as currently worded the suggestion doesn't say anything about what to do when they are. The last point seems over-complicated, imo: Wikipedia's "don't mess around with an article's established formatting preferences" seems like the most straightforward way to prevent these disputes.
I would personally firm it up along these lines: "1. Curly quotes may be used if and only if the work being typeset consistently uses curly quotes. 2. The marks used in proofread material should be consistent. Contributors to projects that have already started should ensure that they follow whatever convention has already been adopted in the work." And perhaps a final clarification for exceptional cases where curly or straight quotes are required in a particular instance regardless of general convention (e.g. in a transliteration system that distinguishes them) would also be helpful. —Nizolan (talk) 00:58, 29 July 2019 (UTC)
@Nizolan: It definitely does not forbid straight quotes, it is explicitly written: "Use either straight quotation marks or the same quotation marks as the work presents". So if there are e. g. curly quotation marks in the work, contributors can use straight or curly. If there are double angle quotation marks, contributors can use straight or double angle ones. If there are straight quotation marks, they can use only straight. I think this point is quite clear.
The rule says what to do so that there were not internally inconsistent works. I do not think it is necessary to say what to do, if such situation occurs because somebody has broken the rule (e.g. a bot can be applied to make it right???).
As for "follow whatever convention has already been adopted in the work." This is exactly what I wanted to avoid, because IMO nobody should be forced to use curly quotes if they do not want to. Contributors who start a large work and know that they will need help of other contributors should also know that straight quotes are the default formatting and curly ones can be used only if they can ensure that the others are willing to follow. It is undesirable that somebody starts a new encyclopaedia, adds first 50 articles with curly quotes, and willy-nilly all others who come later have to follow it.
The rule "don't mess around with an article's established formatting preferences" does not help with works consisting of many articles, where different contributors might establish different formatting in different articles, which is undesirable. The last part of no. 4 is definitely open for better rewording, but the result should recommend straight quotes which can be changed in favour of the curly or other quotes only if the contributors are all able to agree on that.
I am not sure what you mean by "curly or straight quotes are required in a particular instance regardless of general convention", can you explain it in more detail, please? --Jan Kameníček (talk) 06:16, 29 July 2019 (UTC)
@Jan.Kamenicek: "as the work presents" means "use the formatting of the work", it doesn't give an option. If you meant something else it should be rewritten. On the last point, e.g. in transcription of Arabic ’ and ‘ represent different sounds; ' is used in some Russian transcriptions, etc., these must be preserved regardless of the general convention of the work. —Nizolan (talk) 10:34, 29 July 2019 (UTC)
@Nizolan: I am afraid I really do not understand the problem: "Use EITHER straight quotation marks OR the same quotation marks as the work presents." The option is clearly expressed.
As for the transcription symbols: these are not quotation marks and so imo they are not in the scope of this rule. --Jan Kameníček (talk) 12:22, 29 July 2019 (UTC)
@Jan.Kamenicek: I'll try to explain the problem for you: either ... or can be inclusive or exclusive. If read as exclusive, then your guideline states that if the usage is consistent, use the same marks; if it isn't, use typewriter marks. So, based on what you want it to say, it is not clearly expressed. (Edit: Also just to add, if it did just say "Use EITHER straight quotation marks OR the same quotation marks as the work presents" I think that would be fine—the problem comes with the "under the condition…" modifier.) —Nizolan (talk) 12:41, 29 July 2019 (UTC)
@Nizolan: Now I see your point. So what would you say to the following wording: "It is allowed to use either straight quotation marks or the same kind of quotation marks as the original work presents, with the following condition: Other than straight quotations marks are permitted only if it can be ensured that they will be used consistently in the whole work. If such consistency cannot be ensured (e. g. because of a large number of contributors to the work or because of disagreement of some of them), straight quotation marks are recommended." --Jan Kameníček (talk) 15:47, 29 July 2019 (UTC)
That seems decent to me. I'll add it as another option I'd support. —Nizolan (talk) 17:08, 29 July 2019 (UTC)
I support this point (no. 4) too, or, if it does not succeed, no. 3. --Jan Kameníček (talk) 05:20, 30 July 2019 (UTC)
I support statement 2, and do not support the others. In my opinion, imitating the source with respect to straight vs curly quotes is like imitating the source with respect to serif vs sans serif font. I would word statement 2 like follows: Either straight quotes or curly quotes are acceptable, but a consistent style of quotes should be used within a single work. Straight quotes are advised in collaborative projects to ensure consistency. —Beleg Tâl (talk) 12:29, 30 July 2019 (UTC)
Also, guillemets and other marks used for quotations are completely different characters, and should be used as in the source regardless of the straight-vs-curly conventions being used. —Beleg Tâl (talk) 12:32, 30 July 2019 (UTC)
Agreed on this. —Nizolan (talk) 23:32, 30 July 2019 (UTC)
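As an illustration of the "consistent style within a single work" wording suggested above, here is a minimal Python sketch of how such a check might look. Everything in it is an assumption made for illustration: the function names `quote_styles` and `is_consistent` are hypothetical, a "work" is modelled as a plain list of page strings, and only " ' “ ” ‘ ’ are counted, so guillemets and other marks are ignored in line with the remark that they fall outside the straight-vs-curly question.

```python
# Hypothetical helper names; a "work" is assumed to be a list of page texts.
STRAIGHT = {'"', "'"}
CURLY = {"\u2018", "\u2019", "\u201C", "\u201D"}  # ‘ ’ “ ”


def quote_styles(pages):
    """Return the set of styles ('straight', 'curly') found across all pages."""
    styles = set()
    for text in pages:
        chars = set(text)
        if chars & STRAIGHT:
            styles.add("straight")
        if chars & CURLY:
            styles.add("curly")
    return styles


def is_consistent(pages):
    """A work is treated as consistent if at most one style occurs in it."""
    return len(quote_styles(pages)) <= 1


if __name__ == "__main__":
    work = ['"Hello," she said.', "\u201CGoodbye,\u201D he replied."]
    print(quote_styles(work), is_consistent(work))
```

This sketch assumes plain rendered text. Raw wikitext uses pairs of straight apostrophes for italics and bold, so a real check would have to strip that markup first or it would report straight quotes on nearly every page.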
┌──────┘
@Beleg Tâl: Various kinds of quotes are different characters and have their own Unicode or html codes. If you change Serif into Sans serif, only shapes of letters change. If you change one kind of quotes for the other, you changed the chosen character. If you change the kind of font used for quotes, their shape may change but the characters still stay.
It is also not clear what is meant by "curly" quotes, often mentioned here, since several characters used for quoting have curly shapes, and can be combined in various ways, e.g. ” ” (&rdquo; &rdquo;); ‟ ” (&#8223; &rdquo;); „“ (&bdquo; &ldquo;); „” (&bdquo; &rdquo;); and their single variants like ’ ’ (&rsquo; &rsquo;) ...
What is more, various national typographic systems use different kinds of quotes and so the chosen kind can sometimes also have this national connotation (which is the reason why I decided to use „“ in Page:Guide to the Bohemian section and to the Kingdom of Bohemia - 1906.djvu/16 and the following pages, as it was published in Bohemia and so uses the kind of quotes typical for this region.) --Jan Kameníček (talk) 07:18, 31 July 2019 (UTC)
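The point that these marks are distinct characters with their own codes can be checked with a short Python sketch. It uses only the standard-library `html.unescape` and `unicodedata.name`; the helper name `describe` and the choice of entities listed are assumptions made for illustration.

```python
import html
import unicodedata

# Entities mentioned in the comment above, plus the two straight marks.
ENTITIES = ["&ldquo;", "&rdquo;", "&lsquo;", "&rsquo;",
            "&bdquo;", "&#8223;", "&quot;", "&apos;"]


def describe(entity):
    """Decode one HTML entity and return (code point, official Unicode name)."""
    ch = html.unescape(entity)
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


if __name__ == "__main__":
    for entity in ENTITIES:
        code, name = describe(entity)
        print(f"{entity:8} {code}  {name}")
```

Each entity decodes to a different code point, which is the sense in which swapping one kind of quote for another changes the character itself and not just its shape.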
@Jan.Kamenicek: Curly quotes in this discussion are the ones that can be replaced with " and '—specifically these: “ ” ‘ ’ and possibly . Alternative quotation symbols like should never be replaced with these upper quotation marks. The fact that there does not exist any "straight" variant of is, as I said earlier in this discussion, one of the biggest reasons curly quotes should be permitted. However, in general, the appearance of vs " in a source scan, like the appearance of a vs ɑ, or s vs ſ, is generally an accident of publication and typeface, and is therefore beyond what Wikisource editors should be expected to reproduce. —Beleg Tâl (talk) 12:42, 31 July 2019 (UTC)
I modified Point 5 to reflect this. Levana Taylor (talk) 17:04, 31 July 2019 (UTC)

I have added another point, which positively encourages the use of curly quotes, just to round out the set of proposals by going the maximum distance. I'm not saying this is my favorite option, but I think it's at least a plausible one. Anyone who can think of a less extreme form of this option should add it as a subsection. Levana Taylor (talk) 16:51, 29 July 2019 (UTC)

Add me as someone who is in favor of typographic quotation marks and deprecating straight ones. —Justin (koavf)TCM 06:56, 2 August 2019 (UTC)

  • As there has been a recent dearth of discussion, I will follow the beliefs of the majority of editors and modify the style guide accordingly. The new wording is a simplified version of no. 4. unsigned comment by TE(æ)A,ea. (talk) 23:49, 11 August 2019‎.
Hang on, it’s not decided which option we prefer yet! For example, I have not yet put on record that I like number 3 best, followed by number 2, and don’t favor number 4. Levana Taylor (talk) 10:32, 12 August 2019 (UTC)
I have been aware of the above discussion, but have not contributed as I was waiting for a formal proposal. All that's happened so far is a discussion on how to word the proposal. Once this has been decided, then a formal proposal with the suggested options needs to be made in a new section away from this tldr discussion. Beeswaxcandle (talk) 07:09, 13 August 2019 (UTC)
I've just been alerted to this discussion, and have added my support for flexibility for well-established projects to the RFC above. In particular, the Wikisource:WikiProject 1911 Encyclopædia Britannica has preferred curly quotes and apostrophes for years. DavidBrooks (talk) 16:15, 28 August 2019 (UTC)

I thought I’d add another point. Any proposal that quotation marks should be represented by straight quotes would, I assume, be accompanied by requiring that apostrophes and embedded quotations be rendered using the straight apostrophe. If that is indeed part of the proposal, then it must include guidance on what to do with various forms of raised, reversed, and turned comma. Marks like Arabic rough breathing, Hebrew ayin (in Latin alphabet transcription), and the Gaelic raised “c” all appear in 1911 Encyclopædia Britannica, and are all properly rendered in the printed version. These various marks have distinct Unicode points. To me, rendering these as the ASCII apostrophe is simply semantically wrong—which is what leads me to consider straight quotes and apostrophes as equally semantically wrong and ahistorical. We shouldn’t be hamstrung by the limitations that ANSI suffered from in the 1960’s.

Agreeing with you, and would like to mention that a single-quote and an apostrophe are two distinct punctuation marks, even if they resemble each other a lot. Rendering both as a straight mark collapses their appearance even more than necessary: with a curve, at least you can tell the difference between an apostrophe and a single quote at the left side of a word. Levana Taylor (talk) 16:57, 28 August 2019 (UTC)
@DavidBrooks: I think it goes without saying that marks like Unicode ayin ʿ that are formally distinct should be represented as such regardless of the guidelines on straight quotes—they are different symbols. The uncertainty, which I mentioned somewhere above, is over cases where a work uses curly quotes to represent something like ayin ‘ vs aleph ’. In that case, I've suggested that since turning these into straight marks would remove necessary semantic information, the appropriate curly mark must be preserved in the transcription regardless of the general guideline or whatever quotation style is being used in transcribing the work itself. —Nizolan (talk) 23:48, 29 August 2019 (UTC)
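The loss of information described here can be shown with a small Python sketch. The `straighten` helper is hypothetical (it is not an existing script on this site), and the two-character sample strings are made up purely to stand in for transliterated words; the only assumption about real usage is the one stated above, that some works use ‘ for ayin and ’ for aleph.

```python
# Hypothetical "straightening" pass: maps curly quotes to their ASCII forms.
STRAIGHTEN = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # ‘ ’ -> '
    "\u201C": '"', "\u201D": '"',   # “ ” -> "
})


def straighten(text):
    """Replace curly quotation marks with straight ones; leave all else alone."""
    return text.translate(STRAIGHTEN)


if __name__ == "__main__":
    ayin_as_quote = "\u2018a"    # ‘a  (ayin written with a left single quote)
    aleph_as_quote = "\u2019a"   # ’a  (aleph written with a right single quote)
    # After straightening, the two transliterations can no longer be told apart.
    print(straighten(ayin_as_quote) == straighten(aleph_as_quote))

    ayin_letter = "\u02BFa"      # ʿa  (MODIFIER LETTER LEFT HALF RING)
    aleph_letter = "\u02BEa"     # ʾa  (MODIFIER LETTER RIGHT HALF RING)
    # The dedicated modifier letters are untouched, so the distinction survives.
    print(straighten(ayin_letter) == straighten(aleph_letter))
```

Under these assumptions, a blanket conversion is only safe for the formally distinct letters; where a source uses the quotation-mark characters themselves to carry the distinction, they would need to be excluded from any such pass.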