Inductiveload User Area

WELCOME to my user talk page. Feel free to leave me a message if there is a problem or you would like my help, or anything else.

I am also active on Commons. If you would like help with a file I uploaded or would like me to make a file for you, please ask at my user talk page there. If the request is Wikisource-centred, ask here.

Anything you write on this page will be archived, so please be polite and don't write anything you will regret later! My purpose here is to make interesting and useful documents open to the public. I am never trying to make trouble, and any problems can almost certainly be resolved quickly and easily if everyone stays calm.

Please sign your posts by typing four tildes (~~~~) after your post, and continue conversations where they start. This helps to keep discussions coherent for future readers. If I leave a message on your page, then please reply there. My replies to messages left on this page will be here.



Greek template and serif font display

Hi, any idea why Greek fonts (using the {Greek} template) no longer display with serifs? It started happening a few days ago. DivermanAU (talk) 03:53, 30 November 2021 (UTC)

The serifs disappeared for me a month ago, when the template was altered, and returned when I corrected the template to what it was before. Since they're now displaying for me, but not for you, then I would first suggest clearing your browser cache. --EncycloPetey (talk) 04:43, 30 November 2021 (UTC)
It's also sans for me if I remove my personal CSS, because the first in the old Template:Greek/fonts.css list that I have is "DejaVu Sans" (I imagine I share this with all Linux users; Windows users without special fonts installed will probably get Arial Unicode MS, but I'm not sure). As usual, a knee-jerk reversion as a first act is not particularly constructive. A constructive thing to have done here would have been to say what font your browser was actually using from the "styles.css" CSS, and we could have addressed it properly.
@DivermanAU, please liaise directly with @EncycloPetey to find a font ordering that works for you both and please also bear in mind that most of the fonts in the list are not installed by most users. I have my own CSS anyway, so Works For Me (TM) whatever you do. Inductiveloadtalk/contribs 10:04, 30 November 2021 (UTC)
@EncycloPetey do you intend to address @DivermanAU's problem? Reverting something implies to me that you are willing to take some level of responsibility for it, and I don't want to get in your way if you feel you have a better solution. Inductiveloadtalk/contribs 08:45, 2 December 2021 (UTC)
Any plans to reinstate Polytonic template @EncycloPetey? The way it stands currently the redirect to 'Greek' badly affects the look of Ancient Greek-style text. I've edited hundreds of pages in the past of the 1911 Encyclopædia Britannica that have Greek text to use 'Polytonic' to match the print. DivermanAU (talk) 11:27, 10 December 2021 (UTC)
Originally, we had {{polytonic}} for ancient and pre-20th century Greek. The template {{Greek}} was for modern Greek. It looks as though Inductiveload has changed that, though I do not know why. Perhaps we need to return to the old templates. --EncycloPetey (talk) 22:22, 10 December 2021 (UTC)
This is not correct. {{Greek}} was never for modern Greek, it has always applied the lang code grc. That's ancient Greek. Inductiveloadtalk/contribs 22:34, 22 December 2021 (UTC)
Can we please have the {{polytonic}} reverted to its previous behaviour of showing Ancient Greek text in polytonic format? The current re-direct to {{Greek}} seriously affects the display of ancient Greek text. Thanks DivermanAU (talk) 22:04, 22 December 2021 (UTC)
@User:DivermanAU What do you mean by "polytonic format"? In serifs? There's a three way conflation here between the encoding, the font and the lang code. Was it working for you before EP reverted it? Because it was originally broken for me, and is broken again after the revert (which is to say, it's using a sans font for me, but AFAICT that's never actually been the intention). As far as I was aware from our discussions before, it had been working for you as well until the recent revert? Inductiveloadtalk/contribs 22:43, 22 December 2021 (UTC)
By "Polytonic format" I mean the font display, currently as shown when using {{Polytonic2}}, serifs and variable width lines (for an example, see User:DivermanAU where I compare). EP didn't revert the {{Polytonic}} template, his last edit to it was in November 2018. I am requesting that {{Polytonic}} be reverted from a {{Greek}} redirect back to the November 2018 version so that Ancient Greek text displays in the Ancient Greek font style again. I don't know why the font display changed, but I know it's not a Windows issue as my ChromeBook has the same problem. DivermanAU (talk) 23:18, 22 December 2021 (UTC)
Yes, but it's not clear what the difference is supposed to be between {{Greek}} and {{polytonic}}. They're both just ways to say "ancient greek" (i.e. grc), but they used to be completely separate for no real reason. "Polytonic" or "not" is an encoding thing, not a language thing, and it sounds to me like we should bind the "serif-y" display of the fonts against the language (i.e. grc), as opposed to the encoding. I.e. all Ancient Greek, polytonic or not, is in the same font.
So then the question is whether you want Ancient Greek to be styled as serif or not. I think yes. If you do, the November revert has broken that for both templates for me and you, it seems, though apparently(?) EP is fine; perhaps they have one of the "special" fonts at the head of the list installed. For me, I get DejaVu Sans; I don't know what you get.
If the answer is "just make all Ancient Greek serif", then the November revert was wrong because it demonstrably does not work and places many sans fonts first. It should instead have been an adjustment of the fonts in Template:Greek/styles.css to ensure it also covers whatever system and font palette EP has, which I don't know and which hasn't been shared with us.
By the way, the "old status quo" you seek was actually still pretty bad for at least some people (like me) because the default polytonic font was always a completely dreadful font on this computer (I think it used to be Code2000, but I have changed a few fonts recently for other reasons, so now it's coming up with something else, which is certainly better than it used to be).
Can we at least be clear on what you were seeing before November? Because I understood then that it was working at least for you (and it was working for me, or at least it was in serifs).
It might actually be a simple change: does this work? "Ἀθῆναι"? Inductiveloadtalk/contribs 00:00, 23 December 2021 (UTC)
Thanks for looking into this further. Yes, there may be some confusion about terminology: I was under the impression that {{Polytonic}} was supposed to display the 'serif-style' fonts. I'm not focussed on the language encoding, but I understand it makes sense to flag the language as a particular type. Earlier than November, the serif-style fonts were displaying when using the {{Polytonic}} template (which was, and is still currently, a redirect to the {{Greek}} template) but that seemed to change maybe around mid-November. EP tried a change to the Greek template on 28 November, as they had noticed that serif fonts weren't showing either. But your test above is displaying the serif-type fonts for me, so if they can be implemented, that would be great! Thanks again for your efforts. DivermanAU (talk) 03:23, 23 December 2021 (UTC)

┌─────────┘

{{Greek}} and {{Polytonic}} may have set the same lang code, but they set different CSS class names: .wst-lang-grc and .polytonic. Template:Greek/fonts.css has different font lists for the two classes, and uses .grc for the non-polytonic class. That means there is plenty of opportunity for differing behaviour just based on the CSS differences. And since the class name now mismatches, the style is never applied, so the current behaviour relies entirely on whatever automagic the webfonts stuff happens to trigger based on @lang=grc. This should be adding a webfont download of GentiumPlus, but it may be modifying that based on what @font-family is being applied through other means (it was designed to look at inline font specifications; how or whether that's been adapted to TemplateStyles, or site CSS for that matter, I haven't been able to determine).

#1 .grc Ἀθῆναι
#2 .polytonic Ἀθῆναι
#3 .grc+@lang Ἀθῆναι
#4 .polytonic+@lang Ἀθῆναι
#5 @lang Ἀθῆναι
#6 {{lang}} Ἀθῆναι
#7 @lang+@font-face (.grc) Ἀθῆναι
#8 @lang+@font-face (.polytonic) Ἀθῆναι
#9 @lang+@font-face (.wst-lang-grc) Ἀθῆναι

I am seeing (MacOS) definite differences between the display under the font lists for .grc and .polytonic. With @lang the differences disappear, and both look like just .grc. With just @lang it looks like it does with just .polytonic. The variants with @lang and an inline @font-face look the same as with @lang and .grc/.polytonic, so I'm guessing ULS reads and adapts to font-face rules present both inline and in TemplateStyles. So the major difference to consider is probably using just @lang (no local styling) vs. @lang + some local styling. And that depends on whether the same display quirks apply for all browsers and platforms. --Xover (talk) 10:25, 23 December 2021 (UTC)

@Xover Greek and Polytonic are the same thing: the latter is now a redirect to the former.
The problem is that the template was reverted carelessly without actually looking at what it's doing. So the template now doesn't match the CSS at all. The CSS for wst-lang-grc at Template:Greek/styles.css has only one rule, the idea being that all Ancient Greek (grc) would style the same. No template is using the class names at /fonts.css (which is, obviously then, why it's not working). In theory, grc shouldn't need any templates other than to set the lang attribute: ULS does the rest and allows user control. But...
There is a glaring deficiency of ULS UI (phab:T289777): there's no obvious way for a user to configure the webfont for any language that's not the system interface language without a faff of changing the interface language to Ancient Greek, then changing the font, and then returning to English. But, we could indeed set Gentium as a default for users who have not set any other preference: the code looks something like this:

Example

	mw.loader.using( 'user.options', function () {

		// The option is stored as a JSON string, so the fallback must be
		// the string '{}' (JSON.parse on a bare object would throw).
		var prefs = JSON.parse( mw.user.options.get( 'uls-preferences' ) || '{}' );

		// Make sure the nested webfonts.fonts map exists before reading it.
		if ( !prefs.webfonts ) {
			prefs.webfonts = {};
		}
		if ( !prefs.webfonts.fonts ) {
			prefs.webfonts.fonts = {};
		}

		var changed = false;

		// the user hasn't specifically set a font here, use Gentium
		if ( prefs.webfonts.fonts.grc === undefined ) {
			prefs.webfonts.fonts.grc = 'GentiumPlus';
			changed = true;
		}

		if ( changed ) {
			var val = JSON.stringify( prefs );
			// Persist the new preference server-side via the options API.
			mw.loader.using( 'mediawiki.api', function () {
				var params = {
					action: 'options',
					optionname: 'uls-preferences',
					optionvalue: val
				};
				new mw.Api().postWithToken( 'csrf', params )
					.then( console.log );
			} );

			// Also update the in-page copy so it applies to this page view.
			mw.user.options.set( 'uls-preferences', val );
		}
	} );
Probably needs more thought for logged-out users? Inductiveloadtalk/contribs 10:41, 23 December 2021 (UTC)
Well, my thought was… Do we need to faff about with it? Does using only @lang=grc alone produce reasonable-ish results (through ULS's meddling)? It doesn't have to be perfect, or any given user's most preferred font, just so long as it is acceptable-ish and works consistently. Then we can lobby ULS for a better default if we really need to. I have no idea how this is actually supposed to look, but given the main complaint seems to be serif vs. sans, the presence of several sans fonts in our myriad @font-face rules is rather an odd choice. Hence, my theory that we're 1) making this more complicated than it needs to be, and 2) creating weirdness and inconsistencies where none need be. But, hey, Greek is almost entirely Greek to me, whether modern or ancient, so I may just be confused. Xover (talk) 11:46, 23 December 2021 (UTC)
Oh, and none of the above table used the templates (apart from raw {{lang}}). I was referring to the CSS classes, and .grc and .polytonic (and .wst-lang-grc) use different @font-face specifications. Xover (talk) 11:54, 23 December 2021 (UTC)
@Xover I completely get the idea there, and broadly it's my feeling too: setting any fonts in the markup is presumptive and takes away from the user-agent's control - we should set @lang=grc and let it be configured from there (with a ULS polyfill, perhaps). And also setting actual font-families is always fraught unless you're serving the webfonts because there are no guarantees about who has what. But, clearly, that is not a universally-shared feeling.
I suspect that the serif thing is just because lots of people assumed polytonic meant serif, when it actually just happened to mean serif for them. It did come out as serif for me, albeit in a very ugly font.
BTW, one more data point for the pile: on Arch/Firefox and without other ULS shenanigans or classes, just @lang=grc comes out as DejaVu Sans because the body { font-family: sans-serif; } is "winning" for me. Though since I have now set my ULS grc font to Gentium, it does look rather nice if I allow ULS to run (or I could install ttf-gentium-basic). Inductiveloadtalk/contribs 12:06, 23 December 2021 (UTC)
@DivermanAU, @EncycloPetey: Which, if any, of the examples in the table above looks as you expect "polytonic" to look? If none do, is there any among them that look at least minimally acceptable even if not actually good? Xover (talk) 12:53, 23 December 2021 (UTC)
Two issues here: (1) polytonic, @lang, and {{lang}} look different from the other options, but (2) none of them have serifs. The serif option is not supported in this namespace, so no test in this namespace will work properly. --EncycloPetey (talk) 20:35, 23 December 2021 (UTC)
This is completely unrelated to the serif option. That just sets serif on the whole content block. Since the templates use a class or an attribute selector, they'll be more specific and override the global settings (which is why forcing font families on readers causes these issues in the first place). Inductiveloadtalk/contribs 21:28, 23 December 2021 (UTC)
Thanks Xover for investigating. For me, 5 out of the 8 table examples appear in "polytonic" (i.e. with serifs) — these are: .grc, .grc+@lang, .polytonic+@lang, "@lang+@font-face (.grc)" and "@lang+@font-face (.polytonic)". Or, if it helps, numbers 1,3,4,7,8 in the table appear "polytonic". This was on a Windows 10 Enterprise PC (both using Chrome and Edge browsers), same result for a second Windows 10 Home PC and a work Win10 Enterprise PC. On a MacBook Pro, none of the fonts appear in 'polytonic', but polytonic fonts didn't display before either — presumably because no 'polytonic' style fonts are installed on it. DivermanAU (talk) 21:02, 23 December 2021 (UTC)
Update - I hope you don't mind but I fixed a typo in your table, you had "plytonic" instead of "polytonic" in the second row, so this now means I see 'polytonic' (or serif fonts) in examples 1,2,3,4,7,8. DivermanAU (talk) 21:14, 23 December 2021 (UTC)
For me (without any ULS override), it's like this: phab:F34894469. 2, 4, 8 and 9 end up in serif fonts (which are actually a mixture of P052 and DejaVu Serif, except #9 which is all DejaVu Serif, FWIW). I added one more option at the end (#9) which is what would be displayed if {{Greek}} pointed again to the CSS that matches the class set in the template (it's the same as it was in November, but the last fallback entry is serif now - probably that's the actual fix that should have been made). Inductiveloadtalk/contribs 21:18, 23 December 2021 (UTC)
@Xover: My view of the table is below:
  (the ninth recently added looks the same as the eighth to me). DivermanAU (talk) 22:14, 23 December 2021 (UTC)
On my ChromeBook, the display was virtually identical to that of Inductiveload (rows 2,4,8,9 in the table appear with serifs). DivermanAU (talk) 01:30, 24 December 2021 (UTC)
One more: an Android device (who knows what fonts it actually has!) on Firefox, only #9 is serif for me. Inductiveloadtalk/contribs 08:52, 24 December 2021 (UTC)

┌───────────────────┘
Let's see if I've got this correct:

#  | Code                             | Rendering | DAU (Win10) | DAU (ChromeBook) | IL    | IL & DAU (Android) | EP
#1 | .grc                             | Ἀθῆναι    | OK          |                  |       |                    |
#2 | .polytonic                       | Ἀθῆναι    | OK          | OK               | OK[1] |                    |
#3 | .grc+@lang                       | Ἀθῆναι    | OK          |                  |       |                    |
#4 | .polytonic+@lang                 | Ἀθῆναι    | OK          | OK               | OK[1] |                    |
#5 | @lang                            | Ἀθῆναι    |             |                  |       |                    |
#6 | {{lang}}                         | Ἀθῆναι    |             |                  |       |                    |
#7 | @lang+@font-face (.grc)          | Ἀθῆναι    | OK          |                  |       |                    |
#8 | @lang+@font-face (.polytonic)    | Ἀθῆναι    | OK          | OK               | OK[1] |                    |
#9 | @lang+@font-face (.wst-lang-grc) | Ἀθῆναι    | OK          | OK               | OK    | OK                 |
  • [1]: This is all in serifs but mixes two fonts within a single word: Ἀ and ῆ are in DejaVu Serif, something else for the others

@EncycloPetey: Based on your comment above it sounds like none of these show up in serif for you. Is that correct? And as a separate question: I think I pick up from discussions elsewhere (I could be wrong) that you disagree that the "correct" display is using serifs? If I understood that correctly, are there any of the variants above that display correctly according to how you personally think they should look?

As I see it there are two separate issues to address: 1) getting a template/font setup that displays polytonic Greek correctly in the sense "capable of displaying all five diacritics"; and 2) showing it in a typeface that is visually pleasing (a subjective issue). My suspicion is that the two issues are being conflated, and that if we can disentangle them we'll be able to nail down the first issue properly. And if we do that we can more easily try to address the second through community discussion of what the default should be, and which aspects need to be configurable. And let me just say from the outset that any @font-face specification that contains a mix of serif and sans typefaces is a big huge red flag: the results of that in any given web browser are going to be essentially random and unpredictable. --Xover (talk) 10:37, 24 December 2021 (UTC)
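To make the "mixed serif and sans list" concern concrete, here is a minimal sketch of how a font list gets resolved against whatever a reader happens to have installed. The helper name resolveFont, the example stack and the sets of installed fonts are all hypothetical (this is not the real list from Template:Greek/fonts.css), and real browsers additionally fall back per character, which is how two fonts can end up mixed within one word.

```javascript
// Hypothetical helper: return the first family in a CSS-style font stack
// that is available on the reader's system.
const GENERIC_FAMILIES = new Set(['serif', 'sans-serif', 'monospace']);

function resolveFont(stack, installed) {
  for (const family of stack) {
    // Generic families always resolve to something, so they end the search.
    if (GENERIC_FAMILIES.has(family) || installed.has(family)) {
      return family;
    }
  }
  return null;
}

// Illustrative stack that mixes sans and serif faces (an assumption, not
// the actual template CSS).
const mixedStack = ['DejaVu Sans', 'Arial Unicode MS', 'Gentium Plus', 'serif'];

// Two made-up readers with different fonts installed.
const linuxLike = new Set(['DejaVu Sans', 'DejaVu Serif']);
const withGentium = new Set(['Gentium Plus']);

console.log(resolveFont(mixedStack, linuxLike));   // DejaVu Sans (sans)
console.log(resolveFont(mixedStack, withGentium)); // Gentium Plus (serif)
console.log(resolveFont(mixedStack, new Set()));   // serif (generic fallback)
```

The same stack gives a sans face to one reader and a serif face to another, which matches the pattern of reports in the table above.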

You are correct that none of the above versions display with serifs for me. The problem is that we have an option for turning serifs "on" or "off", but this functionality does not work for polytonic Greek. When the {{polytonic}} template was redirected to the {{Greek}} template, and that template was changed, any option for serifs was lost for me. The non-serif fonts that I get make it nearly impossible for me to read diacritics. For example, I have extreme difficulty distinguishing between smooth and rough breathing marks in the non-serif fonts. There are many other readability issues for me with no ability to read Greek with serifs. For me, the primary issue is readability.
A second issue is that, since serifs cannot be turned on or off in the Greek text for me, if I turn on serifs when reading a work, serifs only activate in certain namespaces, but not others. I cannot get serifs in the Page namespace, or the Template namespace, or any other namespace. This is a general issue with the serifs option. But, along with that issue, when I do turn on serifs in the Main namespace for a work, serifs for the Greek text do not turn on, so I get serifed Latin text, but serif-free Greek text on the same page. If a user turns serifs on, it should turn serifs on, not turn some serifs on and not others.
If we're going to tell users "you can turn serifs on or off at your discretion", then that functionality should actually work. --EncycloPetey (talk) 19:17, 24 December 2021 (UTC)
I've done further testing on other devices. On an Apple iPad, the only font displayed with serifs was #9. On a Windows 8 laptop, the serif fonts were #2, 4, 8 & 9 (same as for ChromeBook) although #9 was a different font (it looked a little bolder but perfectly acceptable). So, I believe we have enough evidence to change the {Polytonic} template to that of #9 above. This would enable most platforms to display Ancient Greek text in the Ancient Greek style. DivermanAU (talk) 20:57, 4 January 2022 (UTC)

Going back to first principles (Take 2)

Ok, there seem to be multiple issues to figure out here, but to try to tackle them a bit at a time…

@DivermanAU, @EncycloPetey, and @Inductiveload: Does the below box seem to show the Greek in an at least reasonable typeface?

ἀἁἂἃἄἅἆἇὰάᾰᾱᾶᾳᾲᾴᾀᾁᾂᾃᾄᾅᾆᾇᾷ
ἠἡἢἣἤἥἦἧὴήῆῃῂῄᾐᾑᾒᾓᾔᾕᾖᾗῇ
ἰἱἲἳἴἵἶἷὶίῐῑῖῒΐῗ • ἐἑἒἓἔἕὲέ
ὀὁὂὃὄὅὸό • ῥῤ
ὑὓὕὗὺύῠῡὐὒὔὖῦῢΰῧ
ὠὡὢὣὤὥὦὧὼώῶῳῲῴᾠᾡᾢᾣᾤᾥᾦᾧῷ
'ΑἉ'Ε'Ε'ΙἹ'ΟὉὙ'Υ
ᾺἈἉῈἘἙῚἸἹῸὈὉῪὙ
Ἀ Ἐ Ἠ Ἰ Ὀ Ὠ (Unicode pre-composed Greek capital letters with smooth breathing (psili))
Ἀ Ἐ Ἠ Ἰ Ὀ Ὠ (Unicode Greek capital letters with combining comma above)
Ἁ Ἑ Ἡ Ἱ Ὁ Ὑ Ὡ (Unicode pre-composed Greek capital letters with rough breathing (dasia))
Ἁ Ἑ Ἡ Ἱ Ὁ Ὑ Ὡ (Unicode Greek capital letters with combining reverse comma above)
Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω (Unicode Greek capital letters)
α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ/ς τ υ φ χ ψ ω (Unicode Greek minuscule letters)
ϐ ϑ ϛ ȣ Ϝ ϝ Ϲ ϲ ϖ ᾽

ᾺἈἉῈἘἙῚἸἹῸὈὉῪὙ — This line using {Polytonic2} template for comparison.

By my calculation the above should show in a serif font that supports all the characters present in the box (ignore the "•", it's just a marker I added for… reasons). It should also all be a single font (no sneaky characters gypped from a different font). --Xover (talk) 15:01, 5 January 2022 (UTC)

Yes, to me (on a friend's Windows 8 PC), the Greek text appears in a serif font, the different diacritics appear clear. DivermanAU (talk) 20:50, 5 January 2022 (UTC)
The lowercase letters that are shown do display correctly, however the font does not support smooth breathing marks on capital letters. The lowercase sigma looks odd, and there may be additional issues, but not supporting smooth breathing marks on capital letters is a serious enough issue. --EncycloPetey (talk) 23:35, 5 January 2022 (UTC)
@EncycloPetey I added an extra line to the sample text above (using characters from the "Greek" dropdown option). Do these look better? DivermanAU (talk) 06:06, 6 January 2022 (UTC). I also just added sample text using {Polytonic2} to compare. For me the last two lines of Greek text show curly smooth breathing marks correctly. DivermanAU (talk) 08:07, 6 January 2022 (UTC)
@EncycloPetey: There is (was) no sigma in the above letter specimen? It was just a semi-random selection of letters with diacritics that I grabbed for convenience (i.e. laziness).
But I have now added examples for all the vowels in capitals with rough and smooth breathing marks, using both pre-composed characters (a single Unicode code point that includes the diacritic) and combining characters (the base Greek character and a "combining" diacritic character as two code points that your browser / operating system combines on display). I have also added full-alphabet samples for upper- and lower-case letters without any diacritics for reference.
Could you take another look and see if we're getting close to something usable? The new samples are the last six lines (the ones that have a comment in parentheses after the sample). --Xover (talk) 08:58, 6 January 2022 (UTC)
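For anyone wanting to check which form a given sample uses, the pre-composed and combining variants described above can be told apart with standard Unicode normalization. This is a small sketch using only built-in JavaScript string methods; the helper name codePoints is made up for illustration.

```javascript
// Capital alpha with smooth breathing (psili), in both encodings.
const precomposed = '\u1F08';      // single code point: Ἀ
const combining = '\u0391\u0313';  // Α followed by COMBINING COMMA ABOVE

// Hypothetical helper: list the code points of a string as U+XXXX.
function codePoints(text) {
  return [...text].map(
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePoints(precomposed)); // [ 'U+1F08' ]
console.log(codePoints(combining));   // [ 'U+0391', 'U+0313' ]

// The two strings are not equal as stored...
console.log(precomposed === combining);                   // false
// ...but they are canonically equivalent after normalization.
console.log(combining.normalize('NFC') === precomposed);  // true
console.log(precomposed.normalize('NFD') === combining);  // true
```

Whether both forms then render identically still depends on the font and the text-shaping engine, which is what the paired sample lines are there to test.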
@EncycloPetey, @Xover, for me (both on Windows 10 and ChromeBook), the new samples look good to me — all diacritics look curly and all characters have serifs. DivermanAU (talk) 11:11, 6 January 2022 (UTC)
Looks good to me on desktop (at least, it's using Gentium, which it's loading via ULS, and it looks like this: phab:F34909344). This doesn't happen on the mobile subdomain, so it just uses DejaVu Serif (I think that is Firefox trying to be clever, since if I removed 'font-family: GentiumPlus;', it uses DejaVu Sans, the normal default). Rough breathings and sigmas look fine to me. Inductiveloadtalk/contribs 14:02, 6 January 2022 (UTC)
I've added one additional line of rare characters that have cropped up in works I've transcribed here. They look good to me, but @Inductiveload: and @DivermanAU: should look at that line too. Everything looks good to me now. --EncycloPetey (talk) 01:40, 7 January 2022 (UTC)
@EncycloPetey: — Those additional rare characters look good to me too (on both Windows 10 and ChromeBook). DivermanAU (talk) 03:23, 7 January 2022 (UTC)
@DivermanAU, @EncycloPetey, @Inductiveload: Awesome. Thanks! I've updated {{greek}} with this approach now, so hopefully it will now produce both more consistent and better results. I'll wait a while before deleting {{polytonic2}} and doing other cleanup in case there are other issues that crop up, but other than that I believe we're more or less where we need to be for right now (Yay!). But this whole system is really finicky and fragile (it involves interactions between about a gazillion different components, including the details of each user's operating system and web browser, and really esoteric features of them all) so in future, if similar problems crop up, it's probably best to take an analytical approach à la the above rather than reverting or making changes to various bits and pieces: the latter is apt to create both more problems and more confusion. Feel free to ping me, of course, if you think I can help, but I should note that in saying that I am in no way shape or form implying any particular expertise, just offering to help. :) Xover (talk) 09:44, 7 January 2022 (UTC)
@Xover thanks very very much for the effort untangling it! Inductiveloadtalk/contribs 09:51, 7 January 2022 (UTC)
@Xover: thanks for sorting this out. One minor query, the new {Greek} template now produces a display which shows a smaller font size than it used to. (See my talk page for a comparison, ninth line down; the {Polytonic2} template shows a 'normal' font size.) DivermanAU (talk) 19:29, 7 January 2022 (UTC)
@DivermanAU: The font size for all these is the same: calc(1rem * 0.85). But the other templates are picking a different font (which one depends on your OS and browser, so I can't check which; on my system it picks either Helvetica or Arial Unicode MS), and those may have both a larger x-height and a heavier stroke. This is generally the sort of variability one has to live with on the web since the reader's web browser is not directly under our control. Xover (talk) 20:10, 7 January 2022 (UTC)
@Xover: thanks for the explanation. But is there a reason why the font size is reduced to 85%? DivermanAU (talk) 20:23, 7 January 2022 (UTC)
@DivermanAU: You'd have to ask the WMF. They set everything to 85% of your base font size. The first thing any designer does is reduce the font size (typically to 80%) and increase the line-height (by 1.2–1.4). I think it's some kind of religious tenet or something. I mean, I'm sure it looks prettier in running text and all, but I've never quite understood why the browser defaults aren't a good enough default for most web sites. But I digress… We inherit that from the MediaWiki skin (and all the MW skins set this roughly the same). Xover (talk) 20:43, 7 January 2022 (UTC)
@Xover: can we adjust the font-size in the {Greek} template to 117% to compensate for the 85% in the MediaWiki skin? That would produce a 'normal' size font. DivermanAU (talk) 21:50, 8 January 2022 (UTC)
@DivermanAU: You misunderstand: it's not the Greek font that is set to 85% of the rest of the site; it's all body text on the site that is set to 85% of whatever font size you have set in your web browser. The Greek text has the exact same font-size specification as the surrounding text. Adjusting just the Greek text would bring it out of line with the surrounding text and make all formatting templates behave unpredictably. Xover (talk) 22:22, 8 January 2022 (UTC)
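The arithmetic behind this exchange can be sketched as follows. The 16px browser default is an assumption for illustration, as is the helper name effectiveSize; the 0.85 and 1.17 factors are the ones mentioned above.

```javascript
// Hypothetical helper: apply a chain of relative scale factors to a base size.
function effectiveSize(basePx, ...scales) {
  return scales.reduce((size, scale) => size * scale, basePx);
}

const browserBasePx = 16; // assumed browser default font size
const skinScale = 0.85;   // the skin's scaling of all body text
const proposedBump = 1.17; // the suggested compensation for Greek only

const bodyTextPx = effectiveSize(browserBasePx, skinScale);
const bumpedGreekPx = effectiveSize(browserBasePx, skinScale, proposedBump);

console.log(bodyTextPx.toFixed(2));      // 13.60
console.log(bumpedGreekPx.toFixed(2));   // 15.91
// The exact factor needed to undo the 85% scaling:
console.log((1 / skinScale).toFixed(3)); // 1.176
```

Under these assumptions the bump would put Greek at roughly 15.9px next to 13.6px body text, so it would no longer match the surrounding text, which is the objection raised above.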
@Xover: - thanks again for the explanation. I just saw that text with the new {Greek} template
ΑαΒβΓγΔδ — {Greek} template
 is smaller than before and smaller than using {Polytonic2}
ΑαΒβΓγΔδ — {Polytonic2} template
I guess it's the font itself that gets chosen when using {Greek} now (Gentium Plus) that is actually smaller; I get "Palatino Linotype" when using {Polytonic2}. DivermanAU (talk) 06:00, 9 January 2022 (UTC)

Google OCR

Hi, I am having problems with Google OCR. It often fails: in my bot's runs, several pages have needed more than 200 attempts to save a page with Google OCR. Right now it is not working on Spanish Wikisource or here, and I have to make a great many attempts before the OCR works. I don't know whether the problem is on my side or in the MediaWiki system. Do you know? Shooke (talk) 15:26, 18 December 2021 (UTC)

I don't know, it's something between the MediaWiki thumbnail service and the Google OCR service. The bug is tracked at phab:T296912, but I don't have any further clues to what's going on beyond what I wrote there (and I don't have access to the logs to figure it out for sure). In the meantime, you could try the Tesseract OCR instead? Inductiveloadtalk/contribs 15:53, 18 December 2021 (UTC)
Thanks for answering. Google's OCR is very good with Spanish, whereas Tesseract has quite a few shortcomings for this language. On December 19 it worked perfectly, with no errors, but today it does not. It seems that the error is intermittent. Shooke (talk) 02:38, 21 December 2021 (UTC)

Index:The Poems of John DonneVolume 2 - 1906.djvu

Hello. Index:The Poems of John DonneVolume 2 - 1906.djvu, created by Inductivebot, has recently been nominated for speedy deletion because the File:The Poems of John DonneVolume 2 - 1906.djvu does not exist. The summary of the file page says "File description for failed upload, pending server-side upload". What is the problem with the upload and is it going to be fixed? --Jan Kameníček (talk) 22:28, 21 December 2021 (UTC)

@Jan.Kamenicek sorry that was a borked redirect I think. The files should have been like Index:The Poems of John Donne - 1896 - Volume 1.djvu. Deleted. Inductiveloadtalk/contribs 22:33, 21 December 2021 (UTC)

Template:Auxiliary Table of Contents

Hello. It seems that something has gone wrong with the {{Auxiliary Table of Contents}}, see Page:Czechoslovak fairy tales.djvu/15. Is it possible that it is connected with this edit? --Jan Kameníček (talk) 12:33, 22 December 2021 (UTC)

@Jan.Kamenicek the issue is not related to that. The problem is that if you use unnamed parameters, whitespace is not removed. Lines that begin with whitespace are made into "pre" code-like blocks. You can fix it by removing the space before (shudder) the {{Dotted TOC page listing}}, or by using the parameter name (1).
However, I personally think that an even better solution would be to use the wst-toc-aux class like this, since then you don't have to break the table up or use templates that don't export properly:
{{TOC begin|width=100%}}
|+ {{larger|CONTENTS}}
{{TOC row 2dot-1|class=wst-toc-aux|Note (not in original TOC)|vii}}
{{TOC row 2dot-1|Foobar|18}}
{{TOC end}}
CONTENTS
Note (not in original TOC) . . . . . . . . . . vii
Foobar . . . . . . . . . . . . . . . . . . . . 18
Inductiveloadtalk/contribs 12:43, 22 December 2021 (UTC)
That solution looks good. Thanks, I think I will use it. However, the extra space did not cause any problems until recently, so hopefully there are no more pages like this where this problem has suddenly arisen. --Jan Kameníček (talk) 12:54, 22 December 2021 (UTC)
Page reworked, thanks for pointing this option out, I will remember it! --Jan Kameníček (talk) 13:09, 22 December 2021 (UTC)

A question about a deleted page

I have a question. I use anchors, unless there is a wikidata that can be filled with whatever is anchored. So, with that background I ask this:

Was this page attached to a wikidata? Early Spring in Massachusetts (1881)/Nuthatches-1854-02-24?--RaboKarbakian (talk) 16:26, 22 December 2021 (UTC)

It wasn't, but why on Earth would it be? It doesn't represent a discrete unit of text. If you need to reference it, it would be the edition Q-id (Q50824926), plus page(s) (P304), section, verse, paragraph, or clause (P958), line(s) (P7421), volume (P478), etc, and a reference URL (P854) that can include a fragment for the anchor. Spraying separate, reduplicated, transclusions of tiny snippets through the WS mainspace just so you can construct an item for it with a sitelink from Wikidata would be utter madness: putting the cart so far before the horse that you'd need another horse just to find it. Inductiveloadtalk/contribs 16:42, 22 December 2021 (UTC)
I don't know. It was on my watchlist, it was about nuthatches (a lovely little bird with a terrible nasal song). It was deleted: so I don't know if I created it or added to it or moved it, and I don't know what was there even. So, I eliminated everything I don't care about and arrived at the one thing I do care about which is a wikidata link.
Reasons to make a page for a sitelink include: biological descriptions of species, genus, or family which, even if they are not the type species still get cited often. Mathematical formulas, especially in original proofs or as manipulated from the original for use. Recipes. I am sure there are more but this is off the top of my head right now. But anything that could be isolated from the text and stand alone as a "something". Type species, type genus and type families are first and foremost on that list. Biologists have been referencing type species since the 1600s and now that technology can really make the reference (not just an L1753) rules made by "English literature majors" with deletion tools (and other implements of destruction) prevent it from happening.--RaboKarbakian (talk) 18:08, 22 December 2021 (UTC)
@RaboKarbakian Most of those things are things that you make a reference for, not an item. A paragraph of an edition is not a standalone concept that would be modelled by a Wikidata item. It is able to be represented in a triple store, of course, just like anything can be, but, for the same reason that paragraph 67 of Spot the Dog Goes to the Supermarket, 2nd ed. does not have its own item, it is not worthy of a Wikidata Q-id.
Even if you did make a Wikidata item for your snippet (perhaps it's very famous like a verse of the Bible*), you should not make an orphaned, duplicated Wikisource page simply so you can have a sitelink for that Wikidata item. Wikisource's layout is not driven by any perceived need to fill in sitelink boxes at Wikidata. If you need to reference a specific text location at Wikisource for a WD reference, then you can use the reference URL (P854) and an anchor.
(*) Actually, even though they have a Genesis 1:1 item, they do not have items for "Genesis 1:1 in the NIV/KJV/WTFBBQV". Inductiveloadtalk/contribs 18:43, 22 December 2021 (UTC)
See commons:Category:Aporocactus flagelliformis at the top, it's TOL. The last item is the species name and behind that is a little L. That L. links to commons:Carl von Linné. They have been making that "link" since the 1600s, they being the biologists. It is there so the original can be found. There is another author behind that, as that species has had an updated and improved description. It could just as easily link to the text of the first (and updated) description of that if it were here. A text I worked on here that had a bunch of those "firsts" had them so embedded in the text that it was just easier to make a stand alone page for the species. And also, the first genus which needs the species, in that same book. It was an important book, a not so interesting portion of a series of actually interesting books, and the book as a whole is good to have.
So, I recommend that you spend some years with tree of life information and how it is set up, the different trees and the authorities and how the information fits into the several trees and branches out and then express your opinions about what qualifies as a good stand-alone or not. Or, skip that couple of years and trust someone who has done that. TOL is kind of interesting in the database management sense. Also, how much even science is politicized. I often had to decide to go with the name being used by the closest group to where the species was found. Like, New Zealand for Antarctica, etc. It is difficult for me to call anything I need to make a decision about a "science", but there you have it. Things need to have a name if they are to be discussed. But I am getting away from the point.... It would not be orphaned in the greater sense of the word. It is more likely that the complete book be orphaned due to not being able to link the important/specific parts at wikidata. If wikidata would accept anchored web links, this discussion would not be happening.--RaboKarbakian (talk) 19:06, 22 December 2021 (UTC)
@RaboKarbakian I don't understand your point. I understand Linnaean classification and biological authorities perfectly well, thank you. Commons having a category for Disocactus flagelliformis is not a surprise. The appropriate Wikisource sitelink for Disocactus flagelliformis (Q310976), if any, would be Portal:Disocactus flagelliformis, if it existed, and not some random paragraph in an 1881 book where it was mentioned. What you might do is link or reference some claim (maybe a reference on taxon name (P225) or described by source (P1343)) to the Wikidata item for a WS edition and qualify that with a page number, and a direct URL, perhaps with an anchor.
Your assertion about not accepting "anchored links" is incorrect: Wikidata can and does store full URLs:
Both of these will accept fragments (i.e. #...) as well as query parameters.
If you mean that sitelinks cannot have fragments, then you are correct, but, despite however many years of whatever it is, you may have misapprehended what a sitelink represents: it is not a shorthand URL, it is a statement that the sitelinked pages all relate to the same concept. The only item that could reasonably have a sitelink to a page Early Spring in Massachusetts (1881)/Nuthatches-1854-02-24 would have been an item about that exact paragraph. That hypothetical item would probably have, amongst others, main subject (P921): Sitta (Q858577) (or maybe a specific nuthatch species if you knew which).
If you did actually have such a narrowly-defined item to fill a structural need (and you probably do not have such a need, but let's say you did), it still wouldn't justify either artificially splitting up WS works or creating a "shadow realm" of redundant transclusions at WS just because that would allow a sitelink at Wikidata. Inductiveloadtalk/contribs 19:50, 22 December 2021 (UTC)
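As an illustration of the "reference URL plus anchor" approach discussed in this thread, here is a minimal Python sketch that builds a full URL with a fragment, suitable as a value for reference URL (P854). The helper name `wikisource_anchor_url` is hypothetical, and both the page title and the anchor name are assumptions chosen for the example: it presumes the anchor lives inside the main work page rather than on a separate subpage.

```python
from urllib.parse import quote, urlsplit


def wikisource_anchor_url(page_title: str, anchor: str) -> str:
    """Build an English Wikisource URL for a page plus a fragment (anchor)."""
    # MediaWiki page URLs conventionally use underscores in place of spaces.
    path = quote(page_title.replace(" ", "_"), safe="/:()")
    return f"https://en.wikisource.org/wiki/{path}#{quote(anchor)}"


# Hypothetical example: an anchor placed inside the work itself,
# instead of a separate transcluded snippet page.
url = wikisource_anchor_url(
    "Early Spring in Massachusetts (1881)", "Nuthatches-1854-02-24"
)
print(url)
print(urlsplit(url).fragment)
```

The fragment survives as part of the stored URL, so no separate Wikisource page is needed just to point at one passage.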

The Smart Set

Sorry about the mix up with the volumes. Must have been a product of tiredness. What should we do about the volumes that only exist as sim sets? Should I make a batch file to upload them individually or is there a way to combine the issues into volumes? Languageseeker (talk) 11:54, 23 December 2021 (UTC)

@Languageseeker It's OK, it happens. Do we have a complete volume of any of them? Otherwise maybe just leave them as they are (63:1, 64:1, 64:3) for now and "someone" can merge the indexes and pages when/if complete volumes turn up somewhere? Inductiveloadtalk/contribs 12:18, 23 December 2021 (UTC)
I've looked and can't find them anywhere. I'll make a batch file to upload the individual issues in the next few days. As always, thanks for your help. Languageseeker (talk) 12:58, 23 December 2021 (UTC)
Okay, sounds good. You're welcome :-) Lippincott's is coming next. I'm getting my money's worth from the ISP this month!
In general, we can upload single issues perfectly well. Just set an issue heading and I'll make it work somehow. The bulk of the grind is downloading and converting the images in the first place, and uploading the file with some kind of sane-enough metadata. Combining multiple indexes can happen "later", if and when complete volumes deign to appear.
It's not really a huge issue if we have a bit of a mishmash of volumes and issues - it's just easier not to if vaguely possible. Manually recombining issues and faking volumes up is more work than just living with the mishmash. After all, after transclusion, it doesn't even show at all! Inductiveloadtalk/contribs 13:08, 23 December 2021 (UTC)
So, it seems that the MJP pdfs are b&w, while the images are full-color. Do you think it's possible to download the images, make them into DJVU, and upload them? Languageseeker (talk) 23:35, 30 December 2021 (UTC)
I technically could, but it'll be a pretty huge amount of downloading and processing, plus I'll need to write a backend for the download script to scrape the IIIF manifest. The high-res loader works and as far as I can tell the magazine was printed in black and white anyway, so it's not really losing any detail. Images should never be cropped from the PDF (or DJVU) anyway. What's the use case here? Inductiveloadtalk/contribs 00:17, 31 December 2021 (UTC)
Also at least some are in colour (I haven't looked into it further, but this one has a colour plate at page position 12 and all other pages are in colour: https://repository.library.brown.edu/studio/item/bdr:568723/PDF/) Inductiveloadtalk/contribs 00:33, 31 December 2021 (UTC)
Alright, I don't want to pile on even more work on you. Happy New Years! May it bring you all the best! Languageseeker (talk) 02:12, 31 December 2021 (UTC)

Batch Downloading from Modern Journal Project

As I was looking for The Smart Set volumes, I stumbled across [1]. Do you think that you could batch download from that site? The format for the pdf seems to be https://repository.library.brown.edu/studio/item/id/PDF/ For images, it appears to be https://repository.library.brown.edu/iiif/image/picture id/full/,3600/0/default.jpg . I think that if you use the range bdr:471301 - bdr:563458 for the image link, it should download all the images on MJP. The challenge will then be to assemble them into individual volumes. Languageseeker (talk) 18:15, 24 December 2021 (UTC)

@Languageseeker if there's actually a URL to get a document file from, you can just set source=url and then set the id to the URL in question. Inductiveloadtalk/contribs 18:39, 24 December 2021 (UTC)
That said, I can (and will) add a "mjp" source to shortcut it. But it'll probably still just be the PDFs, or it's an order of magnitude slower and the MJP PDFs seem OK for quality and OCR.
Also, related plug: the hi-res loader already understands the MJP: e.g. Page:Camera Work No. 12 (October 1905).pdf/38. Inductiveloadtalk/contribs 19:12, 24 December 2021 (UTC)
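For reference, the two URL patterns quoted at the start of this thread can be sketched in Python as below. This is only an illustration: the function names are hypothetical, the patterns are taken as given in the thread, and nothing here checks that any particular `bdr:` id actually exists or is an image rather than some other kind of item.

```python
MJP_BASE = "https://repository.library.brown.edu"


def mjp_pdf_url(item_id: str) -> str:
    """URL of the PDF for a repository item, e.g. item_id='bdr:568723'."""
    return f"{MJP_BASE}/studio/item/{item_id}/PDF/"


def mjp_image_url(picture_id: str, height: int = 3600) -> str:
    """IIIF image URL: full region, scaled to the given height, no rotation."""
    # In the IIIF Image API, a size of ",h" scales to height h and
    # keeps the aspect ratio.
    return f"{MJP_BASE}/iiif/image/{picture_id}/full/,{height}/0/default.jpg"


def bdr_ids(first: int, last: int):
    """Yield candidate ids 'bdr:<n>' for an inclusive numeric range."""
    for n in range(first, last + 1):
        yield f"bdr:{n}"


print(mjp_pdf_url("bdr:568723"))
print(mjp_image_url("bdr:471301"))
```

Iterating `bdr_ids(471301, 563458)` would give the candidate range mentioned above, though many ids in such a range may not resolve, so a real downloader would need to handle missing items and rate limits.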

p-p-p-p-p- p-wrapping everywhere

Sigh. P-wrapping is the bane of every attempt at sanity. All-one-stanza (p)poem spanning three pages, which needs LST due to intermixed textual notes, where the middle page ended up getting wrapped in a p tag (with attendant vertical spacing) unless I manually fudged the LST end tag onto the same line as the end of the ppoem. Might be worthwhile keeping in mind if mysterious "stanza breaks" start popping up. Xover (talk) 09:27, 25 December 2021 (UTC)

So the first day of Christmas is a paragraph in a parse tree is it? I'll keep my eyes half open for bad interactions but at some point I'm just going to grumpily shift the blame onto "Mediawiki" in general and gesture vaguely at "the parser". Merry everything! Inductiveloadtalk/contribs 23:09, 26 December 2021 (UTC)
Heh heh. Merry merry to you too. And in this case it is most definitely MW's fault: there's nothing we can do on the content side to affect this, except pray someone will tackle T253072 (cf. T134469) eventually.
PS. If you have any suggestions on how to handle /164 and /165 I'm all ears. I really don't want to do them as dumb images-of-text, but I can't think of any way for us to even approximate that layout without full-on arbitrary webfont support. Xover (talk) 07:17, 28 December 2021 (UTC)
Any thoughts on this approach to /164 and /165? I'm not particularly happy with it, but it's the least worst one I was able to come up with. Xover (talk) 11:44, 30 December 2021 (UTC)
@Xover whoops sorry, I forgot to reply here. I really don't think there's much more we can do here. Even if we were to ship the modern equivalent fonts, it's still not quite the same as the original, and the exact form of the font is the content in this case. Even if you could channel the spirit of William Caslon through FontForge and generate a perfect reproduction font, we can't actually ship it. I'd say a nice clear image is as good as you can reasonably get.
BTW, if you want cold sweats about font reproduction: Portal:Typography#Specimen_catalogs. Inductiveloadtalk/contribs 12:01, 30 December 2021 (UTC)
Yeah, no, I wasn't concerned with perfect fidelity: I'm not that geeky about fonts. But all those do have rough equivalents in modern computer fonts (most if not all available in free beer-ish variants) so in a perfect world… *sigh*
But, in any case, I meant the technical approach with the overlain transparent text to give cut&paste'ers and TTS systems something sensible to work with. Do $(".wst-iot-text").css("color", "red"); in your console to see what's going on. I may move this to {{iot}} ("Image of Text") if I decide it has sufficiently general applicability to these cases. Xover (talk) 12:34, 30 December 2021 (UTC)
@Xover oh I seeee, well it looks sensible enough to me. It works on export to an extent, at least: it works in Koreader, the text is reduplicated under the image in Moon+Reader. Inductiveloadtalk/contribs 14:02, 30 December 2021 (UTC)

Batch Scraping of Sims sets

I've had my eye on All The Year Round for a while. From the experience with previous magazines, I'm not entirely sure if all the faffing about that results from trying to combine HT with IA is worth it. The IA has a complete Sims set of All the Year Round. However, it consists of 2,048 files. Major ouch. I was wondering if it would be possible to batch scrape the links. For every id, there is also entry_meta.xml where <volume></volume> maps to volume; <issue></issue> maps to issue or can be "CONTENTS" for Table of Contents; <date></date> is a bit trickier because it can be either year-month-day or text plus a year (e.g. Christmas 1859). Do you think this is possible? Languageseeker (talk) 16:32, 26 December 2021 (UTC)

@Languageseeker you mean can you scrape the contents of that IA collection and construct the information for the upload (ie the same data as presented in the spreadsheets) from that? If so, it should be possible. The real question is are the SIM sets going to give us what we want, or would we rather prefer other sources and backfill from the SIM data when needed? Inductiveloadtalk/contribs 23:04, 26 December 2021 (UTC)
Yes, in essence, it would transform the 2,048 files into one Excel file that can then be manually verified and filled out. I've thought quite a bit and we're basically dealing with SIMS vs Google scans in many cases. The Smart Set SIMS scans show that OCR produces excellent results, which is all that matters. The SIMS scans have more background noise, but the Google scans have more aggressive reduction that can obscure or remove parts of the images. The SIMS scans are also more likely to be complete. I think that in almost all cases, the SIMS set might actually produce better results with fewer scan repairs needed. Honestly, SIMS will work and save an enormous amount of labor from having to track down 2,048 issues.
BTW, it seems that The Smart Set sims batch upload failed at 80:4. Languageseeker (talk) 23:26, 26 December 2021 (UTC)
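A rough Python sketch of how the entry_meta.xml fields described in this thread could be turned into one spreadsheet row per file is given below. It is illustrative only: the root element name `<entry>` in the sample and the function names are assumptions, and the real files have not been checked. Dates that are not year-month-day are kept as plain text for manual review.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date


def parse_issue_date(text: str):
    """Return a date for 'YYYY-MM-DD' input, otherwise the text unchanged."""
    text = text.strip()
    match = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", text)
    if match:
        year, month, day = map(int, match.groups())
        return date(year, month, day)
    # e.g. "Christmas 1859": leave as text to be verified by hand.
    return text


def parse_entry_meta(xml_text: str) -> dict:
    """Extract volume, issue and date from one entry_meta.xml document."""
    root = ET.fromstring(xml_text)

    def field(tag: str) -> str:
        element = root.find(f".//{tag}")
        return element.text.strip() if element is not None and element.text else ""

    issue = field("issue")
    return {
        "volume": field("volume"),
        "issue": issue,
        # The issue field may hold "CONTENTS" for a table of contents.
        "is_contents": issue.upper() == "CONTENTS",
        "date": parse_issue_date(field("date")),
    }


# Assumed sample layout; only the three tag names come from the thread.
sample = (
    "<entry><volume>2</volume><issue>CONTENTS</issue>"
    "<date>Christmas 1859</date></entry>"
)
row = parse_entry_meta(sample)
print(row)
```

Each resulting dict could be written out with the `csv` module to produce the single file to be verified by hand.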

How we will see unregistered users

Hi!

You get this message because you are an admin on a Wikimedia wiki.

When someone edits a Wikimedia wiki without being logged in today, we show their IP address. As you may already know, we will not be able to do this in the future. This is a decision by the Wikimedia Foundation Legal department, because norms and regulations for privacy online have changed.

Instead of the IP we will show a masked identity. You as an admin will still be able to access the IP. There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right. We are also working on better tools to help.

If you have not seen it before, you can read more on Meta. If you want to make sure you don’t miss technical changes on the Wikimedia wikis, you can subscribe to the weekly technical newsletter.

We have two suggested ways this identity could work. We would appreciate your feedback on which way you think would work best for you and your wiki, now and in the future. You can let us know on the talk page. You can write in your language. The suggestions were posted in October and we will decide after 17 January.

Thank you. /Johan (WMF)

18:14, 4 January 2022 (UTC)

proofreadpagesinindex

FYI, I submitted this task T298848. Mpaa (talk) 22:54, 9 January 2022 (UTC)

@Mpaa thanks, I was literally just doing it. Patch pushed. Looks more sensible now, hopefully we can get it into this week's train. Sorry about that. Inductiveloadtalk/contribs 23:03, 9 January 2022 (UTC)

Validation needed

My list of proofread pages is growing, but not enough validation is happening:

Any help would be much appreciated. -- Valjean (talk) 15:57, 11 January 2022 (UTC)

Sorry, I'm not really good at validation, and I don't have time at present anyway. As I said on your talk page, you should not take non-immediate-validation of proofread pages to be an insult and you should not expect that anyone does it as a precondition of your continued contribution. That way lies you burning out and leaving Wikisource (this happens). It just means no-one else feels like doing it at this time. Since proofreading is substantially more popular than validation on pretty much every Wikisource, this is rather the rule than the exception, and it's entirely possible that any given proofread page will go a very long time before being validated. I suggested you can nominate to the Monthly Challenge if you'd like assistance, which may include validation. It also may not, depending on how people feel (lots of works expire from the MC without validation, and some don't even get fully proofread), but it's probably more likely.
Remember that everyone is here by choice, and it's a fundamental part of the Wikisource model that it's completely up to others if they wish to validate something that you yourself chose to proofread in the first place. There is no expectation of a quid pro quo, unless you find someone to make some kind of validation pact with. Inductiveloadtalk/contribs 16:09, 11 January 2022 (UTC)
Okay. It's good to understand how things work here. -- Valjean (talk) 16:12, 11 January 2022 (UTC)
@Valjean by any chance have you got experience of PG Distributed Proofreaders? Sometimes recent converts find the less...structured...workflow here to be disconcerting, though I personally could not imagine enjoying working in that environment and find the lack of urgency the best bit of Wikisource. Inductiveloadtalk/contribs 16:26, 11 January 2022 (UTC)
No, I don't. -- Valjean (talk) 16:29, 11 January 2022 (UTC)

API for labels?

Hi. Is there a PP API to get page labels in index? If not, would it be possible to add it to proofreadpagesinindex? Mpaa (talk) 14:17, 16 January 2022 (UTC)