Talk:An Etymological Dictionary of the Norn Language in Shetland

Style

I'm very new to Wikisource, but I have been trying to decide how best to transcribe this admittedly very complex first project. For anyone else who wants to take a stab at transcribing some pages, here is what I have found so far.

Things I have decided:

  • I would like to effectively combine both volumes into a single text that contains the full alphabet, rather than developing two separate texts, one for each volume. Considering that volume II's page numbers start where volume I's ended, I'm pretty sure this would have been the ideal way of publishing it, were it not for the cost of producing thick volumes. The display of the front matter may need to be altered slightly on the title page to take this into account.
  • Curly single and double quotation marks - please keep them curly (as they already are in the OCR). It makes italicising and emboldening text with these characters much easier to reason about. I don't think there's really any argument that including these will make someone who can't support Unicode unable to read the document, as it would be impossible to represent pretty much all other aspects of the text without Unicode.
  • The dictionary is columnised - don't preserve the columns. Edit: Preserve the columns in Pagespace, but <noinclude> them.
  • I have been preserving the line breaks at the end of each line, in order that it is easier to transcribe, and easier to proofread, as the heavy amount of formatting tends to make any large block of text without line breaks pretty incomprehensible otherwise. I'd appreciate it if other transcribers also did this, at least until the bulk of the transcription and proofreading is done.

Transcribers & proofreaders beware:

  • The PDF preview is very low res, such that it is currently insufficient for transcription of the diacritics on many of the letters - I have up until now been opening the PDF file in another window and referring directly to that. If someone knows if there is a way around this, I would appreciate having the previews made higher-res to avoid having to do this.
  • For some reason, when you set up a new page, the auto-generated OCR transcription is often messed up because the two columns have been combined into one - i.e., it starts with column 1 line 1, followed by column 2 line 1, column 1 line 2, column 2 line 2, etc. This can be easily remedied by referring to the original PDF document and copying and pasting the columns in one after the other.
  • For those not used to the phonetic symbols prevalent throughout the text, some are deceptively similar to ordinary Latin letters and can be easily missed - such as i & ı, for example. Be very careful when proofreading these.
  • Many words are highlighted by spacing them out like this. This can be difficult to spot in the original PDF - especially considering the rest of the column is fully justified - however I think it is very important to capture this in the transcription, as it highlights other entries in the dictionary. I have been using:
{{letter-spacing|0.08em|your text here}}
as a default spacing for these terms.
New template! It's now:
{{nornsp|your text here}}
  • Some terms when first specified in bold are also variously italicised depending on their origin, and some have different parts of the word italicised or not. This carries actual meaning to the reader, so this is important to watch out for.
  • Some of the symbols regularly used are difficult to find in the Special characters toolbar - e.g. these vowels - ȯ, ɔ̇, ꜵ̈ - the last of which doesn't even render properly without specialised fonts AFAIK. I have been keeping the Characters page open in another tab to be able to copy and paste the ones I need quickly.
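
The interleaved-column OCR problem described above can also be untangled mechanically. The following is only a rough sketch in Python, not a Wikisource tool: the function name `deinterleave_columns` is made up for illustration, and it assumes the OCR alternates strictly between column 1 and column 2 with no blank or missing lines, which real pages may well violate.

```python
def deinterleave_columns(lines):
    """Reorder strictly alternating column lines into column 1, then column 2.

    Assumption (not verified against the real OCR output): the input is
    col1 line1, col2 line1, col1 line2, col2 line2, ... with no gaps.
    """
    left = lines[0::2]   # every even-indexed line belongs to column 1
    right = lines[1::2]  # every odd-indexed line belongs to column 2
    return left + right


if __name__ == "__main__":
    ocr = ["c1 l1", "c2 l1", "c1 l2", "c2 l2"]
    print("\n".join(deinterleave_columns(ocr)))
```

If the two columns have different numbers of lines, the result will drift out of step, so it is still worth checking the output against the PDF.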

Undecided:

  • Em dashes seem to be used in ways that I would think they normally wouldn't be. I have thus far been spacing them out as they are in the original text, as no other punctuation seems to have extra spaces like you sometimes see in older documents - presumably this means the spacing of em dashes is by design. If a consensus objects, hopefully it shouldn't be too difficult to Ctrl-F and replace either the spaces, or the dashes.
  • The one phonetic character whose proper transcription I'm unsure about is visible in the Characters section - Jakobsen compares all these consonants without, and then with, palatal hooks. One of them is "g", which I have up until now been transcribing as "ɢ̧"; however, I'm wondering whether this is meant to be just a very fancy single-storey g, or something else entirely. If someone knows what this is, please let me know so I can replace any wrong characters I have put in thus far.
    • Edit - I think this should actually be ꬶ, didn't spot this symbol the first time I looked. I have since replaced all instances of the old transcription with this symbol.
  • I'm not sure whether setting up a separate page for each individual dictionary entry would be beneficial. I don't know of any other fully-fledged examples to work from, so for now I was thinking of just setting up a page for each letter of the alphabet and seeing how it goes.
  • Ideally it would be best to support cross-reference links between terms. Someone might be able to advise me on the best way to set this up with anchors to avoid any headaches down the line. I don't know whether that would depend on how the previous point is resolved.
  • Due to the age of the publication, AFAIK the phonetic descriptions for each term use an alternative phonetic system prior to the formalised IPA, however, I presume seeing as it is phonetic (i.e. 1 character represents 1 sound), that some sort of template could in theory be set up with this old-style phonetics as its input, and could generate modern IPA that could be displayed when hovered over? A feature like this would vastly increase the usefulness of this text as a resource. In principle, would this be possible?
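
On the last point, the conversion itself looks feasible in principle as a table lookup. Below is a minimal sketch in Python of how such a lookup could work offline; it is not a wiki template, and a real implementation on the wiki would presumably need the same table in a template switch or a Lua module. The mapping values are deliberate placeholders (`[O1]` and so on), not real IPA equivalents, since the actual correspondences would have to come from Jakobsen's own description of his symbols. The name `build_transliterator` is hypothetical.

```python
import unicodedata


def build_transliterator(mapping):
    """Return a function that rewrites text symbol by symbol using `mapping`.

    Keys and input are normalised to NFD so that a precomposed letter and
    the same letter written with a combining mark compare equal. Longer
    keys are tried first so that a base letter plus diacritic wins over
    the bare base letter.
    """
    norm = {unicodedata.normalize("NFD", k): v for k, v in mapping.items() if k}
    keys = sorted(norm, key=len, reverse=True)

    def transliterate(text):
        text = unicodedata.normalize("NFD", text)
        out = []
        i = 0
        while i < len(text):
            for key in keys:
                if text.startswith(key, i):
                    out.append(norm[key])
                    i += len(key)
                    break
            else:
                # No table entry: pass the character through unchanged.
                out.append(text[i])
                i += 1
        return unicodedata.normalize("NFC", "".join(out))

    return transliterate


# PLACEHOLDER table for illustration only; the values are NOT real IPA.
PLACEHOLDER_TABLE = {
    "\u022f": "[O1]",        # o with dot above
    "\u0254\u0307": "[O2]",  # open o + combining dot above
    "\u0254": "[O3]",        # bare open o
}

if __name__ == "__main__":
    to_ipa = build_transliterator(PLACEHOLDER_TABLE)
    print(to_ipa("\u0254\u0307\u0254\u022fx"))
```

The one-symbol-to-one-sound assumption is what makes a plain lookup sufficient; any symbol whose value depends on its neighbours would need extra rules.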

I apologise if any of this is blindingly obvious to any more experienced editors here - I am new :) Anyway, that's all I can think of for the moment. Feel free to chime in on anything you think is important.— 🐗 Griceylipper (✉️) 04:09, 27 May 2020 (UTC)

Abbreviations

As with most dictionaries, a large proportion of the description for each entry is made up of a soup of abbreviations, which, to the casual reader, can be very difficult to parse without having memorised the contents of the abbreviations page at the beginning of the text (or continually referring back to it). However, there is of course Template:Abbr, which can create tooltips to help the reader work out what's being said.

Presumably, if you wanted, you could use this template (or a custom one set up specifically for this text to simplify things a bit) to create tooltips for every single abbreviation in the text. While I, as a reader, would appreciate this, I understand that some readers may be put off by a sea of dotted lines underneath all the abbreviations. Likewise, a reader may equally be put off by a sea of abbreviations that make no sense to someone who isn't familiar with them. Looking to Wiktionary as an example, they seem to spell out every term in full by default (not being limited by the space requirements of printing the text in books). Obviously this is the ideal way to do it if you have the room - which this transcription does.

As an example, I have set up page 1 of the dictionary with (as far as I can see) every possible abbreviation set up with a tooltip. I'm using the normal Template:Abbr for the moment, however I think I could make the source a bit more terse (and a bit less error-prone) with a custom template.

My question is, assuming I want to make the transcription as useful as possible (and assuming I have unlimited time to spend additionally wrapping every abbreviation), are there any abbreviations I should not set up with tooltips? Would I just stick to the first instance of each in a particular page / chapter? Would I do some of the less obvious abbreviations, but leave out more common ones?

My initial thinking is, if I surrounded all potential abbreviations with a custom template specifically for this text, it would be reasonably simple to tune the number of tooltips displayed by editing the template, rather than having to trawl through the text every time a change was desired.
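
To make the idea concrete, here is a rough sketch of the lookup-and-tune logic in Python. It is not the wikitext of any actual template; the function name `render_abbr`, the `plain` parameter, and the sample expansions are all assumptions made up for illustration, and the real list of expansions would come from the dictionary's own abbreviations page.

```python
import html

# Illustrative entries only; not taken from the dictionary's abbreviation list.
EXPANSIONS = {
    "adj.": "adjective",
    "sb.": "substantive",
    "vb.": "verb",
}


def render_abbr(abbr, expansions, plain=frozenset()):
    """Return HTML for one abbreviation.

    Unknown abbreviations raise an error so that typos (for example a full
    stop left outside the wrapper) are caught early. Abbreviations listed
    in `plain` are shown without a tooltip, which is the single place to
    tune how many tooltips appear.
    """
    if abbr not in expansions:
        raise KeyError(f"unknown abbreviation: {abbr!r}")
    if abbr in plain:
        return html.escape(abbr)
    title = html.escape(expansions[abbr], quote=True)
    return f'<abbr title="{title}">{html.escape(abbr)}</abbr>'


if __name__ == "__main__":
    print(render_abbr("adj.", EXPANSIONS))
    print(render_abbr("adj.", EXPANSIONS, plain={"adj."}))
```

Because every abbreviation in the text goes through the same wrapper, changing `plain` (or the equivalent setting in a template) changes the whole transcription at once, without touching the pages themselves.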

If anyone has any thoughts, let me know.— 🐗 Griceylipper (✉️) 05:55, 28 May 2020 (UTC)

Edit: I have gone ahead and implemented it. Go ahead and use Template:Nornabr anywhere you see an abbreviation, as long as it's not in reference to an author or work. Any other abbreviations (general dictionary terminology, placenames - even those including superscripts) are fair game. If it throws an error, check your usage of full stops in / out of the template. If what you've input matches the source and it still creates an error, feel free to add more items to the list at Template:Nornabr/switch.

If the resulting text is deemed too heavy on tooltips, it can be tuned on the template end.— 🐗 Griceylipper (✉️) 04:50, 30 May 2020 (UTC)