Please do not post any new comments on this page.
This is a discussion archive first created in October 2009, although the comments contained were likely posted before and after this date.
See current discussion or the archives index.

Announcements

Links to pages from index pages

I have removed the technical part from index talk pages and rc links from index pages. The reason is that you can find these links now directly in the "Related changes" link of the "Toolbox" in the left of the screen, so these technical parts aren't needed any more. --Zyephyrus (talk) 13:04, 24 September 2009 (UTC)

Nomination for CheckUser

Recently I nominated to be a CheckUser for English WikiSource (enWS). Editors are invited to express their opinion (be it negative or positive) at Wikisource:Administrators#Billinghurst. -- billinghurst (talk) 05:12, 9 September 2009 (UTC)

Wiktionary Hover gadget in beta

Birgitte identified a tool in use at WikiNews that she believed useful for WS, so I have installed it into a new Development section in Gadgets, and it directly calls the development tool at News. Feedback is being collected at MediaWiki talk:Gadget-dictionaryLookupHover.js and developer is User:Bawolff, ably hassled by User:Amgine. A consequence is that the [EDITING] Edit pages on double click (requires JavaScript) now requires a shift double click to function. -- billinghurst (talk) 04:10, 27 September 2009 (UTC)

Proposals

Restructure of Wikisource:Featured texts

Current condition

Currently Wikisource:Featured texts features only one Text per month and displays that the whole month on the Main page. This is something more like a Wikisource:Featured text of the Month and not just Featured text. A Featured text is expected to represent the highest quality and one of the best texts on Wikisource and there is more than one text that is so each month. Wikisource:Featured text candidates has currently many texts that are of high quality and have the support of many users to receive such a certification, but only one of them will.

Suggestion

I therefore suggest the distinction between two kinds of featuring texts like on Wikipedia. One for featuring the text on the main page and another for certifying that a text is of high quality and value. Multiple texts can become featured in the same month and obviously one of the featured texts can become a featured text of the month. A nominator doesn't have to wait a month for his text to be featured but maybe only 14 days with a 2/3 support vote.--Diaa abdelmoneim (talk) 15:41, 29 September 2009 (UTC)

Comments

Removal of alternative language list from article PDFs

Current condition

The article PDF generator currently excludes (through the use of either a category or a subtemplate, I can't remember which) navigation templates from PDFs generated using the sidebar option. However what in the article is a list of links to alternate-language versions of the document (see for example Charter of the United Nations) becomes in the PDF version an unsightly plaintext (although hyperlinked) listing of article names within a single paragraph.

Suggestion

I propose that through whatever method is appropriate the list of alternate languages be excluded from generated PDFs. In its current form it serves only to detract from the appearance of the PDFs: it should either be removed entirely (the better option in my view), or be restructured into a properly-formatted listing rather than its current stream-of-consciousness paragraph state.

Comments

Other discussions

Index popularity

I was thinking it would be great to know which of our indexes are most commonly visited by newcomers to the site, what is "drawing people in", so to speak. Then, we know what areas are not necessarily worth the work to improve, and which are. (And "improve" does not just mean "add more texts to Wikisource:Erotica", it means formatting, templates, images, cleaning up the area to look presentable).

So without further ado, the number of page views in July, for some random indexes:

Wikisource:Works - 3,829 views
Wikisource:Erotica - 586 views
Wikisource:Christianity - 506 views (Wikisource:Islam is comparable)
Wikisource:Song lyrics - 200 views
Wikisource:Sexuality - 164 views
Wikisource:Guantanamo - 153 views (almost all our texts there are contributed by a single editor)
Wikisource:Wicca - 109 views (Buddhism is comparable)
Wikisource:United Nations - 96 views
Wikisource:Mermaids - 96 views (Wikisource:Pirates is comparable)
Wikisource:Pulitzers - 96 views
Wikisource:World War II - 86 views
Wikisource:Law - 66 views
Wikisource:Chess - 47 views
Wikisource:Canada - 39 views
Wikisource:Pennsylvania - 4 views (Other states seem to be comparable)

What I infer from some of these numbers is that things like WW2 and the UN actually draw comparatively few viewers, compared to things like collections of poems and stories about mermaids. In addition, erotica and sexuality are both highly-sought categories of works; people come here looking for Public Domain texts that are titillating (and probably cost money to download elsewhere). Our collection of legal cases seems to be woefully uninteresting to the general public (which surprises me, actually). Finally, I notice that while the success of Islam and Christianity isn't surprising, people seem to view even Wiccan and Buddhist texts surprisingly often.

This kind of unofficial study can help direct us on things like Collaborations (no need to collaborate on New York or Sierra Leone, instead focus on improving our collection of Buddhist texts, etc). Thoughts? Comments? Concerns? Sherurcij ^{Collaboration of the Week: Author:Galileo Galilei.} 05:45, 27 August 2009 (UTC)

I'm pretty sure those 586 views were all me.

But seriously, below a certain threshold, these view counts will be affected more by editors than by readers.

Hesperian 06:06, 27 August 2009 (UTC)

Do the indexes matter? I've rarely gone to them; if everyone links in from Google or Wikipedia, or uses the Search box, the index numbers may have little to do with any real interests of our audience.--Prosfilaes (talk) 10:42, 27 August 2009 (UTC)

Judging from the disparity in numbers, they do seem to matter; 500 people a month click the link to Wikisource:Erotica that appears in the header of our erotic works, while almost nobody finds themselves on the National pages. Even if they are linked in from Google or a Search Box, it is evidence of what people are searching for...erotic and religious texts, more so than legal cases. Sherurcij ^{Collaboration of the Week: Author:Galileo Galilei.} 17:21, 27 August 2009 (UTC)

But the indexes probably aren't linked in from Google or a search box. If you're looking for Roe v. Wade, you're more likely to find this page by hitting w:Roe v. Wade and then clicking on the Wikisource link. So court cases are likely to be underrepresented in your search. On the other hand, erotica is likely to be over-represented, as most of it doesn't have Wikipedia articles and people are less likely to be looking for a particular piece by name. In any case, 500 people a month don't click the link; you said 500 views, and one person on a serious dig through Wikisource might go back to that page several times in one day, or continue on over several days; or as per Hesperian, one editor editing a page could easily generate 100 views in a month. I think we need numbers besides indexes to gauge whether they really matter to our users at all.--Prosfilaes (talk) 22:51, 27 August 2009 (UTC)

Some internet searching led me to THEwikiStics, which has most-viewed page data for the last year, month and day. It gives relative weight to legal/political, religious and literary content, and generally restores my faith in mankind. --Eliyak T·C 03:31, 28 August 2009 (UTC)

Works Popularity

Like Sherurcij, I was wondering what our most popular attractions to Wikisource were through its most popular works, to see what was drawing people in. Then we could see which areas are worth our while to emulate for the sake of popularity, if possible, and which are absent from that class. So I, too, have compiled an incomplete list of works with the most page-hits (more than 340,000 as of May 1, 2009) on Wikisource:

Industrial Society and Its Future by Ted Kaczynski: 1,420,000 ~~hits~~
(English Wikisource main page: 1,256,000 ~~hits~~)
Zodiac Killer letters: 572,000 ~~hits~~
Beowulf/Glossary: 563,000 ~~hits~~ by James A. Harrison and Robert Sharp
The Confessions of William-Henry Ireland: 518,000 ~~hits~~
Troublous Times in Canada: A History of the Fenian Raids of 1866 and 1870 by John A. Macdonald: 513,000 hits
(Wikisource:Scriptorium: 495,000 ~~hits~~)
Das_Kapital/Chapter 15 by Karl Marx: 468,000 ~~hits~~
Bible (World English)/Luke: 466,000 ~~hits~~
Bible (World English)/Matthew: 445,000 ~~hits~~
The True Story of the Vatican Council by Henry Edward: 441,000 ~~hits~~
Rights of Man by Thomas Paine: 425,000 ~~hits~~
The Tragedy of Titus Andronicus by William Shakespeare: 409,000 ~~hits~~
The Book of the Thousand Nights and a Night/Volume 13 translated by Richard Francis Burton: 379,000 ~~hits~~
Papyrus of Ani translated by E. A. Wallis Budge: 349,000 ~~hits~~
Bible (American Standard)/Genesis: 341,000 ~~hits~~

Well it looks like true crime stuff like you see on A & E, plus things religious and public high school and undergraduate college teachers would assign as reading material and what their students would choose as sources for research papers. ResScholar (talk) 11:07, 28 August 2009 (UTC)

Woohoo, two of the top five are my own! (Does anybody else find it ironic that people read an anti-technology manifesto...online?) Sherurcij ^{Collaboration of the Week: Author:Galileo Galilei.} 03:57, 29 August 2009 (UTC)

Sad discovery

It turns out Sherurcij is not the winner after all. I apologize for the confusion. Although the Zodiac letters are still near the top, what those numbers mean is not the number of web-page hits, but the number of bytes downloaded from the web-page in the course of the hits that that page received between 1 and 2 a.m. UTC on May 1. The number of bytes downloaded may be larger than the web page size due to revision data that gets downloaded as well, or it may be smaller because you didn't want to wait to download the whole page and clicked through to something else. The rankings that are displayed on Eliyak's yearly Wikistics link above are probably your best bet for a top 15 or greater works popularity list, even though it weeds out popularity spikes [of works that get less than three hits a day, but retains those others that spike for whatever reason]. Again, sorry for the mistake. ResScholar (talk) 05:33, 30 August 2009 (UTC) (bracketed material added ResScholar (talk) 04:44, 1 September 2009 (UTC))

Works Popularity redux

Here is a real list of the works pages with the most hits over the period July 2008 to June 2009. I thought it would be interesting to look up the user who originated the page on Wikisource as well.

Work, hits over span of one year		User contributor, date of last edit
1. Industrial Society and Its Future, 143,000 hits		1. User:24.12.50.240, 18 July 2004
2. Additional amendments to the United States Constitution, 135,000 hits		2. User:24.87.43.26, 30 May 2006
3. Constitution of the United States of America, 133,000 hits		3. User:Angela (also gives prior heritage on PS), 22 Jan 2008
4. United States Bill of Rights, 109,000 hits		4. User:24.87.43.26, 30 May 2006
5. Barack Obama's Iraq Speech, 67,500 hits		5. User:Cneubauer, 30 May 2008
6. Washington's Farewell Address, 64,500 hits		6. User:Mulad, 14 June 2005
7. The Science of Getting Rich, 61,000 hits		7. User:Guanaco, 18 September 2004
7. Zodiac Killer letters, 61,000 hits		7. User:Sherurcij, 25 September 2009
9. United States Declaration of Independence, 58,000 hits		9. User:Kalki, 1 April 2009
10. Because I could not stop for Death —, 54,000 hits		10. User:DonQuixote, 20 January 2008
11. The Raven (Poe), 50,500 hits		11. User:JeLuF, 19 September 2008
12. I have just been shot, 49,500 hits		12. User:AllanHainey, 27 January 2009
13. Gonzales' Resignation Letter, 45,500 hits		13. User:John Vandenberg, 25 September 2009
14. Barack Obama's Inaugural Address, 43,000 hits		14. User:Mickster810, 20 January 2009
15. Gettysburg Address, 42,000 hits		15. User:81.187.43.178, 24 November 2003
16. Kama Sutra, 39,000 hits		16. User:Yann, 2 October 2009
17. Constitution of the Philippines (1987), 31,500 hits		17. User:203.87.234.114, 25 March 2006
18. The Call of Cthulhu, 28,500 hits		18. User:Eequor, 28 September 2004
19. AK-47 Operator's Manual, 28,000 hits		19. User:RavenStorm, 17 July 2007
20. Bible (King James)/Matthew, 27,000 hits		20. User:Pwd, 26 May 2004
20. Bible (King James)/Genesis, 27,000 hits		20. User:Ashley Y, 18 September 2004
20. Bible (King James)/Isaiah, 27,000 hits		20. User:Pwd, 26 May 2004
23. An Autobiography or The Story of my Experiments with Truth, 27,000 hits		23. User:Yann, 2 October 2009
24. Remarks of Senator Barack Obama on New Hampshire Primary Night, 26,500 hits		24. User:BD2412, 28 September 2009
25. The Curious Case of Benjamin Button, 26,000 hits		25. User:Netoholic, 20 October 2006
26. If—, 25,500 hits		26. User:Morwen (also gives prior heritage), 31 May 2006
27. Modern Money Mechanics, 25,000 hits		27. User:Diego pmc, 4 April 2009
28. Bible (King James), 24,500 hits		28. User:Ashley Y, 18 September 2004
29. Pyramus and Thisbe, 24,500 hits		29. User:Gatewaycat, 3 May 2007
30. The Charge of the Light Brigade, 24,000 hits		30. User:24.217.210.24, 9 June 2004
31. Open Letter To Tarja Turunen, 23,500 hits		31. User:81.207.18.101, 5 December 2005
32. Catullus 16, 23,000 hits		32. User:Sophysduckling, 9 July 2007
32. Democracy and Education, 23,000 hits		32. User:AaronSw, 8 January 2009
34. Elegy Written in a Country Churchyard, 22,500 hits		34. User:TheoClarke, 18 September 2009
35. Bible (King James)/Revelation, 22,000 hits		35. User:Pwd, 26 May 2004
36. Daphne and Apollo, 21,500 hits		36. User:Gatewaycat, 3 May 2007
36. Speech - Global Warming: a Time to Act, 21,500 hits		36. User:Wabbit98, 4 November 2007
36. Constitution of Malaysia, 21,500 hits		36. User:Tansm, 5 March 2009
39. Meditation XVII, 21,000 hits		39. User:MeltBanana, 24 November 2005
40. The Star-Spangled Banner, 20,500 hits		40. User:Mulad, 14 June 2005
41. The Tale of Peter Rabbit, 20,500 hits		41. User:Mattderojas, 3 September 2009
41. Catholic Encyclopedia (1913), 20,500 hits		41. User:Illy, 29 July 2006
43. Annabel Lee, 20,000 hits		43. User:PaulinSaudi, 28 August 2005
44. Ron Paul's Iraq Speech, 19,000 hits		44. User:FlatterWorld, 1 April 2008
45. The Complete Works of Swami Vivekananda, 18,500 hits		45. User:Nvineeth, 3 October 2009
45. Toleration of the corset, 18,500 hits		45. User:Haabet, 6 August 2009
45. Bible, King James, Psalms, 18,500 hits		45. User:Pwd, 26 May 2004
48. Dulce et Decorum Est, 18,000 hits		48. User:Khaldei, 4 April 2008
48. You Are Old, Father William, 18,000 hits		48. User:202.63.174.250, 31 July 2006
50. The Science of Getting Rich/Chapter 1, 17,500 hits		50. User:Guanaco, 18 September 2004
51. Body Ritual among the Nacirema, 16,500 hits		51. User:209.188.70.230, 15 October 2004
51. 95 Theses, 16,500 hits		51. User:Silvermane, 5 August 2004
51. Strange Meeting, 16,500 hits		51. User:Ardonik, 22 September 2004

According to Wikistics, English Wikisource got 90,400,000 hits over the course of the 363 days examined.

ResScholar (talk) 08:00, 25 September 2009 (UTC); 05:09, 26 September 2009 (UTC); 04:54, 3 October 2009 (UTC); 07:52, 3 October 2009 (UTC); 02:11, 4 October 2009 (UTC)
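
To put the yearly total in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only figures quoted above (90,400,000 hits over 363 days, and 143,000 hits for the top-ranked work); the variable names are purely illustrative.

```python
# Figures quoted in the list above (Wikistics, July 2008 to June 2009).
total_hits = 90_400_000   # all English Wikisource hits in the period
days_examined = 363       # number of days covered by the data
top_work_hits = 143_000   # Industrial Society and Its Future

# Average site-wide hits per day.
hits_per_day = total_hits / days_examined

# Share of all hits that went to the single most-viewed work.
top_work_share = top_work_hits / total_hits

print(f"Average hits per day: {hits_per_day:,.0f}")
print(f"Top work's share of all hits: {top_work_share:.2%}")
```

Even the most-viewed work accounts for well under one percent of all hits, which suggests that traffic is spread thinly over a very large number of pages.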

Questions

Transclusion of footnotes over different pages

Is there a method for transclusion of footnotes over multiple pages? // Wellparp (talk) 19:42, 5 July 2009 (UTC)

The usual method is to put the whole of the continued footnote on the first (originating) page of the footnote, rather than leaving each part in situ on its own page. Each time I do that, I include a note in the footer of each page: on the originating page, to say that the footnote continues on succeeding pages and is concatenated here; on succeeding pages, to say that the footnote is concatenated on the previous page where it originated. Example:

Page:The History of the Church & Manor of Wigan part 1.djvu/153 and in use The History of the Church and Manor of Wigan/Edward Fleetwood

While it differs in the Page: namespace, when we transclude to main namespace all works out well. -- billinghurst (talk) 23:04, 5 July 2009 (UTC)

I like this method, and think it should become a standard or recommended policy when dealing with scanned works. Should we record it somewhere other than the Scriptorium? I’m just thinking about what should happen when this heading ages its way into the Archives. Tarmstro99 (talk) 16:16, 27 August 2009 (UTC)

There is another method; it displays the bundled footnote at the first page, using sections, but keeps the overflowing text with the subsequent page scans. Cygnis insignis (talk) 18:13, 27 August 2009 (UTC)

I prefer the method Cygnis is referring to; it's in place on much of the United States Statutes, such as An Act Concerning Aliens, pages 570 and 571. It's a bit harder to code, but the text appears on the correct page. --Spangineer^wp (háblame) 22:20, 27 August 2009 (UTC)

I hope Cygnis doesn't mind, but I altered his example so that the overflow does not display in the original page text, but does when transcluded. This seems an ideal solution. I bet it could be consolidated into a template. --Eliyak T·C 19:32, 2 September 2009 (UTC)

I have managed to convert part of the code to a template, but the part which belongs on the overflow pages seems to be stubborn and not want to work when templatized. --Eliyak T·C 21:02, 2 September 2009 (UTC)

Maybe it could be useful to see Cygnis's method roughly applied on it.source:

Template:Pt performs this magic: first parameter=what you want displayed on namespace Page; second parameter=what you want displayed on ns0 when the page is transcluded
When a footnote is divided between two pages:
- in the first you write <ref>Jibber jabber jibber (=beginning of the note) {{pt||{{#ls:NameOfTheFile.djvu/NumberOfTheNextPage|section=overflow}} }}</ref> then
- in the second, after the text, you write a <section begin=overflow />jibber jabber (=end of the note).<section end=overflow />
When pages are transcluded the two halves shall be joined.

This method is clumsy but can also be used when notes run along more than two pages (you can see an impressive example here, transcluding a chapter where huge footnotes span three or four pages).

Hoping to be useful let me know if you liked this approach and if it can be made simpler. - εΔ ω 15:17, 5 September 2009 (UTC)

I finally figured out an elegant solution to this—more elegant than the above I think.

As with the above, use <section> tags to separate actual page text from footnotes that have overflowed from previous pages. On the page that contains the overflowing footnote, use {{#section:}} to include the overflowed text. Wrap it in <includeonly> so that the overflow appears when the page is transcluded but not when the page itself is viewed.

I and several others have got to this point in the past and been stymied by the fact that <noinclude> tags don't work inside <ref> tags. I discovered the solution when answering the #Reference inside reference question below: the outermost ref tag must be replaced by a call to {{#tag:ref}}.

Here's some cheat code:

For the first page:

{{#tag:ref|Footnote text before overflow.<includeonly>{{#section:Page:[Second page]|overflow}}</includeonly>}}

On the second page:

<section begin=text/>Page text<section end=text/><section begin=overflow/>Footnote overflow.<section end=overflow/>

Transclusion:

{{Page:[First page]}}
{{Page:[Second page]|section=text}}

Hesperian 02:54, 24 September 2009 (UTC)
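
As a rough illustration of what the section-based method above achieves at transclusion time, here is a toy model in Python. It is only a sketch and not how MediaWiki actually implements section transclusion: the function names `get_section` and `join_footnote` and the page strings are hypothetical, and the regular expression handles only the simple `<section begin=NAME/>` form used in the example above.

```python
import re


def get_section(page_text: str, name: str) -> str:
    """Return the text between <section begin=name/> and <section end=name/>."""
    pattern = re.compile(
        r"<section begin=" + re.escape(name) + r"\s*/>"
        r"(.*?)"
        r"<section end=" + re.escape(name) + r"\s*/>",
        re.DOTALL,
    )
    # Join every labelled span, in case the label is used more than once.
    return "".join(pattern.findall(page_text))


def join_footnote(note_start: str, next_page_text: str) -> str:
    """Append the 'overflow' section of the next page to the footnote start."""
    return note_start + get_section(next_page_text, "overflow")


# Hypothetical contents of the second page, marked up as described above.
second_page = (
    "<section begin=text/>Page text<section end=text/>"
    "<section begin=overflow/>Footnote overflow.<section end=overflow/>"
)

# The footnote as it would read once both pages are transcluded.
full_note = join_footnote("Footnote text before overflow. ", second_page)

# The body of the second page, without the overflowed footnote text.
second_page_body = get_section(second_page, "text")

print(full_note)
print(second_page_body)
```

The point of the model is that the overflow text lives only on the second page, yet ends up inside the footnote that starts on the first page, while the second page contributes only its "text" section to the body.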

New to DJVU

I'm trying to get a copy of The Jewish Manual (cookbook by Judith Montefiore) into a side-by-side setup. I've uploaded The Jewish Manual.djvu to commons, but when I try to create the index at Index:The_Jewish_Manual.djvu, I get an error. Any ideas? --Eliyak T·C 05:27, 2 August 2009 (UTC)

DjVu file is broken. ... The_Jewish_Manual.djvu‎ ( × pixels, file size: 1.64 MB, MIME type: image/vnd.djvu) Which site, or what software did you use to convert the file? -- billinghurst (talk) 08:57, 2 August 2009 (UTC)

This doesn't necessarily mean there is a problem with the file. I've seen so many of these lately I reopened a bug on it.[1] Hesperian 11:41, 2 August 2009 (UTC)

Hesp, I am wondering whether there is an issue at djvuzone.org as I have had a few files have that problem, whereas they are fine when done using new version of PDFtoDjVu billinghurst (talk)

After converting the file, I guess that it was converted at Any2DjVu. I have found a ready copy at archive.org, and am uploading it again.-- billinghurst (talk) 11:46, 2 August 2009 (UTC)

I used Any2Djvu to convert, then removed Google's first page using djvm. The file was readable on my computer before I uploaded. Perhaps the problem had to do with the embedded OCR option?

Thanks for your help. --Eliyak T·C 16:52, 2 August 2009 (UTC)

I've also had problems with Any2DjVu. Maybe we need to file a bug with mediawiki. John Vandenberg ^(chat) 00:27, 10 August 2009 (UTC)

Same issue here. DroEsperanto (talk) 08:24, 9 September 2009 (UTC)

this bug is mine. it comes from invalid utf8 in the text layer. I fixed it, but the fix is not yet active. files from archive.org do not have this problem. ThomasV (talk) 09:36, 9 September 2009 (UTC)

So I can just download that one and replace the file on Commons? Currently I'm getting the same x pixels thing and when I try to make an index page it says "Error: no such file" for the "pages" preview; is that all part of that error, or is that something different? DroEsperanto (talk) 10:46, 9 September 2009 (UTC)

Never mind, the archive.org version seems to be working. Thanks! DroEsperanto (talk) 10:55, 9 September 2009 (UTC)

it should work now ThomasV (talk) 09:17, 17 September 2009 (UTC)
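
For anyone wanting to check a file before uploading, the "invalid UTF-8 in the text layer" condition mentioned above can be tested with a few lines of Python. This is a minimal sketch, not the fix that was applied on the server: it assumes the text layer has already been extracted to raw bytes by some other tool, and the function name `first_invalid_utf8_offset` is made up for illustration.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None


# Hypothetical samples: the first is valid UTF-8, the second contains a
# Latin-1 encoded "e acute" (0xE9), which is not valid UTF-8 on its own.
good_layer = "café au lait".encode("utf-8")
bad_layer = b"caf\xe9 au lait"

print(first_invalid_utf8_offset(good_layer))
print(first_invalid_utf8_offset(bad_layer))
```

A result of None means the bytes decode cleanly; any integer points at where the text layer goes wrong.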

Header-template

Do you have any template for this kind of header? -- Lavallen (talk) 06:46, 25 August 2009 (UTC)

No. We have Template:Running header, but that only handles left, centre and right. For this I think you would need a table:

{|style="width:100%"
|-
| width="5%" | 30
| width="40%" align="center" | Roahs eftertommande
| width="10%" align="center" | Genesis.
| width="40%" align="center" | Cap 10.
| width="5%" align="right" | 
|}

30     Roahs eftertommande     Genesis.     Cap 10.

It would be easy to make it into a template if there was demand for it. Hesperian 07:00, 25 August 2009 (UTC)
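
Until such a template exists, the table above can be generated rather than typed by hand. The following Python sketch builds the same wikitext from the cell contents; the function name `four_part_header` and its parameter names are hypothetical, and the column widths are simply copied from the table above.

```python
def four_part_header(page: str, left: str, middle: str, right: str,
                     far_right: str = "") -> str:
    """Build the wikitext for a five-cell, full-width page header table."""
    # (cell attributes, cell text) pairs, using the widths from the example.
    cells = [
        ('width="5%"', page),
        ('width="40%" align="center"', left),
        ('width="10%" align="center"', middle),
        ('width="40%" align="center"', right),
        ('width="5%" align="right"', far_right),
    ]
    lines = ['{|style="width:100%"', "|-"]
    for attrs, text in cells:
        lines.append(f"| {attrs} | {text}")
    lines.append("|}")
    return "\n".join(lines)


header = four_part_header("30", "Roahs eftertommande", "Genesis.", "Cap 10.")
print(header)
```

The output can be pasted into the header field of a page; a real template would do the same job with numbered parameters in place of the function arguments.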

Thank You! I think I can make a template according to this... -- Lavallen (talk) 07:11, 25 August 2009 (UTC)

Can I suggest that we look to convert this into an additional running header for WS too. So far we haven't had a four-part header, though surely it is a matter of time. I would prefer to see RunningHeader have both parameters and the ability to choose which, though if that is too hard, then something akin to it and suitably linked would be a good second best. billinghurst (talk) 04:45, 30 August 2009 (UTC)

.pdf Hosting

Tom W. Sulcer, an author, wants to host his published book Common Sense II here at Wikisource as a .pdf file. Is this permitted here? ResScholar (talk) 04:44, 27 August 2009 (UTC)

I think that PDF files should be hosted on Commons. Yann (talk) 11:02, 27 August 2009 (UTC)

If the work is under a free license, it can probably be added as a regular article. PDFs can be automatically created from any article. If it is a long work that would be split into multiple pages or has special formatting, uploading to Commons would probably be best. --Pmsyyz (talk) 23:57, 27 August 2009 (UTC)

I'm with Yann here. PDFs aren't nice formats for wikis, as they're not easily editable or reusable. I assume this is the better type, that's not just scanned pages or text stored on the page with no respect to the logical structure of the document, but still. And they're pains for pretty much any reader who doesn't print them out and nigh-impossible on small readers like even the fanciest cell phone.--Prosfilaes (talk) 02:21, 28 August 2009 (UTC)

My first preference is for DjVu over PDF, and would prefer that format considered primarily. With regard to where it is hosted, I prefer to see it at Commons, though understand that there are times when files cannot be hosted there, so then use WS as the fallback place for an upload. billinghurst (talk) 04:42, 30 August 2009 (UTC)

unifying wikipedia and wikisource legal citation templates (usc and etc)

Dear Wikisource

I needed Template:UnitedStatesCodeSub for something im working on (it's under User:Decora at the moment). So.. i copied the template over from wikipedia.

I happened to notice that Template:UnitedStatesCode on wikisource actually differs from w:Template:UnitedStatesCode on wikipedia... (wikisource's "UnitedStatesCode3" seems to be the same thing as wikipedia's UnitedStatesCode... wikipedia having somehow combined the two while maintaining backwards compatibility..) and that on top of that... there are several legal citation templates from wikipedia copied to wikisource... but not all of them.

It seems like, in making 'annotated' documents (particularly ones that refer to law), it would be quite helpful to have some good citation templates (and it seems that wikisource already has several).

But it also seems like it would be nice to have them in 'sync' with those on wikipedia, which are well developed and well tested already. Also it seems like it would be a nightmare to have templates named the same thing on wikipedia and wikisource, but to have them behave differently. Over time, it would become 'stuck', because wikisource will have many articles depending on it working one way, while wikipedia will have articles depending on it working the other way.

Does anyone know the plan here? Thank you. Decora (talk) 09:52, 29 August 2009 (UTC)

I'd be interested in resolving the disparity between the two sites since I'm likely to call upon this "family" of templates as well. George Orwell III (talk) 20:21, 29 August 2009 (UTC)

I know of no plan, though do feel that while this is a lovely idea, it is also one in which we need to be mindful. I recently imported a template, and it had other base templates that we used and they were overwritten[2], and have caused a few hiccoughs. I don't know of a good solution, though I would like to see one in place! -- billinghurst (talk) 04:38, 30 August 2009 (UTC)

FWIW... I took a stab at trying to fix the listed-but-broken ones as best as I could. Overall, they seem to be better off than before I started, and I even managed to add some bits and pieces in the process too -- but that would mean I know what I'm doing which, believe me, is NOT the case. George Orwell III (talk) 22:48, 2 September 2009 (UTC)

Popular Science Monthly Project

Hey Everyone,

I just finished uploading the remainder of the text for the first 92 volumes of Popular Science Monthly, with the exception of volume 75. I posted a note on the Internet Archive forum hoping that they might be able to upload this text. I'll let you know what I find. Anyways, there is a lot of proofreading to be done. There is a lot of good material there. I hope everyone enjoys. Feel free to join the project. :)

--Mattwj2002 (talk) 21:32, 29 August 2009 (UTC)

Guidance for naming of poetry?

What are our naming conventions for poetry that is unnamed? A user is naming them after the first line of the poem, and putting the names inside double quotations. I am poetry-inept so seeking guidance from someone who knows better on a convention. At Help:Editing poetry we do not cover it, and it would be good if there is guidance that we add it to that page. Thanks. billinghurst (talk) 09:47, 31 August 2009 (UTC)

That's generally the standard practice, though I might drop the quote marks in the page title.--Prosfilaes (talk) 16:03, 31 August 2009 (UTC)

Dropping the quotations was the basis of my discussion with the editor, though didn't have a firm basis to go and make the argument. As it is more of a policy matter, I thought that bringing the matter here would be useful for POVs. -- billinghurst (talk) 23:41, 31 August 2009 (UTC)

I would drop the quotation marks too. A person looking for the poem is unlikely to know when he must begin his query with quotation marks. Eclecticology - the offended (talk) 07:20, 11 September 2009 (UTC)

Inclusion of non-fiction PD book published in 1966

I have a question whether this book qualifies for inclusion on Wikisource. "Origin of North Dakota Placenames" by Mary An Barnes Williams. Published 1966 by the Bismarck Tribune. It was published without a copyright notice, so meets {{Template:PD-US-no-notice}}. It is a non-fiction gazetteer type of book listing the evolution, establishment, and etymology of North Dakota cities, towns, and villages. Can it be added to Wikisource? I don't have an electronic version, but have a hard copy checked out from my local library that I would enter into Wikisource. Its addition would help members of the North Dakota Wikiproject on the English Wikipedia improve city and town articles. Dcmacnut (talk) 04:06, 5 September 2009 (UTC)

Sounds okay. Are you able to scan it? If so, that would allow for it to be proofread and works its way to being a featured text. If you can, then please have a look at Help:DjVu files. -- billinghurst (talk) 04:49, 5 September 2009 (UTC)

Scanning may be difficult, as the copy is fragile and when it was rebound the pages weren't aligned very well. Google Books has a scanned copy from the University of Virginia, but it's only "snippet view." I'm hoping that it will be freely available once the Google Books settlement is resolved for legitimate PD books. For now, a manual effort seems the only option. Dcmacnut (talk) 14:50, 5 September 2009 (UTC)

Best of luck sounds like the best response. I would ask that you consider setting up a paired project page, and organise the ND stuff here if you have a chance. We love it when there are two-way interlinks. -- billinghurst (talk) 00:36, 6 September 2009 (UTC)

The Google Books settlement has nothing to do with legitimate PD books; the reason why it's not available from Google is that it's non-trivial to prove that it was a legitimate printing (bootleg printings wouldn't start the clock), and that it doesn't contain material by non-US authors not first published in the US that would have been restored by the URAA. I don't doubt it's clear for us to do, but that decision would be a lot harder for Google to make en masse. So I wouldn't expect to see Google Books release it as a public domain book anytime soon.--Prosfilaes (talk) 01:05, 6 September 2009 (UTC)

Good point on the Google settlement. Google has always taken the conservative route on PD volumes. It's just frustrating, that's all. From what I'm hearing, adding this book may be more work than it's worth. This is a very topic specific reference for only a few dozen articles. I don't understand the concept of "paired projects" and mainly am looking for a central repository for storing the book so I don't have to keep checking it out. I'll have to come up with something else. Thanks for the advice, everyone.Dcmacnut (talk) 04:06, 6 September 2009 (UTC)

Category names

I notice lots of category names with "Wikisource:" at the front and lots of others without. What is the policy here?--Filceolaire (talk) 11:49, 6 September 2009 (UTC)

Categories or pages in the WS: namespace? An example would be helpful. Generally pages with Wikisource: are for collections of information about works or projects. With the main namespace being reserved for works. Sometimes works in a series would have a lead-in page in the WS: namespace, with each work having its own place in the main NS as it is an individual work. -- billinghurst (talk) 12:18, 6 September 2009 (UTC)

I would guess that Filceolaire doesn't know that we have a special meaning for the word "category". Hesperian 00:53, 7 September 2009 (UTC)

Mystery symbol

There is a symbol used twice on Page:On the cultivation of the plants belonging to the natural order of Proteeae.djvu/21 that looks a bit like an upside-down Ezh. For example, see a few words to the right of "Stylurus".

In this context the symbol ought to be a digit. Possibly the type was set upside down or something. I would declare it a 3 except that there are 3s on the page that don't look much like it. Does anyone have any ideas? Hesperian 00:53, 7 September 2009 (UTC)

I just noticed that at the bottom of the page "frutices" and "aemulis" are typeset as "frntices" and "aemnlis" respectively. Perhaps the typesetter was hungover that morning. Hesperian 01:24, 7 September 2009 (UTC)

I would go with your latter theory, or maybe his contact lenses were cloudy, or ... -- billinghurst (talk) 04:38, 7 September 2009 (UTC)

I wonder whether this 3 might in fact be a 2, which somebody went over in ink to make it more legible because it was too pale at first? --Zyephyrus (talk) 20:00, 7 September 2009 (UTC)

My thanks to both of you. Zyephyrus, I think you've put me on the right track. 90% of the symbol coincides with a two, or 95% if you take into account the fact that the ink has bled across the middle on all the other twos on the page. The only bit that doesn't match a two is the gap in the top right, but this could be due to a damaged piece of type. The most compelling evidence is the fact that it looks like a two when viewed from a distance; I had been zoomed right in and hadn't thought to zoom out and look at the overall jizz of it. Hesperian 23:55, 7 September 2009 (UTC)

Breakup

What does everyone think of giving each of the four headings in this page their own page? It would be helpful for archiving and would be quite simple. Arlen22 (talk) 14:58, 9 September 2009 (UTC)

On one hand, I'm not a fan of breaking up a low traffic page for a village pump. On the other hand, I hate trying to find a new post when it's not at the bottom of the page like it should be. If we're going to split it up, I think we should think about killing a heading or two before splitting.--Prosfilaes (talk) 16:00, 9 September 2009 (UTC)

Currently, the format seems fairly consistent across the discussion pages, so the same code can be used to archive all of them. Breaking up one page could make archiving more difficult, since the code may need a special case for that page. I think a 4-way split would be a bit much, since only one section gets significant traffic; splitting that section should be sufficient, if needed. -Steve Sanbeg (talk) 16:14, 9 September 2009 (UTC)

You're probably right, Steve. The question page could be taken and put in a new page called General. Arlen22 (talk) 18:22, 9 September 2009 (UTC)

As far as general discussion pages go this one does not get a lot of traffic. With the fourth section getting most of the traffic splitting it off from the others would likely result in changes to the others not being noticed at all. I would be quite happy to see the notion of four sections abandoned. Eclecticology - the offended (talk) 06:46, 11 September 2009 (UTC)

Annals of the World

Hello Everyone,

I am looking for ideas on what to do with James Ussher's Annals of the World? I just created it a couple hours ago (12:14 Wikitime to be exact). Some of the pages are incredibly long. Have any ideas? Arlen22 (talk) 18:26, 9 September 2009 (UTC)

Actually, they don't make it to the longest 5,000 list. I am not quite so worried. Arlen22 (talk) 21:25, 9 September 2009 (UTC)

Annals of the World is Done! Arlen22 (talk) 00:19, 10 September 2009 (UTC)

It took 12 hours & 5 minutes to do it! Arlen22 (talk) 00:21, 10 September 2009 (UTC)

Actually, I forgot some stuff which I added later. Pride goeth before a fall... Arlen22 (talk) 14:43, 11 September 2009 (UTC)

academic papers?

I'm surprised that there are very few academic papers available on Wikisource, other than those listed at Category:research articles. However, given the large number of papers out there, shouldn't there be a lot more whose copyrights have expired?

I think it would be a great benefit to Wikisource if we had more academic papers here.

Heck, maybe we could even convince some journals to release certain articles under a free license or something! --Ixfd64 (talk) 01:12, 10 September 2009 (UTC)

Are you talking about stuff like scientific papers stating new discoveries? If so I am not sure what to say. I think there could be a section for that. Actually, that would be a good idea! Like what do you have in mind? Arlen22 (talk) 01:35, 10 September 2009 (UTC)

Yes, that's exactly what I'm thinking of. --Ixfd64 (talk) 19:52, 10 September 2009 (UTC)

That would be excellent, we could make a Wikiproject out of it. How about Wikisource:Wikiproject Science papers ? Got any other suggestions? Arlen22 (talk) 14:41, 11 September 2009 (UTC)

Why not just make a Wikiproject for academic papers with individual subprojects for philosophy, religion, anthropology, math, science, etc. There are some very valuable historic papers that are in the public domain for each of these fields. We might as well shoot large and let people work their way into the project (I'm already involved in history, philosophy, religion, and theology papers).

Alright, let's do it. Shall we call it Wikisource:Wikiproject Papers or Wikisource:Wikiproject Academic Papers ? Arlen22 (talk) 16:07, 11 September 2009 (UTC)

Let's do the second one, since it's more specific to the scope of the project.—Zhaladshar ^(Talk) 16:20, 11 September 2009 (UTC)

Done Wikisource:WikiProject Academic Papers Arlen22 (talk) 16:40, 11 September 2009 (UTC)

Concerns about fidelity of Internet Archive DjVu files

Following on from my "mystery symbol" discussion above, I now have some serious concerns about the nature of the DjVu encoding used by the Internet Archive, and whether the results can be considered faithful scans.

Here is an image of a paragraph taken from the raw tif files provided to the Internet Archive by Google Books:

And here is the same paragraph after the Internet Archive has encoded it into a DjVu file:

If you look closely you will see that

The R and E of "GREVILLEA" look quite different in the Google Books scan, but have been converted to exactly the same glyph in the Internet Archive DjVu file; and
The u's in "frutices" and "aemulis" have both been converted to what look like small-caps N's.

What worries me is that I can't see how this would have happened unless the Internet Archive's DjVu encoder knows something about what kind of glyphs to expect to find on a page, and is willing to take a guess as to which one is correct—a process tantamount to low-level OCR. If it is the case that the Internet Archive's DjVu processing is guessing glyphs rather than faithfully reproducing whatever it sees, then this casts serious doubt upon how we do our work here. What is the point of using scans to ensure fidelity, if the scans themselves lack fidelity?

Hesperian 01:29, 10 September 2009 (UTC)

Come to think of it, the encoder need not know about particular glyphs in advance. This output is just as easily explained by the encoder assuming that there are a relatively small number of glyphs, and trying to cluster the glyph instances that it finds into that number of glyph classes. But this is largely irrelevant; infidelity is infidelity whatever the cause. Hesperian 01:53, 10 September 2009 (UTC)

Looks bad Concern Arlen22 (talk) 01:36, 10 September 2009 (UTC)

Notice however, that it is correct where the same word appears at the bottom. Strange If in doubt, throw it out Arlen22 (talk) 01:39, 10 September 2009 (UTC)

Cygnis insignis has pointed out that w:JBIG2 probably explains what is happening here:

"Textual regions are compressed as follows: the foreground pixels in the regions are grouped into symbols. A dictionary of symbols is then created and encoded, typically also using context-dependent arithmetic coding, and the regions are encoded by describing which symbols appear where. Typically, a symbol will correspond to a character of text, but this is not required by the compression method. For lossy compression the difference between similar symbols (e.g., slightly different impressions of the same letter) can be neglected."

I suppose the issue for us is are we going to cop that? Hesperian 02:52, 10 September 2009 (UTC)

"The key to the compression method [JB2] is a method for making use of the information in previously encountered characters (marks) without risking the introduction of character substitution errors that is inherent in the use of OCR [1]. The marks are clustered hierarchically. Some marks are compressed and coded directly using arithmetic coding (this is similar to the JBIG1 standard). Others marks are compressed and coded indirectly based on previously coded marks, also using a statistical model and arithmetic coding. The previously coded mark used to help in coding a given mark may have been coded directly or indirectly."
— DjVu: Analyzing and Compressing Scanned Documents for Internet Distribution.[3] Haffner, et al. AT&T Labs-Research

— "So it goes", Vonnegut.
— Sigh, Cygnis insignis (talk) 03:52, 10 September 2009 (UTC)

Apparently the upshot of this is that this issue is inherent to DjVu, rather than specifically to the Internet Archive encoder. This is only the Internet Archive's fault inasmuch as they use very lossy compression. This is bad news all round. :-( Hesperian 04:10, 10 September 2009 (UTC)

It’s important to emphasize that this is a consequence of the specific compression settings IA has chosen for their djvu encoder. More reasonable settings can produce better results. I grabbed the topmost png image from this post, converted to PAM, and ran it through the c44 djvu encoder at the default settings. The result was:

Not perfect by any means—it would surely have been better to begin with the source TIFF, rather than a PNG; and tweaking the compression settings or using masks to isolate the foreground text could have produced a smaller file with comparable image quality. But a big improvement over IA’s scan, I think. I suppose the lesson here is to do our own djvu conversions whenever possible. Tarmstro99 (talk) 13:16, 11 September 2009 (UTC)

I'm not much involved in this project, but would it behoove us to make our own DJVUs for these works? If we can't even proof the scans, they're not much use to us.—Zhaladshar ^(Talk) 16:21, 11 September 2009 (UTC)

support Arlen22 (talk) 18:00, 11 September 2009 (UTC)

could you provide a link to the raw tiff and to the djvu at IA, where you spotted this ? ThomasV (talk) 18:52, 11 September 2009 (UTC)

The files available from IA are here. The example is derived from this page, this discussion follows on from the section above #Mystery symbol. Cygnis insignis (talk) 19:16, 11 September 2009 (UTC)

I see that there are actually two versions of the file online at Commons. The newer one (31 August 2009) is IA’s and contains the compression errors discussed above. The older one (10 August 2008) is GB’s and, at least on the page referenced above, is error-free. Perhaps rather than re-djvu from TIFFs, we could simply revert to the older, error-free version of the document that is already online? Tarmstro99 (talk) 18:59, 12 September 2009 (UTC)

Once I've taken full advantage of the OCR, I'll manually generate a smik DjVu from the jp2 images, and upload over the top. Hesperian 01:53, 18 September 2009 (UTC)

I think I have perfected a process: take the zip file of tiffs, uncompress it, convert the images to an uncompressed format (necessary for the next step), and convert them to a nice high-quality djvu file using gscan2pdf. The djvu is small, with higher quality than the one from archive.org. Can anyone give me a text that is really bad or, better yet, can we start making a list of texts that need replacement? --Mattwj2002 (talk) 14:16, 26 September 2009 (UTC)
- What's the process you use? I'm interested, because I'm trying to create DJVU files that are high quality but lower size. Right now I'm only getting pretty high-sized results.—Zhaladshar ^(Talk) 14:18, 26 September 2009 (UTC)
  - My process is pretty easy and involves using Linux. In my setup I use Ubuntu. The first step is to download the zip files from the Internet Archive. :) Once the download is complete, you'll have to unzip the file. This can be done either with the GUI or through the unzip console program. Then I use the following script to convert the tiffs to an uncompressed format (please excuse the messy coding):

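#!/bin/bash
# Convert each .tif to an uncompressed .tiff, numbered in listing order.
i=0  # the counter must be initialised, or the first output is named ".tiff"
for f in *.tif; do convert +compress "$f" "$i.tiff"; echo "$i"; i=$((i + 1)); done
mkdir -p tiff
mv *.tiff tiff/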

Then I take the files in the tiff directory and use gscan2pdf to make a djvu file. The djvu appears to be roughly the same quality as the original tiffs and a good size. I hope this helps. --Mattwj2002 (talk) 18:42, 26 September 2009 (UTC)

- - - One other point: some of the tifs (from the Internet Archive) are also bad quality. A good source might be PDFs directly from Google. If you go that route, I recommend the following command (please bear in mind it takes a lot of RAM and time):

pdf2djvu -d 1200 -o file.djvu file.pdf

This can be done using Windows or Linux. I hope this helps. --Mattwj2002 (talk) 09:27, 27 September 2009 (UTC)

Another way

I do mine using ImageMagick and DjVuLibre, both freeware.

For typical black-text-on-white-paper pages, use ImageMagick to convert the tif/jp2/whatever to pbm format. PBM format is bitonal - every pixel is either fully black or fully white. Thus converting the bulk of a scan to this format gives you huge compression. Generally "convert page1.tif page1.pbm" gives you a sensible result, though you can fiddle around with manual thresholding if you want. It all depends on how much effort you are willing to invest in learning ImageMagick. DjVuLibre's cjb2 encoder will convert a PBM image into a DjVu file for you.

For pages with illustrations, convert to PGM for greytone images, or PPM for colour images. Then use DjVuLibre's c44 encoder to encode to DjVu.
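The two per-page conversions described above could be sketched roughly as follows, assuming ImageMagick's convert and DjVuLibre's cjb2 and c44 are installed and on the PATH. The function names, the file names, and the 300 dpi default are illustrative assumptions, not part of the workflow as posted.

```shell
# Sketch only: function names and the default resolution are assumptions.

# Text page: reduce to bitonal PBM, then encode with the bitonal encoder.
encode_bitonal() {
    local src="$1" base="${1%.*}" dpi="${2:-300}"
    convert "$src" "$base.pbm" &&
        cjb2 -dpi "$dpi" "$base.pbm" "$base.djvu"
}

# Illustrated page: pass "pgm" for greytone or "ppm" for colour,
# then encode with the wavelet encoder.
encode_continuous() {
    local src="$1" base="${1%.*}" format="$2" dpi="${3:-300}"
    convert "$src" "$base.$format" &&
        c44 -dpi "$dpi" "$base.$format" "$base.djvu"
}
```

For example, "encode_bitonal page1.tif" would leave page1.djvu next to the source, and "encode_continuous plate.tif pgm 200" would do the same for a greyscale plate at 200 dpi. The resolution passed should match the actual scan.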

Finally, use DjVuLibre's djvm to compile all the single-page djvu files into a single multi-page djvu. I find that listing all the files at once under the -c option doesn't work. You need to append one page at a time.
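A minimal sketch of the one-page-at-a-time bundling, assuming DjVuLibre's djvm is on the PATH. It relies on "djvm -c" to create the document from the first page and "djvm -i" with no page number to append each later page; bundle_pages and the file names are made up for illustration.

```shell
# Sketch only: bundle_pages and its arguments are illustrative names.
# Usage: bundle_pages book.djvu page1.djvu page2.djvu ...
bundle_pages() {
    local out="$1" first="$2" page
    shift 2
    # Create the multi-page document from the first page.
    djvm -c "$out" "$first" || return 1
    # Append the remaining pages one at a time, in the order given.
    for page in "$@"; do
        djvm -i "$out" "$page" || return 1
    done
}
```

The pages end up in the order they are listed, so the single-page files need names that sort correctly if a glob is used.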

As for how to manage it all, rather than scripting, I find I get much more control and much more flexibility by enumerating the pages in a spreadsheet, and using formulae to construct the desired commands. e.g you can easily specify which pages should be treated as bitonal, which greyscale, and which colour, and define your formulae to produce the desired command for each case. Having done that, it is just a matter of copying a column of commands, and pasting it to the command line. It is a bit lowbrow, but it really does work well.
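For anyone who would rather stay in a terminal, the same idea can be approximated with a plain text list in place of the spreadsheet. This is a different technique from the one described above, offered only as a sketch: the manifest format (one "file type" pair per line) and the function name are assumptions. Like the spreadsheet column, it only prints the commands, so they can be checked before being pasted or piped to a shell.

```shell
# Sketch only: the manifest format and function name are assumptions.
# Reads "<file> <type>" lines on standard input, where type is
# bitonal, grey, or colour, and prints one command line per page.
# File names are assumed to contain no spaces.
emit_commands() {
    local file type base
    while read -r file type; do
        [ -n "$file" ] || continue   # skip blank lines
        base="${file%.*}"
        case "$type" in
            bitonal) echo "convert $file $base.pbm && cjb2 $base.pbm $base.djvu" ;;
            grey)    echo "convert $file $base.pgm && c44 $base.pgm $base.djvu" ;;
            colour)  echo "convert $file $base.ppm && c44 $base.ppm $base.djvu" ;;
            *)       echo "# skipped $file: unknown type '$type'" ;;
        esac
    done
}
```

With a list saved as pages.txt, "emit_commands < pages.txt" prints the column of commands for review.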

Hesperian 11:51, 27 September 2009 (UTC)

What's the quality of the DjVu you get when you use a bi-tonal input? I've been using pdf2djvu because it doesn't reduce the colors of the PDF images when converting to DJVU and I get a nice, smooth looking result. Do DJVUs from bi-tonal images look good (or at least decent and not choppy) when all is said and done? --Zhaladshar ^(Talk) 13:01, 27 September 2009 (UTC)

Make up your own mind:

Pages from the Internet Archive version of An introduction to physiological and systematical botany typically look like this.
The IA version was missing a couple of pages, which I obtained elsewhere and shoe-horned in, having converted them to bitonal and then encoded them into DjVu using the bitonal encoder. Those pages look like this and this.

Hesperian 13:17, 27 September 2009 (UTC)

That answers my question. Thanks. :) The quality isn't bad at all.—Zhaladshar ^(Talk) 13:24, 27 September 2009 (UTC)

To give an idea of what can be achieved, I managed to fit File:History of West Australia.djvu into 69Mb—not bad for 652 physically large pages at 200 dpi, including about forty plates that had to be retained in greyscale. It works out to about 250 pixels per bit; that's some serious compression. Hesperian 13:43, 27 September 2009 (UTC)

Redirects in Index namespace

Clicking the Edit button on an Index: page brings up (I believe) our {{Index}} template, which is ordinarily very helpful, giving the user lots of fields to fill in to supply the information needed for the index. Is there any way to disable this behavior? I would like to convert the (now unneeded) page Index:United States Statutes at Large/Volume 1 to be a redirect to Index:United States Statutes at Large Volume 1.djvu, where the underlying page content now exists. But there seems to be no way to do this. Am I missing something? Tarmstro99 (talk) 16:16, 10 September 2009 (UTC)

I couldn't circumvent the file's template MediaWiki:Proofreadpage_index_template using AWB, though it allows me to see the text, so it is pretty robust, and this is going to be up to ThomasV (talk • contribs) to answer. It may be something appropriate for oldwikisource:Wikisource_talk:ProofreadPage. -- billinghurst (talk) 01:38, 11 September 2009 (UTC)

When I wanted to create page scan indexes for a large number of issues of The Perth gazette and Western Australian journal, I asked Thomas whether it was possible to get direct access to the page code so that I could copy-paste the template. His suggestion was that I temporarily disable Javascript in my browser. It worked for me. Hesperian 05:29, 11 September 2009 (UTC)

Interesting as AWB doesn't display the page in its Edit box, it shows me

{{:MediaWiki:Proofreadpage_index_template
|Author=[[Author:United States Congress|United States Congress]]
|Title=[[United States Statutes at Large]], [[United States Statutes at Large/Volume 1|Volume 1]]
|Year=1845
...
|Remarks=
}}

and when I replace all that text with #REDIRECT[[Index:United States Statutes at Large Volume 1.djvu]] it simply won't save that as replacement text. So I would be interested to know if turning off the JS works. -- billinghurst (talk) 07:09, 11 September 2009 (UTC)

I thought of another solution, though it is a dirty hack. If you want a redirect from A to B, simply move B to A and back again. Hesperian 11:24, 11 September 2009 (UTC)

The suggestion to disable JavaScript worked. Thanks to all for your suggestions! Tarmstro99 (talk) 17:28, 11 September 2009 (UTC)

Print Version

What do you do on here for a print version? On Wikibooks you have a page called [[book name/Print Version]]. I tried it on here, but apparently the pages won't transclude if that makes the page too long. Arlen22 (talk) 16:34, 10 September 2009 (UTC)

Header problem(s)

Hello gang,

I've tried to create a custom header for the articles in the Executive Orders category but for some reason it tells me that no header is applied when I try to save one with my new header in place. See Executive Order 13502 for an example and the new header I'm trying to apply is Potus-eo. Thanks. George Orwell III (talk) 21:29, 8 September 2009 (UTC)

Fixed. I changed the filter to recognize any {{header}} template (based on its ID in the new HTML). This avoids the need to manually maintain a whitelist of such headers. —Pathoschild 23:42:19, 08 September 2009 (UTC)

Thanks - that seems to have done the trick. George Orwell III (talk) 00:32, 9 September 2009 (UTC)

Err... I made some revisions to that Executive Order template and now I'm seeing the 'missing header' message when I go to apply it again. Sorry to be a bother. On an unrelated note: Anybody know how to fix a partially corrupted PDF? George Orwell III (talk)

Still seeing that message. Left a note at BookofJude's page as well about it too. George Orwell III (talk) 22:48, 21 September 2009 (UTC)

As I mentioned on Due's page, the file seems to lack the necessary components of a recognised header (somewhere there is a discussion of this); to wit, you should chat directly to Pathoschild, as per [[{{DGRA}} header template]] on that page. -- billinghurst (talk) 13:50, 24 September 2009 (UTC)

As I replied to you there soon after - DISREGARD - I figured it out already. I sincerely thank you again for your attention. George Orwell III (talk) 14:45, 24 September 2009 (UTC)

The Grammar of English Grammars

To my userspace I have uploaded The Grammar of English Grammars from Project Gutenberg; also its table of contents.

The work is basically unformatted; I have only replaced underscores indicating italics with ' ticks for Mediawiki.

The work is divided into large chunks, the largest chunk having some 300,000 words.

Now I wonder whether I should move (or copy) the work to the main space, in the raw state in which it now is. It is nevertheless complete; the complete text has some 6 MB. And in Wikisource, it is easier browsable than in Project Gutenberg. What do you think?

Of course, I would add the top headers to the pages – that's not a lot of work – and I would pick appropriate work name and chapter names instead of those currently used in my userspace. --Dan Polansky (talk) 13:47, 11 September 2009 (UTC)

I don't see why you couldn't. Just put some maintenance templates on there so that it gets categorized well and people can tell if it needs work or not. Just a cursory glance makes it seem like this work will need a ton of formatting. It might also be worth trying to find a DJVU of the book so that we have a nice rubric to go on when formatting it.—Zhaladshar ^(Talk) 14:05, 11 September 2009 (UTC)

Further comment would be that if you can identify any menial tasks, then we can look to get bots to undertake those components. Add them at Wikisource:Bot requests.-- billinghurst (talk) 07:40, 12 September 2009 (UTC)

Can I upload a scan of a PD(?) book and how?

These may be obvious questions, but, as a rather new user to this site, even after much searching, I still do not know the answers:

I am considering whether to scan a 2005 book which is a facsimile reproduction of an older book, which would seem to me to be PD at least in the United States. The original book was published in Trieste in 1907, and the author died in 1924. There is no editing or added material in the facsimile edition, and even the page numbers are reproduced from the original. (The original book does not seem to be available at Google Books, and the facsimile edition is available only in limited preview.)

I would like to scan the book into a multipage file (I know how to do this for PDF format) and would like to upload the file to some place, where it would be available to anyone for download and reading.

Best way to have files is as a DjVu file and there is info at that page about how to prepare. We can offer further advice, so rather than tell you about sucking eggs, read first, and we can progress. Want it to have been OCR'd before it gets converted. Further answers below. -- billinghurst (talk) 07:37, 12 September 2009 (UTC)

My questions are:

1) Would scanning such a book and uploading it violate any copyrights?

If it is an exact facsimile, I don't believe that the newer copy could be copyrighted. There is no violation of the original copyright.

2) Is Wikisource the right place to upload the file?

The preference is that files are uploaded to Commons, if they meet the copyright criteria. They can be directly linked to from Commons to WS. Information at Side by side proofreading.

3) If so, how do I upload the file?

Commons:Special:Upload

Thanks for help! --Robert.Allen (talk) 06:52, 12 September 2009 (UTC)

Thanks for the prompt help. I forgot to mention an important point: the book is in German, I have now learned I will need to add it to the German Wikisource until a translation into English is made (if ever). I have created an account on that site also. --Robert.Allen (talk) 02:32, 13 September 2009 (UTC)

Instructions basically don't change, as same reasoning, same rules, across the WS domain. In fact even more important that it goes to Commons, as the image is then globally available, whereas uploading image file to enWS, limits it to enWS. -- billinghurst (talk) 03:05, 13 September 2009 (UTC)

Good to know, since I'm pretty slow reading German. Thanks for all the help! --Robert.Allen (talk) 19:41, 13 September 2009 (UTC)

Meta-text environments and audience customizability

I have been contemplating the audience of Source and customizability of our meta-viewing environment. Let's say we have a text. The text is extant in three versions in one script and there is a salient variation in each version. There are also extant versions in two other diverse scripts, translations done at different time periods. Then there is to be a transliteration or possibly transliterations, let's say a romanization for example and IPA. Then there is an English translation and a French translation done by gifted translators/linguists. There is also a verse by verse purport and commentary in both English and French done by different respected scholars. All of these options are in the public domain and are uploaded in Source. Each person who engages the text may wish to foreground different possibilities in juxtaposition. A person may wish to view a verse of the oldest extant text in indigenous script in juxtaposition with the English translation. But another person would like to look at the IPA in juxtaposition with the purport/commentary for example. Another person may want to juxtapose two of the extant sources with annotations of salient differences and the historical dimensions of the text. Another person may wish to juxtapose the French and the English purports. Do we have such functionality? I would appreciate some direction. I appreciate we will soon have powerful translation tools embedded within Source, but this is something different. This is enabling the audience to engage a rich textual tradition in a way appropriate to their needs at a given time. Please post the response on my Source chittychat page. Moreover, annotations: is there a way to turn on and off meta-text annotations to a text?
Respectfully
B9hummingbirdhoverin'^chittychat 15:29, 12 September 2009 (UTC)

A person can always view two pages in different tabs or windows of his browser. Is that what you mean? Arlen22 (talk) 19:30, 12 September 2009 (UTC)

We do have a "side-by-side" view capability, for viewing a translation and an original alongside each other in a single page. But in general we are years behind when it comes to the kinds of advanced usability functions that you are talking about here. Hesperian 01:51, 18 September 2009 (UTC)

Help with incomplete DjVu

Recently, I uploaded File:A History of the University of Chicago by Thomas Wakefield Goodspeed.djvu from archive.org and have been working on it at Index:A History of the University of Chicago by Thomas Wakefield Goodspeed.djvu. However, I have since realized that at least one page is missing (page 222-3), a couple of pages are cut off, and some pictures and the title page are repeated (once in decent quality and once in very poor quality). Initially I tried converting the .pdf at Google Books, but that created some bizarre issue at Commons (there's a discussion of it above). There are other versions on archive.org, but they all have it with two pages per image. So my question is: is there a way to deal with or split two-page djvu files? And if not, would anyone like to take a stab at converting the Gbooks pdf into a djvu file? Thanks DroEsperanto (talk) 23:03, 14 September 2009 (UTC)

DjVuLibre's djvm can be used to delete a single page or insert a single page, but there seems to be no feature for extracting pages. However you can write a batch script to delete all but one page from a file, then insert it into the other where it belongs. And then do the same for the other. Sounds tedious, I know. It is. Hesperian 23:11, 14 September 2009 (UTC)

From a djvu file named file.djvu to extract page 33 into a single page djvu file named page_33.djvu, use

djvused -e "select 33; save-page page_33.djvu;" file.djvu

; then you can use djvm to insert it elsewhere. Phe (talk) 19:48, 15 September 2009 (UTC)
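Putting the two steps together, a rough sketch of copying one page from one file into another might look like this, assuming DjVuLibre's djvused and djvm are on the PATH. The function name, the temporary file name, and the page numbers in the usage line are illustrative only.

```shell
# Sketch only: copy_page and its arguments are illustrative names.
# Usage: copy_page source.djvu 33 target.djvu 40
#   extracts page 33 of source.djvu and inserts it as page 40 of target.djvu.
copy_page() {
    local src="$1" src_page="$2" dest="$3" dest_page="$4"
    local tmp="page_${src_page}.djvu"
    # Save the selected page as a single-page file...
    djvused -e "select $src_page; save-page $tmp" "$src" &&
        # ...then insert it into the destination at the requested position.
        djvm -i "$dest" "$tmp" "$dest_page"
}
```

A damaged or duplicate page can be dropped from the destination beforehand with "djvm -d target.djvu N". If the extracted page turns out to be unusable on its own because the source shares components between pages, djvused's save-page-with may be worth trying in place of save-page.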

Question about logo

What does the image of the iceberg intend to convey? -- OlEnglish (talk) 17:21, 16 September 2009 (UTC)

I was not involved with the choice in times past, though one would think that it refers to the fact that only about 10% is visible above the surface, with 90% not visible below the surface. -- billinghurst (talk) 20:13, 16 September 2009 (UTC)

When first begun "Wikisource" was called "Project Sourceberg" as a play on "Project Gutenberg" and the "Sourceberg" image of an iceberg was developed during that period.--BirgitteSB 23:39, 16 September 2009 (UTC)

Ah, this is useful information. Looking back, I don't like the idea of wordplay in English, when there are over 50 languages for this project now. Vinhtantran (talk) 09:20, 23 September 2009 (UTC)

Asking for bot permission

After reading Wikisource:Bots, I'd like to ask bot flag for my bot.

Bot account: User:TVT-bot
Owner: Vinhtantran@vi.wikisource
Functions: Interwiki
Framework: pywikipedia
Automatic: No/Manual
Namespace: main namespace, Wikisource, Help, Author

I will make some test edits soon. Vinhtantran (talk) 01:20, 18 September 2009 (UTC)

Does your script run interwiki programming in the main namespace? If so, how do you avoid the problem that the works in Wikisource do not have a 1:1 ratio across languages?--BirgitteSB 23:42, 18 September 2009 (UTC)

Definitely would like to see the bot in action, and the information about the namespace(s) in which you are looking to undertake the interwiki. Generally we wish to see the bot in action following approval to act; once it has proved that it suits our needs and doesn't break anything, it will be given a bot flag by our 'crats. -- billinghurst (talk) 03:06, 19 September 2009 (UTC)

I'm currently using the Python Wikipedia Framework, run manually from vi.wikisource. That means I will only update interwiki links when there are new pages in vi.wikisource, so the traffic will be low. Moreover, I fully control the bot while it's running: it alerts me when it finds multiple interwiki links to the same project, and I will choose to ignore those pages. The only problem I have found is "zh.wikisource", which uses several aliases for one project, like "zh-tw" and "zh-cn", so I will ignore these pages, too. Vinhtantran (talk) 11:59, 22 September 2009 (UTC)

Support trial. -- billinghurst (talk) 13:19, 24 September 2009 (UTC)

{{copyright author}} from WS:COPYVIO

At Wikisource:Possible copyright violations the following point was raised about {{copyright author}}:

“Although the work has already been deleted, and the discussion closed, I am briefly reopening this section in order to draw attention to and promote the use of the template: {{Copyright author}}. This template, originated by John Vandenberg, to be placed on the author page, says that a preliminary search indicated the author has no public domain works with the possible exception of little-known early works. It is already being used on pages of modern authors such as Langston Hughes, W. H. Auden, Hannah Arendt, Kurt Vonnegut and John Steinbeck. Neither of the Wikisource veterans Billinghurst nor Eclecticology seemed to be aware of the template, so I suspect other interested users and admins weren't either. I wonder if this copyright notice might be more appropriately featured in a more prominent position on the page (it's at the bottom on some of the pages of the authors I listed) especially when the {{populate}} template is used, which sends mixed messages in any case and might call for that template's removal from that page altogether.”
—ResScholar

So initially this is to share.

I have added this tag to Help:Copyright tags
Looking at our guidance at Wikisource:Style guide, it would seem too specific to add there; though maybe not; alternatively
It may be something that we could add to {{author}}
Re Positioning. If we apply the tag, I am wondering whether we really are going to be asking them to {{populate}}

-- billinghurst (talk) 08:35, 19 September 2009 (UTC)

Wikisource:Scriptorium/Archives/2008-02#Proposals contains the archived discussion where John Vandenberg proposed the use of the template under the name of {{Author-PD-none}}. ResScholar (talk) 23:06, 19 September 2009 (UTC)

I'm not a real big fan of this template. US authors who started publishing after 1963 probably don't have any works in the public domain. US authors who published before that could have works in the public domain up to that line, and not always obscure ones, either. (Harry Harrison's Deathworld is one of his best-known works, and was not renewed.) Non-US authors who didn't publish in the US almost certainly have nothing post-1923 in the public domain, unless they died early enough to fall into the public domain in their home country.--Prosfilaes (talk) 01:15, 20 September 2009 (UTC)

Family History

As my mother is from France, I'm attempting to do a family search on her side as well as the ones here in the United States. She has often told me that her family came from a noble background. The original family name was Leclairc, with the c on the end. She also states that one of my ancestors was a General under the "Little General", in France: General Leclairc. If anyone has information or knows of a source of information, I would be greatly indebted to hear from you!—unsigned comment by Bigrockerguy (talk) .

Our project is reproducing sources, and being able to point to specific genealogical sources for your research is beyond our remit. You would be better to try a site like RootsWeb for that advice. -- billinghurst (talk) 02:05, 20 September 2009 (UTC)

Reference inside reference

How should a note inside a note be handled, as on this page? Phe (talk)

You can do it with grouped footnotes, but the outer <ref> tag has to be replaced with a call to the {{#tag}} magic word. For example,

Text with footnote{{#tag:ref|Footnote with nested footnote.<ref group="nested">Nested footnote.</ref>|group=top}}

<references group="top"/>
<references group="nested"/>

yields

Text with footnote^{[top 1]}

↑ Footnote with nested footnote.^{[nested 1]}

↑ Nested footnote.

Hesperian 02:13, 21 September 2009 (UTC)

Thanks. I tried [4] to see whether it can work without groups, but the note order is reversed. 16:03, 21 September 2009 (UTC)

Yes, I see the problem. I don't think there is a solution to that. Hesperian 02:37, 24 September 2009 (UTC)
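For anyone generating such markup with a script, the grouped {{#tag:ref}} form shown earlier in this thread could be assembled along these lines. This is a minimal sketch: the helper name `nested_footnote` and its parameters are made up for illustration, and it assumes the footnote texts contain no "|" characters, which would need escaping inside a parser function call.

```python
def nested_footnote(outer_text, inner_text,
                    outer_group="top", inner_group="nested"):
    """Build wikitext for a footnote that itself contains a footnote.

    The inner note uses an ordinary <ref> tag; the outer note is
    written with the {{#tag:ref|...}} magic word, as described above.
    Assumes neither text contains a literal "|".
    """
    inner = f'<ref group="{inner_group}">{inner_text}</ref>'
    # Doubled braces in the f-string produce literal "{{" and "}}".
    return f"{{{{#tag:ref|{outer_text}{inner}|group={outer_group}}}}}"
```

Calling `nested_footnote("Footnote with nested footnote.", "Nested footnote.")` reproduces the markup from the example; the two `<references group="..."/>` lists still have to be placed on the page by hand.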

DjVu navigation buttons?

Currently, the only way of navigating a scanned page is to click the mouse (toggles between normal size and full size) and to drag it (moves the fully zoomed page around).

However, since some users may not be accustomed to this, it would be nice if the DjVu pages also had these features:

four directional navigation buttons
a way to adjust the zoom more finely, such as "+" and "-" buttons, or a slider bar
a button for resetting the zoom

How difficult would it be to implement these changes? --Ixfd64 (talk) 18:05, 22 September 2009 (UTC)

On my system, at least, turning the mouse scroll wheel zooms in and out in small increments on the portion of the image directly under the pointer at that moment. I don’t think this is due to any idiosyncratic setup of mine (Firefox 3.5, WinXP). I do agree that this could all be better documented, and that any investment of effort into making the proofreading process easier on new users is likely to pay significant dividends. Tarmstro99 (talk) 18:18, 22 September 2009 (UTC)

It seems that there are buttons for changing and resetting the zoom when the user is in editing mode. However, those buttons are not present when the user is not editing the page. It would be useful if those buttons were available in reading mode as well. Also, it would probably be less confusing if the buttons were located above the scanned page rather than the editing window. --Ixfd64 (talk) 19:03, 22 September 2009 (UTC)

The current arrangement works for me; the effect on page loading is an issue. The zoom is intuitive in view mode, and a reader or editor can verify something from it in most cases, so buttons and full zoom are redundant. The edit mode is a customisable and larger page; if I need the greater zoom, it is likely I will be editing the page. The scroll wheel zoom, in edit mode, is universal I think. The buttons might be better placed on the right of the toolbar, IMHO. Cygnis insignis (talk) 09:03, 23 September 2009 (UTC)

Code update

Yay! A new special page and coloured quality indicators. The announcement is here. Detailed help and description will follow. ThomasV (talk) 20:56, 24 September 2009 (UTC)

Winner ThomasV Congrats and thanks! -- billinghurst (talk) 23:38, 24 September 2009 (UTC)

Great. I'm looking forward to seeing how this <pages> tag works; it sounds like it is going to change how we lay out pages quite a lot. Hesperian 01:34, 25 September 2009 (UTC)

Socialist Unity

I added the text Socialist Unity. I thought the text was translated by Ted Crawford, but it turns out it was transcribed by him. Since we don't know when this transcription took place, do we have to delete this text? When the work itself was published it was within the public domain era. To make matters more confusing, this is a translated text. Any help would be greatly appreciated. Thanks. --Mattwj2002 (talk) 20:03, 25 September 2009 (UTC)

The work, as translated, needs to have been published before 1923, or fall under some other PD condition.--Prosfilaes (talk) 22:47, 25 September 2009 (UTC)

Unwatched pages

I know this is considered somewhat sensitive information, so if someone feels like deleting this edit, I won't be upset, though I will think it a little excessive. But we have a huge number of Encyclopedia Britannica pages on the unwatched list, and it would be helpful if some of the EBers started putting some of them on their watchlists. The list of unwatched pages is a bit long, and this is one piece that seems to me like it could go pretty easily.--Prosfilaes (talk) 02:19, 26 September 2009 (UTC)

film and drama scripts

I know this may be a silly idea, but would it be reasonable for Wikisource to include scripts (those whose copyrights have expired, of course) for films and other works of drama, such as notable plays? --Ixfd64 (talk) 16:37, 26 September 2009 (UTC)

If it is published and in the public domain, I believe that it is fair game, as per WS:IO. If we have sheet music, I am not sure that we could exclude plays. -- billinghurst (talk) 16:50, 26 September 2009 (UTC)

I concur with billinghurst. I don't think we can reasonably exclude scripts (nor do I see why we would want to), so if you can find PD scripts, by all means add them.—Zhaladshar ^(Talk) 16:54, 26 September 2009 (UTC)

newsletters from the Army 1783rd Engineers Supply Company

We have uncovered a series of newsletters from the Army 1783rd Engineers Supply Company near the end of World War II. Each is typically 6 pages and provides an interesting insight into events and overseas life in the army. We are scanning them in as part of our family history, but thought they might be worthy of an archive somewhere for the future. Is this the right place to think about archiving them? They are images of typewritten text documents.

Any advice on this?

Samples at: [5]

Guy Hall —unsigned comment by Guy Hall (talk) .

Sure, we'll accept them.--Prosfilaes (talk) 03:20, 27 September 2009 (UTC)

A follow-up: Found a fragile but complete original of the Stars and Stripes 1945 Souvenir Pacific Christmas Edition. Have found references online to copies in library or museum collections, but no online images. Are there copyright issues? No indication of restriction in the newspaper. Can this be legally uploaded?

The Stars and Stripes can be legally uploaded.--Prosfilaes (talk) 03:45, 27 September 2009 (UTC)

Ideally I would have thought that they would be loaded at Commons:, and we can load them in from there. Otherwise, we could look to upload them to WS and move them to Commons as able. The other thing to ask is how you are scanning and compiling them. Ideally they would be compiled as multi-page editions (TIFF, DjVu or PDF). -- billinghurst (talk) 07:26, 27 September 2009 (UTC)

Copyright on Commons (and Beatrix Potter)

I'd like to note that the Wikimedia Commons uses different copyright policies than we do; they don't host a work until it's out of copyright in its home country. So images from books like those of Beatrix Potter can't be hosted on Wikimedia Commons until life+70 has expired.--Prosfilaes (talk) 16:47, 28 September 2009 (UTC)

That will partly be my fault. I was talking to the person about what we could and could not transfer to Commons in the general principles, explaining it needs to make the Commons criteria. I gave examples of certain works. I didn't put 2 & 2 together when he started moving over other specific works, my brain was past thunking at that point, and not in the deletion phase, though I was awake enough to put bot copyright tags onto them. -- billinghurst (talk) 08:01, 29 September 2009 (UTC)

Indexpages display

Hello, Kipmaster suggested to display the Special:Indexpages this way, and I like it! Just copy my User:Yann/monobook.js or add importScript('User:Yann/monobook.js');. Don't forget shift+reload. Yann (talk) 08:45, 30 September 2009 (UTC)

Looks interesting. I copied your script to a gadget, to simplify using it. -Steve Sanbeg (talk) 18:12, 30 September 2009 (UTC)

Is something wrong with the gadget? When I use it and go to Special:IndexPages, I'm getting these errors.—Zhaladshar ^(Talk) 21:56, 30 September 2009 (UTC)

Works fine for me, in Firefox; I just copied over the comment from the script (works in FF, definitely not in IE...) to the gadget description, since a lot of people may not be able to use it yet because of that. -Steve Sanbeg (talk) 22:56, 30 September 2009 (UTC)

It doesn't work for me in either FF or Chrome. Could it be conflicting with one of my other gadgets?—Zhaladshar ^(Talk) 23:23, 30 September 2009 (UTC)

That's not it. Even disabling all other gadgets I get the error.—Zhaladshar ^(Talk) 23:25, 30 September 2009 (UTC)

Well, I figured it out. It's something in my monobook.js that does it. No idea yet which part is doing it.—Zhaladshar ^(Talk) 23:28, 30 September 2009 (UTC)

Importing of Doctrines of The Salvation Army

Would it be OK to import the above text? It's from SAWiki (Salvation Army), which is under Creative Commons. I know it's PD because it was written by William Booth (1829-1912), founder and first General of The Salvation Army. The category would be Articles of faith.--kathleen wright5 (talk) 23:04, 30 September 2009 (UTC)

Absolutely, just be sure to add it to Wikisource:Salvation Army (and if the subject interests you, perhaps copy over some of the other works listed there as well) Sherurcij ^{Collaboration of the Week: Author:Carl Linnaeus.} 23:08, 30 September 2009 (UTC)

Wikisource:Scriptorium/Archives/2009-10