Wikisource:Bot requests

Bot requests

This page allows users to request that an existing bot accomplish a given task. Note that some tasks may require that an entirely new bot or script be written. This is not the place to ask for help running or writing a bot.

A bot operator performing a task should make note of it so that other bots don't attempt to do the same. Tasks that are permanently assigned or scheduled for long-term execution are listed on Persistent tasks. See also Wikisource:Bots.

Move all subpages of Who's Who in the Far East to use title case

I was informed by User:Beeswaxcandle that I should use title case instead of all caps in article names. So I request to move all subpages of Who's Who in the Far East to use title case. Although I can use a bot to move them myself, that would leave tons of redirects for admins to delete. But if an admin can easily batch-delete a list of pages, I can move them myself and then provide the list of pages to delete. I'm sorry for the inconvenience. Thanks, --Stevenliuyi (talk) 08:58, 6 May 2021 (UTC)

@Stevenliuyi: Please review the list at Wikisource:Bot requests/sandbox. I notice that there is at least one English name that needs to be fixed, and the Chinese names didn't convert with the regex that I used. Would you fix or create the targets (only) in the pair list? I will then get it done. No need to fix those that are broken, though you should fix the previous/next links of the articles either side. Note that, as I did for your other work, I will look to get a work-specific template in place, though I will do that afterwards. — billinghurst sDrewth 13:10, 24 May 2021 (UTC)
I suppose that I really do want to ensure that the Chinese names are capitalised properly. — billinghurst sDrewth 02:57, 25 May 2021 (UTC)
@Stevenliuyi and @Billinghurst: Has this request been actioned (i.e. can it be closed as resolved)? Xover (talk) 10:34, 10 April 2022 (UTC)
@Stevenliuyi: Please see Billinghurst's request (above) for quality control of the list of targets in Wikisource:Bot requests/sandbox. They have done the legwork to prepare for the move, but it is unable to progress until you've checked and corrected the target page names. Xover (talk) 05:33, 3 September 2022 (UTC)

Add {{R from case citation}} to all redirects from case citations to cases

Would it be possible for a bot to detect redirects pointing from a case citation (e.g. 347 U.S. 483) to a case (here, Brown v. Board of Education), and add {{R from case citation}} to these? It would be useful to have all case citations in one category, given the obvious utility of Wikisource as a caselaw database for lawyers. BD2412 T 04:25, 24 May 2021 (UTC)

@BD2412: What's the identifying feature of a case page? Is it in a particular category? Does it contain a unique template? Once identified, should all redirects to these be so tagged, or only a subset of them? Do the redirects to these have a distinguishing feature to identify them as distinct from all other redirects on the site?
Adding a category to a set of pages is pretty straightforward, so the challenge is how to identify those pages automatically. Doing something to one page (the redirect) based on properties of another page (the actual case page) can also be challenging depending on the details (it may require writing a custom bot rather than just running one of the existing scripts for pywikibot).
Also, how many of these are there? If the criteria are complex, and the number of pages relatively low, it may be better to do it manually or semi-automated (a user script in the browser that finds and tags redirects to the current page on request, say). Xover (talk) 04:06, 25 May 2021 (UTC)
I have been thinking about exactly those issues. I don't know that we have any case citation redirects from non-U.S. cases, so the group to start with would be documents in the category tree under Category:United States case law by court. The redirects themselves will all be in a [Number] [Reporter] [Number] format, so for example the first page of results from a search for pages starting with "1" is almost entirely redirects to cases (everything from 100 L.Ed. 1003 on, with a few exceptions). So, anything in that format redirecting to something in that category tree should be a case citation redirect. I would also note that a great many of these were generated by User:BenchBot when that bot was active, and just counting those from the bot's contributions, there are over 9,000. I suppose I could do those manually, or use BD2412bot, but that will leave stray case citation redirects added by others. BD2412 T 04:25, 25 May 2021 (UTC)
I am doing some manually to see if there are any hitches that come up that way. BD2412 T 19:30, 25 May 2021 (UTC)
@BD2412: I have to admit I completely forgot about this request. Sorry. If the page selection logic is "Check all pages in Category:United States case law by court for incoming redirects, and pick the redirects whose page name matches [Number] [Reporter] [Number]" then I think it's probably doable. It's going to require writing a custom bot though, which I've not yet done so I'd need to find time for both the bot coding and the learning curve (which means it'll be a while before I might tackle it).
Mpaa is vastly more skilled than me in this area though, so perhaps they could be persuaded to help?
Hmm. Or possibly there is a Toolforge tool for querying the wikis for pages that match these criteria? Once we have the list of redirects, adding the template to all of them is trivial; it's getting the list of pages (redirects) that's a little bit challenging. Quarry maybe? That'll need understanding SQL JOINs, which make my head hurt, but should be doable for someone with a bit of DBA in their mix. PetScan is also often great for this kind of thing, but I don't think it can query for incoming redirects like this. Hmm. And you could probably do this on-wiki in JavaScript too, come to think of it. Xover (talk) 06:26, 3 September 2022 (UTC)
I believe I did most of these manually at some point. Still, there will always be new ones. BD2412 T 06:32, 3 September 2022 (UTC)
Oh, ok. Should we consider this request resolved then (so it'll get archived)? Or were you looking for a permanent bot task to do this automatically as new cases are added? Xover (talk) 06:45, 3 September 2022 (UTC)
A bot to pick these up as new cases are added would be nice. I don't think I could track that manually. BD2412 T 19:01, 3 September 2022 (UTC)
@BD2412, @Xover
To summarize:
1. walk recursively Category:United States case law by court
2. find redirects that point to pages in the above categories
3. if redirect title matches '\d+ [^ ]*? \d+' and does not contain {{R from case citation}}, append it
I made a test for some articles in Category:United States Supreme Court decisions in Volume 107.
Are these edits OK? Mpaa (talk) 22:46, 3 September 2022 (UTC)
@Mpaa: yes, those are absolutely correct. BD2412 T 23:04, 3 September 2022 (UTC)
@BD2412 ongoing. This query gives pages pointed to by a redirect matching the regex but not yet tagged with {{R from case citation}}. From here there are several possibilities to ensure they are pages of interest. I intersected them with pages belonging to (subcategories of) "Category:United States case law by court". Mpaa (talk) 21:20, 4 September 2022 (UTC)
Done. Mpaa (talk) 21:04, 5 September 2022 (UTC)
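
For reference, the tagging rule summarised in steps 1 to 3 of this thread might be sketched as below. This is only an illustration and not the bot that was actually run: the function names are hypothetical, the regex is the one quoted in step 3, and anchoring it to the whole title (fullmatch) is an assumption, since the thread does not say how the match was applied. Walking the category tree and finding incoming redirects (steps 1 and 2) are left out.

```python
import re

# Regex quoted in step 3 of the thread; matching the whole title is an assumption.
CASE_CITATION = re.compile(r"\d+ [^ ]*? \d+")
TEMPLATE = "{{R from case citation}}"


def is_case_citation_title(title: str) -> bool:
    """Return True if the redirect title looks like '[Number] [Reporter] [Number]'."""
    return CASE_CITATION.fullmatch(title) is not None


def tag_redirect(title: str, wikitext: str) -> str:
    """Append the template to a redirect's wikitext when the rule applies.

    The text is returned unchanged if the title does not match or the
    template is already present (hypothetical helper, for illustration).
    """
    if not is_case_citation_title(title):
        return wikitext
    if TEMPLATE.lower() in wikitext.lower():
        return wikitext
    return wikitext.rstrip("\n") + "\n" + TEMPLATE + "\n"
```

Note that the quoted regex allows no space inside the reporter name, so a title with a multi-word reporter would not be picked up by this sketch.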

Wikidata bulk edit

I made a query for works on enWS that have WD items with no "instance of" statement. The criteria I used are:

  • Pages in mainspace
  • No redirects or disambiguation pages (this includes Versions and Translations btw)
  • Does not contain a forward slash in the page name (in order to exclude subpages)
  • Is linked to Wikidata, and linked Wikidata item does not have a P31 statement

This query returns 13889 results, which is more than even QuickStatements can handle. Would it be possible for a bot to update these Wikidata items with P31=Q3331189 (instance of = version, edition, or translation)?

Thanks :) —Beleg Tâl (talk) 13:22, 1 November 2021 (UTC)
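
One possible way to keep such a list within what QuickStatements accepts is to split it into smaller batches. The sketch below assumes the query results are available as a plain list of item IDs and emits tab-separated QuickStatements V1 lines (item, property, value). The function name and the batch size of 1000 are hypothetical, and the default P31=Q3331189 simply mirrors the request above; it does not settle which value is appropriate for each group of works.

```python
from typing import Iterable, Iterator, List


def quickstatements_batches(
    qids: Iterable[str],
    prop: str = "P31",
    value: str = "Q3331189",
    batch_size: int = 1000,  # arbitrary illustrative size, not a documented limit
) -> Iterator[str]:
    """Yield QuickStatements V1 command blocks, one block per batch of items."""
    batch: List[str] = []
    for qid in qids:
        # V1 syntax: item, property and value separated by tabs.
        batch.append(f"{qid}\t{prop}\t{value}")
        if len(batch) == batch_size:
            yield "\n".join(batch)
            batch = []
    if batch:
        yield "\n".join(batch)
```

Each yielded block can then be pasted into QuickStatements as a separate batch.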

I think we could be more specific for certain groups, e.g. I have addressed "Presidential Radio Address" articles as "instance of speech". There are several groups of articles that can be identified and then addressed with QuickStatements. After that, the bot can be run on what is left. Mpaa (talk) 23:13, 1 November 2021 (UTC)
@Mpaa: Except they are editions as we host them, the speech would be the parent to the item, per d:WD:Books as there may be other published editions of the same speech. — billinghurst sDrewth 12:17, 5 September 2022 (UTC)
@Billinghurst I see. I saw others were linked that way and I followed along. If it is not correct, it should be cleaned up but I do not master wikidata tools enough to write a bot for it. Mpaa (talk) 21:34, 5 September 2022 (UTC)
We desperately need better Wikidata tools (so we're not dependent on Billinghurst to be on eternal vigilance here). But the current gadget we have for this is loaded from some user's personal page on Russian Wikisource (which is kinda iffy in itself these days), and its code is completely incomprehensible. If anybody knows of or runs across good API docs for how to talk to Wikidata I'd be very interested. As far as I can tell, the only existing API is the main MW:API with some very minor additions for WD, and that's way way too painful to use for our purposes. Xover (talk) 06:15, 6 September 2022 (UTC)
@Xover: Maybe we should just be bold and create a phabricator task and see where we go. We probably should have put this into the desired toys to be built for 2023, though we have missed that boat as it is currently in final stages of voting (I think). — billinghurst sDrewth 05:40, 22 February 2023 (UTC)
User:Beleg Tâl why not just do it with Petscan itself, from memory it could do additions. Also note that there is the interwiki Petscan: for these. — billinghurst sDrewth 12:14, 5 September 2022 (UTC)

  Comment wondering whether we need to chip out components of this task. For example, something like petscan:23959659 shows works using {{Act of Congress}} which would not be edition, and would instead be another item, and they also have components that could have other elements added through QuickStatements. Yes, this will still need a large slab of works that need version, edition, or translation (Q3331189) added, though at least it will allow for something less than the blunderbuss approach. — billinghurst sDrewth 05:24, 27 February 2023 (UTC)

The three volumes of The Last Man only have a different title page between the first and second edition. Could the proofread text of the three volumes of the second edition be copied to the scans of the first edition? Languageseeker (talk) 23:29, 16 July 2022 (UTC)

@Mpaa If it is OK to copy also the Page status, better wait for all 3 vols to be validated. Mpaa (talk) 13:53, 18 July 2022 (UTC)
Makes Sense. 13:49, 22 July 2022 (UTC)

This one's a bit more complicated than my previous request. I'm working on doing the styles with templates and CSS classes. Fortunately, the users who worked on these pages have been remarkably consistent with the formatting markup, so it's pretty easy to update with find and replace (I've done a couple pages). The replacements I'd like for each subpage, in order, are:

Replace

{|cellspacing="0" cellpadding="6" border="0" width="90%" align="center"|

with

{{World Factbook 2004 table/header}}

Replace

{|width="90%" border="0" cellspacing="0" cellpadding="6" align="center"|

with

{{World Factbook 2004 table/header}}

Replace

!align="left" valign="middle" width="20%" height="31" style="background:#CCCCCC;"|[section]
!align="left" valign="middle" width="80%" height="31" style="background:#CCCCCC;"|[country]

with

{{World Factbook 2004 table/section|[section]|[country]}}

Replace

! width="20%" align="right" valign="top"|

with

!|

Replace

| width="80%" align="left"  valign="top"|

with

||

Delete

{{anchor|[section]}}

Delete

|}
{{World Factbook 2004 table/header}}

Delete

|}

{{World Factbook 2004 table/header}}

For clarity, [words in brackets] are placeholders, and a blank line at the end of a code block means to include the ending linebreak in the pattern. Thanks! —CalendulaAsteraceae (talkcontribs) 05:17, 24 January 2023 (UTC)
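
A minimal sketch of how the ordered replacements listed above might be applied to one subpage's wikitext follows. It is an illustration under stated assumptions, not the script that was run: the function name `convert` is hypothetical, the [section] and [country] placeholders are treated as "anything up to the end of the line", and trailing linebreaks are left untouched except in the two final deletions, because the plain-text listing above does not show which patterns were meant to include them.

```python
import re

HEADER = "{{World Factbook 2004 table/header}}"

# The two table-opening variants named in the request.
OLD_HEADERS = [
    '{|cellspacing="0" cellpadding="6" border="0" width="90%" align="center"|',
    '{|width="90%" border="0" cellspacing="0" cellpadding="6" align="center"|',
]

# Two consecutive header cells; the captures stand in for [section] and [country].
SECTION_ROW = re.compile(
    r'!align="left" valign="middle" width="20%" height="31" style="background:#CCCCCC;"\|(.*)\n'
    r'!align="left" valign="middle" width="80%" height="31" style="background:#CCCCCC;"\|(.*)'
)

# {{anchor|[section]}} with any single unnamed parameter.
ANCHOR = re.compile(r"\{\{anchor\|[^{}|]*\}\}")


def convert(text: str) -> str:
    """Apply the requested replacements in the order they were listed."""
    for old in OLD_HEADERS:
        text = text.replace(old, HEADER)
    text = SECTION_ROW.sub(r"{{World Factbook 2004 table/section|\1|\2}}", text)
    text = text.replace('! width="20%" align="right" valign="top"|', "!|")
    text = text.replace('| width="80%" align="left"  valign="top"|', "||")
    text = ANCHOR.sub("", text)
    # Merge adjacent tables: drop a table end that is directly followed
    # (with or without one blank line) by a new table header.
    text = text.replace("|}\n" + HEADER, "")
    text = text.replace("|}\n\n" + HEADER, "")
    return text
```

Running this on a copy of one subpage and diffing the result against a manually converted page would be a reasonable check before any bulk run.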

Please convert any straight quotes to curly quotes. It seems Vol 1 (although I haven't checked through all of it) and much of Vol 2 have curly quotes, except near the end of Vol 2, the consistency is lost. Of course, if you feel like doing any other cleaning up (spaces around semicolons or hyphens, or missed nops), feel free, although I have hopefully caught most of them (again, mainly near the end of V2). Thanks, TeysaKarlov (talk) 21:54, 4 February 2023 (UTC)

done. Mpaa (talk) 17:48, 12 February 2023 (UTC)
@Mpaa Thanks! If I might ask one more thing, is there an issue with my Historic Highways request on the Scan Lab, or did it simply go unnoticed because I foolishly forgot to ping the Scan Lab? Thanks again, TeysaKarlov (talk) 21:18, 19 February 2023 (UTC)

Only proofread text in Index:The Algebra of Mohammed Ben Musa (1831).djvu —Preceding unsigned comment added by 82.167.153.5 (talk) 04:40, 23 March 2023 (UTC)

  Not done: unsure what is even being asked here. — billinghurst sDrewth 06:14, 11 April 2023 (UTC)

As discussed here, I'm working on updating some of the heading templates. These are uses of {{chapter heading}} in the Page namespace, with no additional parameters, where I've already set up the index to use {{pseudoheading}} and maintain the same styles: https://petscan.wmflabs.org/?psid=24006062. I would really appreciate it if, in the pages listed in the petscan, {{chapter heading| and {{ch| could be replaced with {{pseudoheading|. Thanks! —CalendulaAsteraceae (talkcontribs) 05:36, 7 March 2023 (UTC)
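
The substitution requested here could be sketched as a single regular-expression replacement, shown below. The function name is hypothetical, and the sketch assumes the page list from the PetScan query has already restricted the run to uses with no additional parameters; it does not check that itself.

```python
import re

# Matches the opening of {{chapter heading|...}} or {{ch|...}}; only the first
# letter of the template name is treated as case-insensitive.
CHAPTER_HEADING = re.compile(r"\{\{\s*(?:[Cc]hapter heading|[Cc]h)\s*\|")


def to_pseudoheading(wikitext: str) -> str:
    """Swap the template name, leaving the parameters as they are."""
    return CHAPTER_HEADING.sub("{{pseudoheading|", wikitext)
```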

@CalendulaAsteraceae: Not certain why we are doing any of this? template:chapter heading is not something that the community has had any consensus to do, and I would propose that it is out of sync with WS:Style guide and the presentation of text and the use of display layers, and people's ability to have their own CSS as they require. Text is king. — billinghurst sDrewth 21:34, 7 March 2023 (UTC)
@Billinghurst, I'm having trouble parsing this. I understand that you're raising concerns about my approach to template creation and/or use, and I would like to understand your concerns better so I can respond appropriately. In particular, is there a verb missing from the beginning of "template:chapter heading is not something that the community has had any consensus to do"?
My understanding had been that templates like {{pseudoheading}}, which are designed for use with index CSS, are essentially meant to be an alternative to layering formatting templates à la {{center|{{larger|Chapter 1}}}} in cases where there's consistent formatting throughout a work. Personally, when I'm proofreading a work, I prefer index styles when there's repeated formatting throughout a work and I might change my mind partway through about what that formatting should be (like if I notice that those smaller all-caps lines were actually regular-sized all-small-caps). Is this understanding missing something?
I'm not clear on what "display layers" means in this context, nor on what you're saying interferes with people's ability to have their own CSS. I wouldn't expect inline styles vs. CSS to make much of a difference in that regard, so again, I'm probably missing something here. (And that something may be experience customizing websites' CSS beyond making minor improvements with Stylus.) —CalendulaAsteraceae (talkcontribs) 08:08, 8 March 2023 (UTC)

Multiple missing images and interleaved blank pages

The body of "The Pictorial Flora; or British Botany Delineated" comprises alternating images and blank pages; all unnumbered.

Please can someone use a bot to set each image page to match file page 9 (with the page number in the template incremented appropriately) and the interleaved blank pages to match file page 10?

If you have a tool to build the page index to match, so much the better. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:08, 7 March 2023 (UTC)

@Pigsonthewing: Three helpers
  1. use the script MediaWiki:Gadget-Without text.js <=without text (script enables clear and save of Page: ns pages)
  2. we can write you a script for your sidebar to click to add {{raw image|{{subst:FULLPAGENAME}}}}
  3. For the Index: page there is also the available "even" and "odd" syntax that enables one to mark the pages as you require, see mul:Wikisource:ProofreadPage#The <pagelist/> tag.
Noting that we can just mark the empty pages as empty, and have no need to do anything with them, nor mark them without text, and this way any transclusion will just ignore them as they don't exist. So skipped 1) and done 3). For 2) I await your guidance. Personally, I would not do anything and just add them when the images are ready. — billinghurst sDrewth 21:22, 7 March 2023 (UTC)
Thank you. Every day's a school-day. When I asked previously about bulk marking pages as without text, I was referred here. My wish was to not have to visit every such page, even script assisted, to carry out a repetitive task. "=empty" is useful, but it's a pity that it renders bogus page numbers on the index page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:57, 8 March 2023 (UTC)
I tried that in The Pictorial Flora; or British Botany Delineated/001-132, but there are red links between several of the pages (but not the first few). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:06, 8 March 2023 (UTC)
Index namespace is a work area, and the use of pagelist is to push the components through to transcluded namespace. It is not designed to be sexy, it is designed to show the workings. If you think that it needs improving for the page numbers marked empty, that is a job for phabricator: and the ProofreadPage extension. — billinghurst sDrewth 06:19, 11 April 2023 (UTC)

Inserting {{default layout}} for the main namespace pages I created

I am requesting to add the template {{default layout|Layout 4}} to each main namespace page I created. I extracted a list of 10,771 unique titles from the main namespace list of contributions I created, excluding redirects and other anomalies. Is this file useful for this task? — ineuw (talk) 07:52, 13 May 2023 (UTC)

I oppose this change. The default layout should be set based on what the work requires, not who created the page. It's theoretically possible that all the mainspace pages you created are best served by a single given layout, but it seems very improbable. In other words, this request appears to in effect be more about what your personal layout preference is than what the texts and our readers require. Xover (talk) 08:07, 13 May 2023 (UTC)

Hello,

At the moment, the quote styles in this work are quite inconsistent, but I am not sure if it is easier to change to straight or curly. If it isn't too difficult, please convert to a consistent style of your choice. However, if the number of required changes is going to make the task quite difficult, I am okay with leaving things as is.

Thanks, TeysaKarlov (talk) 00:39, 28 May 2023 (UTC)

I'm not a bot operator, but FWIW it's a lot easier to change to consistently use straight quotes. —CalendulaAsteraceae (talkcontribs) 04:48, 28 May 2023 (UTC)
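
To illustrate why the curly-to-straight direction is the simpler one: it is a fixed character-for-character mapping with no need to work out whether a quote opens or closes. A minimal sketch, with a hypothetical function name, might be:

```python
# Map each curly quote character to its straight equivalent.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark, also used as apostrophe
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones; all other characters are kept."""
    return text.translate(CURLY_TO_STRAIGHT)
```

Going the other way requires deciding, for every straight quote, whether it opens or closes a quotation or is an apostrophe, which is where most of the manual checking comes from.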