Wikisource:Community collaboration/Monthly Challenge/How it works

Monthly Challenge How it works
Description of how the Monthly Challenge infrastructure works

For information on how to administer the challenges without needing to know how the system works, see Administration.

The Monthly Challenge is largely automated and works with a set of templates, Lua modules and bots. The infrastructure may seem daunting but most parts of it are relatively simple in isolation.

Data

Challenge works

The data for the challenges is stored in Lua data tables, arranged on a month-by-month basis. For example, August 2021 challenges draw their works from Module:Monthly Challenge/data/2021-08. Works can also be drawn from prior years' data tables when they roll over a year end (for example, when they are added in December but are not completed by the end of the year).

The data structure for these tables can be seen at Module:Monthly Challenge/data/2021-08.
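The month-by-month naming can be illustrated with a short sketch. The module prefix is taken from the example above; the function names, and the idea of looking back a fixed number of months to catch works that rolled over a year end, are assumptions made for illustration and not a description of how the real modules resolve their data.

```python
from datetime import date
from typing import List

# Prefix taken from the example title "Module:Monthly Challenge/data/2021-08".
DATA_PREFIX = "Module:Monthly Challenge/data/"


def data_module_title(year: int, month: int) -> str:
    """Return the title of the works data table for one month."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be 1..12, got {month}")
    return f"{DATA_PREFIX}{year:04d}-{month:02d}"


def candidate_data_modules(today: date, months_back: int = 1) -> List[str]:
    """Titles for the current month and some preceding months (hypothetical helper).

    Stepping backwards month by month crosses year ends naturally,
    so January looks back to the previous December.
    """
    titles = []
    year, month = today.year, today.month
    for _ in range(months_back + 1):
        titles.append(data_module_title(year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return titles
```

For instance, `candidate_data_modules(date(2022, 1, 15))` yields the January 2022 table followed by the December 2021 table.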

This table is maintained entirely by hand. Works are proposed for inclusion at Nominations.

Statistics

There is no built-in way to track statistics within the constraints of the MediaWiki software that Wikisource uses. Because of this, a program runs externally and updates statistics on-wiki in the form of Lua tables. This raw data is then used by templates and Lua modules to present the data to users.

The current proofreading progress of each Index is written to modules like Module:Monthly Challenge category stats/data/2021-05.

Daily statistics are written to modules like Module:Monthly Challenge daily stats/data/2021-05.

These statistical tables should not be edited manually, as any changes will be overwritten the next time the stats update.
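To give a feel for how an external program can hand data to Lua modules, here is a minimal sketch that turns a nested Python dictionary into the source text of a Lua data module. The function names and the shape of the example data are hypothetical; the real stats program and its table layout may differ.

```python
def to_lua(value, indent=0):
    """Serialise ints, floats, bools, strings and nested dicts as a Lua literal."""
    pad = "\t" * indent
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return f'"{escaped}"'
    if isinstance(value, dict):
        inner = "\t" * (indent + 1)
        lines = [
            f"{inner}[{to_lua(key)}] = {to_lua(item, indent + 1)},"
            for key, item in value.items()
        ]
        return "{\n" + "\n".join(lines) + "\n" + pad + "}"
    raise TypeError(f"cannot serialise {type(value).__name__}")


def lua_module_source(data):
    """Wrap the table in a `return` statement so the page works as a data module."""
    return "return " + to_lua(data) + "\n"
```

The resulting text would then be saved to the relevant module page by the bot; that saving step is outside the scope of this sketch.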

Templates

Templates provide most of the "UI" of the Monthly Challenge. A complete list of the templates can be found at Category:Monthly Challenge templates.

Logic templates

  • {{Monthly Challenge listing}}: one of the core templates, used on every monthly overview page (for example: May 2021). It takes the data from the works data table and analyses each index relevant to the current month's challenge. It then arranges the works into sections based on work length and proofreading status, and outputs "tiles" for each work.
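The grouping step described for the listing can be pictured roughly as follows. This is only a sketch: the page-count threshold, the section names and the field names (`index`, `total`, `proofread`, `validated`) are all hypothetical, and the real logic lives in Lua on the wiki, not in Python.

```python
from collections import defaultdict

# Hypothetical cut-off: works with at most this many pages count as "short".
SHORT_WORK_MAX_PAGES = 100


def classify(work):
    """Return a (status, length) pair naming the section a work belongs to."""
    length = "short" if work["total"] <= SHORT_WORK_MAX_PAGES else "long"
    if work["validated"] >= work["total"]:
        status = "validated"
    elif work["proofread"] + work["validated"] >= work["total"]:
        status = "to validate"
    else:
        status = "to proofread"
    return status, length


def arrange(works):
    """Group index names by section, keeping the input order within each section."""
    sections = defaultdict(list)
    for work in works:
        sections[classify(work)].append(work["index"])
    return dict(sections)
```

Each section's list would then be rendered as a row of tiles.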

There are some templates that provide more fine-grained views into the Monthly Challenge data:

UI templates

These templates provide parts of the UI for the Monthly Challenge.

Bots

Some parts of the Monthly Challenge cannot be achieved using only what MediaWiki provides through templates and modules. These parts are therefore done by scripts and bots outside the MediaWiki framework.

Statistics

Current proofread progress stats and per-challenge daily stats are provided by an external program that analyses the relevant revisions in the Wikisource database, computes the number of pages in each status and writes the results to the data tables described above.

The statistics are gathered by analyzing the edits made to each page for each index in the month's category. There are some limitations and qualifications to how this data can be used:

  • The total number of pages (a figure that includes pages that don't exist yet) for an index is based on the data stored in the pr_index database table at the time the script is run. This count is used for all the stats, so if the file changes during the month, days before the change will also use the new count.
  • Edits will only be considered if they are made during the month. If a page is moved into a challenge work partway through, only edits made to that page since the start of the month count towards daily stats. The overall per-index stats will include the new statuses.
  • Works are included based on current category membership. If an index is removed from the category during the month, stats for previous days will be removed at the next update. If a work is added to the challenge, all edits made to its pages (during the month so far) will be included in daily figures from the start of the month.
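The month and category rules above can be sketched as a pair of filters applied before counting. The revision record layout (`index`, `page`, `timestamp`, `status`) and the choice to count one status change per revision are assumptions for illustration; the actual program may count differently.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_counts(revisions, current_indexes, year, month):
    """Count status-setting edits per day of the month, per status.

    `revisions` is a list of dicts with timezone-aware `timestamp` values.
    `current_indexes` is the set of indexes in the month's category right now,
    so an index removed mid-month drops out of every day's figures.
    """
    counts = {}
    for rev in revisions:
        ts = rev["timestamp"].astimezone(timezone.utc)
        if (ts.year, ts.month) != (year, month):
            continue  # only edits made during the month are considered
        if rev["index"] not in current_indexes:
            continue  # inclusion follows current category membership
        counts.setdefault(ts.day, Counter())[rev["status"]] += 1
    return counts
```

Because membership is checked at run time, adding an index later in the month makes its earlier edits from that month appear in the daily figures at the next update.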

The current proofread status is determined using the same list of revisions that the daily stats use (as opposed to the q0-4 fields of the pr_index table). There should be no difference in these figures, but since the database table requires the index to be purged to be up to date, there could be minor and temporary differences.
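Deriving the current status from a revision list amounts to taking the latest revision of each page. A minimal sketch, assuming each revision record carries `page`, `timestamp` and `status` fields, and that the total page count is supplied separately (from pr_index, as described above); the `"missing"` label for pages that don't exist yet is a name chosen here.

```python
from collections import Counter


def current_status_counts(revisions, total_pages):
    """Count pages per current status; pages with no revisions are 'missing'."""
    latest = {}
    # Process oldest first so later revisions overwrite earlier ones.
    for rev in sorted(revisions, key=lambda r: r["timestamp"]):
        latest[rev["page"]] = rev["status"]
    counts = Counter(latest.values())
    counts["missing"] = total_pages - len(latest)
    return counts
```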

Statistics are generated according to server time, which is UTC. Thus, if you want to sneak in last-minute edits and you live in California, you only have until about 5pm your time (4pm outside daylight saving time) before you miss the cut-off!
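To work out the cut-off for your own location, convert the first instant of the following month in UTC to your local offset. The sketch below assumes the cut-off falls exactly at 00:00 UTC on the first day of the next month, and it takes a fixed UTC offset in hours, not a named time zone; both function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def month_cutoff_utc(year, month):
    """First instant after the given month, in UTC (assumed cut-off)."""
    if month == 12:
        return datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(year, month + 1, 1, tzinfo=timezone.utc)


def cutoff_in_offset(year, month, utc_offset_hours):
    """The same cut-off expressed at a fixed offset from UTC."""
    local = timezone(timedelta(hours=utc_offset_hours))
    return month_cutoff_utc(year, month).astimezone(local)


# California in May observes daylight saving time (UTC-7).
print(cutoff_in_offset(2021, 5, -7).isoformat())  # 2021-05-31T17:00:00-07:00
```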