Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal


Article

Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal

Marc Miquel-Ribé 1,*, Cristian Consonni 2 and David Laniado 2

Citation: Miquel-Ribé, M.; Consonni, C.; Laniado, D. Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal. Sustainability 2022, 14, 4705. https://doi.org/10.3390/su14084705

Academic Editors: Rosta Farzan, Amy Babay and Claudia López

Received: 20 February 2022
Accepted: 11 April 2022
Published: 14 April 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1Tecnocampus, Universitat Pompeu Fabra, 08302 Mataró, Catalonia, Spain
2Big Data & Data Science Unit, Eurecat-Centre Tecnològic de Catalunya, 08005 Barcelona, Catalonia, Spain; cristian.consonni@eurecat.org (C.C.); david.laniado@eurecat.org (D.L.)
* Correspondence: mmiquelr@tecnocampus.cat

Abstract: Wikipedia is an undeniably successful project, with unprecedented numbers of online volunteer contributors. After 2007, researchers started to observe that the number of active editors for the largest Wikipedias declined after rapid initial growth. Years after those announcements, researchers and community activists still need to understand how to measure community health. In this paper, we study patterns of growth, decline and stagnation, and we propose the creation of 6 sets of language-independent indicators that we call “Vital Signs”. Three focus on the general population of active editors creating content: retention, stability, and balance; the other three are related to specific community functions: specialists, administrators, and global community participation. We borrow the analogy from the medical field, as these indicators represent a first step in defining the health status of a community; they can constitute a valuable reference point to foresee and prevent future risks. We present our analysis for eight Wikipedia language editions, and we show that communities are renewing their productive force even with stagnating absolute numbers; we observe a general lack of renewal in positions related to special functions or administratorship. Finally, we evaluate our framework by discussing these indicators with Wikimedia affiliates to support them in promoting the necessary changes to grow the communities.

Keywords: Wikipedia; online communities; editor engagement; growth; renewal; decline; stagnation; indicators


1. Introduction
Wikipedia has reached its second decade of life, and it is one of the largest multilingual and collaborative free-knowledge repositories in the world. The project is undeniably successful along many dimensions: from its popularity and geographical spread to its adoption even among professionals and academia. Its online community has been the object of amazement and debate in equal parts.

Since its foundation in 2001, Wikipedia’s model has become referential for many other online communities. According to [1], online communities are shaped by the social interactions, the policies that guide them, and the design of the software they are using. Some projects can give rise to strong cultural norms that guide community members to create valuable outputs, be they an article in an encyclopedia or free software.

The sustainability of a project is often presented as a problem of growth at its initial stages since it demands heterogeneous contributors [2,3], with varying levels of resources and interests [4]. Wikipedia’s use of wiki technology lowered the barrier to participating in its online community, allowing anyone to register and modify articles anonymously. Wikipedia attracted many early volunteer editors thanks to its broad scope to make the sum of human knowledge freely accessible and shareable, an ethos that resonated with the hacking culture of the early years of the World Wide Web.

In 2005, along with growing popularity on the Internet, the number of registered editors on Wikipedia rapidly increased until reaching a critical mass of participants in several languages, including English, French, Italian, Japanese, German, and Polish. Nonetheless, in 2007, the English Wikipedia peaked in the number of active editors (51,000) and started declining (e.g., 30,000 in 2014). Academic studies presented the overall decline in active editors on the English Wikipedia as a consequence of the trade-off between managing content quality and sustaining massive participation, which led to a more closed system, calcified against changes (especially those proposed by newcomers) in the form of policies, among other aspects [5–7].

In the face of those who predicted its end, Wikipedia has become more valuable over time. Its content has become the information backbone of the Internet. In addition, Wikipedia is fundamental for search engines that use it to improve their results and display articles from the encyclopedia prominently, thus sending readers to Wikipedia [8].

The success of Wikipedia led to the creation of other projects promoting other forms of free-and-open knowledge, called “sister projects” (Wikipedia contributors, ’Wikipedia: Wikimedia sister projects’, Wikipedia, The Free Encyclopedia, 6 February 2022, 11:55 UTC, https://en.wikipedia.org/w/index.php?title=Wikipedia:Wikimedia_sister_projects&oldid=1070232453 [accessed 14 February 2022]): from Wiktionary, an online dictionary and thesaurus, to Wikidata, a collaborative and free knowledge base. This ensemble of projects and its participants, collectively known as the Wikimedia movement, has spread worldwide. Several non-profit entities promoting Wikipedia and free knowledge have appeared (Meta contributors, ’Wikimedia movement affiliates’, Meta, 16 November 2021, 20:41 UTC, https://meta.wikimedia.org/w/index.php?title=Wikimedia_movement_affiliates&oldid=22343786 [accessed 14 February 2022]). The first of these organizations to be founded was the Wikimedia Foundation, based in the United States, which also hosts and runs Wikipedia’s servers. These local and thematic organizations are collectively called “Wikimedia affiliates”; they promote the Wikimedia projects by organizing activities and events and by providing an interface between the Wikimedia movement and public institutions and the media.

Wikipedia proved that an online community could rally around a peer production project to create and grow content. However, questions around the health of the Wikipedia community at large remain open. For example, several community members advocate for more inclusive spaces and aim for a more diverse representation of editors and content [9]. We wonder what could help the movement to regulate itself, becoming aware of the state of growth or decline of the active Wikipedia language communities, and lowering the barriers to participation without losing the social structure that allows keeping quality standards.

We argue that we need a new set of metrics that can give a more nuanced picture of the health of a community. The picture for each community is more complex than what can be captured only by the number of active editors, and each project has a different history. For example, large language editions like English and German have experienced a decline and stagnation in the absolute number of monthly active editors, but the number of newly registered users every month remains stable, showing that people remain interested in becoming Wikipedians (Wikimedia Statistics-English Wikipedia-New registered users, https://stats.wikimedia.org/#/en.wikipedia.org/contributing/new-registered-users [accessed 16 February 2022]). Less established language editions like Arabic are instead still experiencing a growth phase, but other aspects related to the stability of the community may be of concern and can forecast upcoming problems. Furthermore, some editors cover special functions that are fundamental for the communities, such as administratorship or technical editing, and may deserve special attention.

Thus, we propose three objectives to study the sustainability of the active Wikipedia language communities:

  • Objective 1 [O1]: assess the growth, stagnation, and decline patterns in the history of Wikipedia language communities.
  • Objective 2 [O2]: design a set of indicators to capture the degree of growth and renewal within communities.
  • Objective 3 [O3]: validate the indicators with results from a sample of language communities and explore their potential role in affiliate planning.

To date, there is no comprehensive study on community growth and sustainability aimed at understanding the Wikipedia communities while accounting for their linguistic diversity. Visual tools have been developed for exposing different aspects of social interactions in Wikipedia, including edit histories [10], controversies [11], language gaps [12], and cultural diversity [13]; however, we find a lack of tools to describe the state and health of a community, beyond the basic statistics available on the Wikimedia servers (Wikimedia Statistics, https://stats.wikimedia.org/ [accessed 24 March 2022]) and tools dedicated to monitoring participation in specific events (Event Metrics, https://eventmetrics.wmflabs.org/ [accessed 24 March 2022]) or outreach programs (Programs and Events Dashboard, https://outreachdashboard.wmflabs.org/ [accessed 24 March 2022]).

In this paper, we present a comprehensive framework to look at the evolution and current state of the active editor communities through a list of language-independent indicators we call “Vital Signs.” These indicators are computed from the digital traces left by editors on Wikipedia, and presented in user-friendly visualizations. Furthermore, we validated their usefulness, by discussing them with community members in various Wikimedia conferences, and we set target values aimed at guiding the implementation of projects that can help improve community vitality and sustainability.

Our main contributions in this paper are that:

  • We study the evolution of the number of editors over time in the 50 major language editions, and identify groups of language editions following different patterns of growth, stagnation and decline;
  • We propose a set of indicators (Vital Signs) to assess and monitor community health, based on previous literature and community practices;
  • We present the results obtained for a set of eight selected language editions, for which we received feedback from Wikimedia affiliates;
  • We describe the iterative validation process based on holding focus group sessions after presentation in four Wikimedia conferences and two dedicated meetings with Wikimedia affiliates;
  • We discuss implications for the Wikimedia movement and how targets based on the Vital Signs could be adopted by affiliates and integrated into their annual planning.

Figure 1 illustrates the different phases of the methodology followed in this study, in order to fulfill the three objectives introduced above. In the first phase, “Exploration”, on the one hand we study growth, stagnation and decline patterns with data from the 50 largest communities, and on the other hand we review previous literature and community practices to identify key aspects to be measured, monitored, and incorporated into our framework. In the second phase, “Design and implementation”, we define the Vital Signs as a set of indicators to monitor community health, and show the results on a sample of language editions. In the third phase, “Validation”, we assess the proposed framework and results with focus groups and feedback from Wikimedia affiliates and communities. The approach is described in more detail in Section 3.

Figure 1. Diagram representing the phases of the work presented in this paper, with the research objectives.

The paper is organized as follows: in Section 2, we present the background of this study by diving into the different aspects of the Wikipedia language communities that we measured. In Section 3, we present the approach and define how we measure the growth and renewal of the communities and how we set up the validation for the final indicators. In Section 4, we present the results of the measurements for a set of eight languages and the feedback provided by the language communities and affiliates. In Section 5, we discuss the main results and potential uses of the indicators, and state the limitations of the study. Finally, in Section 6, we draw some conclusions.

2. Background
2.1. Communities Decline and Stagnation

The number of contributors to the English Wikipedia grew until March 2007, when it began to rapidly decline until 2014 [14,15]. Other language editions like German, Spanish, Italian or French stopped their growth between 2007 and 2011. Hill and Shaw [16] found that other active peer production communities (open-source software commons) often experience a period of rapid growth followed by stabilization.

Stagnation in the number of active editors in the English Wikipedia has been a common topic in the general and specialized press for more than a decade. It has been taken for granted as much as Wikipedia’s value to society. Even though it appears that Wikipedia communities have a delicate but basically sound constitution, we do not know the risks involved in stagnation.

In this Section, we want to understand how past research explains the factors related to this decline in the number of active editors. Ultimately, our first objective is to describe the state of the active community for all languages, whether it is growing, stagnant, or declining.

2.1.1. Bureaucracy and Openness

One desirable explanation for why Wikipedia communities peak and decline would be that there comes a time when most of the current knowledge has already been documented in the form of articles and structured data. However, we argue that this is not the case, given the diverse coverage of topics and the size of the different Wikimedia projects with a stagnant community. On the contrary, the breadth of accepted topics might provide a reason for editors to stay. Asatani et al. [17] suggest that communities allowing free topics do not tend to collapse (i.e., to lose their users suddenly).

While topic availability or exhaustion can hardly explain this decline, observed for encyclopedias of very different sizes, we might think that the number of potential contributors in a society plays a role in community growth. Even though this might be true for small languages, we see that some large linguistic communities like German or French exhibit the same stagnation pattern. Therefore, we are inclined to think that, except in the case of a political ban of the site or censorship, the main reasons that can explain community stagnation or decline are not external but due to internal dynamics.

Following this argumentation, several researchers [5,7] investigated the relationship between the bureaucratic and technical structure of Wikipedia and the decrease in the number of new editors. They found that the policies and algorithmic tools that serve as quality control mechanisms often reject the contributions of newcomers, with a lower retention rate being an indirect effect. Gorbatai [18] found that novice contributors might have a negative direct impact on quality, but their participation also motivates experts to contribute and increase the quality of the good, thus mediating the relationship between them and the final outcome.

Hill and Shaw [16] explored the trade-off between quality and openness on other collaborative and peer production projects by studying 740 wikis hosted by Fandom/Wikia over the first five years of each wiki’s history. They observed similar lifecycle dynamics to the English Wikipedia and the decline of its contributor base. They concluded that peer production projects’ decline is mainly a function of existing members turning away newcomers through an evolution of the ecosystem, which becomes more closed to collaboration.

Bureaucratic closure provides a way to safeguard the goods that communities have built. While Wikipedia advocates for openness and flexibility, the necessary bureaucracy to maintain it is multi-faceted in policies, roles, and decision-making processes usually based on the consensus of those editors who follow the dedicated spaces within the wiki [19]. Shaw and Hill [20] studied 683 other wikis and concluded that peer production projects do not function like “laboratories of democracy” but tend to the “iron law of oligarchy”, which states that organizations become large and complex, and a small group of early members consolidate and exercise a monopoly of power within the organization. On the contrary, Konieczny [21] studied the decision-making processes of Wikipedia and concluded that many factors are preventing or slowing the development of an oligarchy on Wikipedia. Rijshouwer et al. [22] saw that while power concentration and bureaucratization are apparent outcomes, various cases show that general community members appear to be concerned about it and contest it. Whether power is concentrated in a few hands or more distributed, the current content quality management and bureaucratization have an adverse effect on welcoming newcomers, engaging younger generations in the project, and including peripheral editors. Assuming this trade-off, we observe a lack of a conscious and explicit assessment of the balance between the two goals in the community decision-making processes.

2.1.2. Barriers to Entry

The growth in the number of policies, guidelines, and documentation has been reported by several studies [6,19,23]. This increased complexity is considered a cost with a negative impact on production [14], but other factors related to the required technical skills or the lack of usability also matter. Literacy in technical design is a long-studied issue since some reports consider that the usability of the MediaWiki technology has significant room for improvement.

For instance, in the field of education, Raitman et al. [24] used a wiki and concluded that it had a poor interface and was cluttered. In higher education, Ebner et al. [25] experimented with the use of wikis to engage students. Even though many pedagogical factors influenced the students’ performance, the study concluded that a wiki was not a proper tool for assignments. Among the answers, bad usability appears as one of the barriers and therefore a potential reason for students’ low level of editing.

In 2012, a MediaWiki extension named VisualEditor (MediaWiki contributors, ’Extension:VisualEditor’, MediaWiki, 7 February 2022, 12:23 UTC, https://www.mediawiki.org/w/index.php?title=Extension:VisualEditor&oldid=5058123 [accessed 19 February 2022]) was released to provide a “What You See Is What You Get” (WYSIWYG) editor, which allows editing the same way as writing in a word processor, visualizing the results immediately. Even though the VisualEditor allows the user to edit without necessarily having to deal with the wikitext code, it does not prevent editors from directly accessing and modifying the wiki markup code if they prefer. Nonetheless, to date, in many Wikimedia projects including English Wikipedia, the VisualEditor is not set as the default editor and needs to be manually set as a personal preference (Wikipedia contributors, ’Wikipedia:VisualEditor’, Wikipedia, The Free Encyclopedia, 28 January 2022, 22:22 UTC, https://en.wikipedia.org/w/index.php?title=Wikipedia:VisualEditor&oldid=1068523898 [accessed 19 February 2022]); thus, its existence may go unnoticed by newcomers.

More recent studies such as Gluza et al. [26] assessed user experience in editathons by using participatory observation and found that participants experience moments of frustration, especially concerning the usability of the editor and when navigating a complex bureaucracy of policies and procedures. This was also related to the gender gap, as Hargittai and Shaw [27] explained that the lack of women in the community is worsened by a technical skills gap. This means that people’s background and knowledge prior to learning about Wikipedia literacy are critical factors that determine whether they will be able to succeed in this process, or whether frustration will lead them to abandon it.

Cowan [28] states that feeling judged by other peers in a wiki creates concerns for new editors, who may be anxious about the quality of their contributions, in particular about their accuracy or validity, in front of the entire community or even readers. Both feeling judged and having one’s edits removed (reverted) are powerfully demotivating; in particular, new editors are reverted by much more experienced editors [29]. The opposite is also true; community spaces like the “Teahouse” in the English Wikipedia and their equivalents in other languages serve as a place where newcomers can ask more experienced editors questions on any topic, from the process of contributing content to the use of their personal User Pages, and may help to encourage new editors’ participation [30].

In an experiment, Morgan and Halfaker [31] found that new editors invited to the Teahouse are retained at a higher rate than editors who do not receive an invitation, even when controlling for the newcomer’s activity and survival. The Wikimedia Foundation has been working with different teams over the past years (2014 Growth (MediaWiki contributors, ’Growth/Growth 2014’, MediaWiki, 22 August 2021, 09:43 UTC, https://www.mediawiki.org/w/index.php?title=Growth/Growth_2014&oldid=4775153 [accessed 19 February 2022]); 2017–2018 New Editor Experiences (MediaWiki contributors, ’New Editor Experiences’, MediaWiki, 25 January 2022, 14:02 UTC, https://www.mediawiki.org/w/index.php?title=New_Editor_Experiences&oldid=5036441 [accessed 19 February 2022]); 2018–2022 Growth Team (MediaWiki contributors, ’Growth’, MediaWiki, 1 February 2022, 18:44 UTC, https://www.mediawiki.org/w/index.php?title=Growth&oldid=5050955 [accessed 19 February 2022])) to understand the first days of an editor and personalize the newcomer experience. Currently, product teams like “Growth” aim to create new structured tasks to retain newcomers. These tasks usually provide a more guided experience aimed at learning how to edit, solving minor content issues, and requesting attention from a more experienced editor.

In summary, community decline and stagnation have been widely assumed to be an endemic problem of Wikipedia. In this subsection, we have explored the factors that, according to previous literature, are related to it and prevent the entrance of newcomers. Given the focus on the English Wikipedia and other major languages in previous studies, we aim to verify the state of the active communities for more language editions, in order to achieve a global view. For this reason, our Objective 1 [O1] is to assess the growth, stagnation, and decline patterns in the history of Wikipedia language communities.

2.2. Community Health and Renewal

As early as 2009, the Wikimedia movement started a strategy process, which included a task force dedicated to debating the state of what they called Community Health (Strategic Planning contributors, ’Task force/Community Health’, Strategic Planning, 15 August 2010, 17:45 UTC, https://strategy.wikimedia.org/w/index.php?title=Task_force/Community_Health&oldid=73740 [accessed 19 February 2022]). This new term was coined to discuss aspects related to burn-out and editing fatigue, as well as the potential impact of reverts and community norms on the recently detected decline in the number of contributors.

Studying the state of the community through the lens of “health” allowed looking for solutions in order to apply them as treatments. Projects like the Community Health Initiatives Metrics Kit (Meta contributors, ’Community health initiative/Metrics kit’, Meta, 1 January 2022, 07:46 UTC, https://meta.wikimedia.org/w/index.php?title=Community_health_initiative/Metrics_kit&oldid=22521662 [accessed 19 February 2022]), led by the Wikimedia Foundation Trust and Safety team, attempted to design a set of metrics (2016–2017), although they were never implemented. Later, the work was redirected towards the development of anti-harassment tools (Meta contributors, ’Anti Harassment Program’, Meta, discussion about Wikimedia projects, 16 February 2022, 17:40 UTC, https://meta.wikimedia.org/w/index.php?title=Anti_Harassment_Program&oldid=22848016 [accessed 19 February 2022]) and the study of toxic language (Meta contributors, ’Research:Detox’, Meta, discussion about Wikimedia projects, 22 May 2020, 23:00 UTC, https://meta.wikimedia.org/w/index.php?title=Research:Detox&oldid=20109435 [accessed 19 February 2022]).

Community health has also become a strategic priority for the Wikimedia movement in the Wikimedia strategy process 2030 (Meta contributors, ’Strategy/Wikimedia movement/2018-20/Working Groups/Community Health’, Meta, discussion about Wikimedia projects, 21 June 2021, 16:57 UTC, https://meta.wikimedia.org/w/index.php?title=Strategy/Wikimedia_movement/2018-20/Working_Groups/Community_Health&oldid=21624042 [accessed 19 February 2022]). The Community Health Working Group held discussions for more than two years to create recommendations, including the elaboration of a Universal Code of Conduct, among other initiatives that would possibly improve editor diversity and retention. However, we believe that, while proposing solutions is essential, there is a need for clear measurements that allow assessing progress.

In this subsection, we explain the importance of measuring community growth as a matter of project sustainability and set the ground for designing a set of indicators that describe the state of the active community.

2.3. Indicators for Community Growth and Renewal

There are some implicit risks in not planning for the community to thrive. Measuring the state of the active community is important not only to understand its potential growth but also the risk of a sudden decline. Even when there is no growth and the community stagnates, there can still be community renewal. In fact, a lack of renewal would be worse, as it would mean that the end of the active editors’ lifecycle would result in the disintegration of the community.

The most valuable metric to understand the size of the community is the number of active editors, defined as those editors who made at least 5 edits in a month (MediaWiki contributors, ’Analytics/Metric definitions’, MediaWiki, 24 October 2021, 11:36 UTC, https://www.mediawiki.org/w/index.php?title=Analytics/Metric_definitions&oldid=4884329 [accessed 19 February 2022]). Very often, the community of active editors is regarded as the real community. However, there are no indicators to understand the state of renewal of the active community in general, or of those parts of the community which perform specific functions.
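To make the active-editor definition concrete, the following minimal Python sketch counts monthly active editors from a list of edits. The input layout (pairs of editor name and timestamp) and the function name `monthly_active_editors` are illustrative assumptions, not part of the original study or of any Wikimedia tool; only the threshold of 5 edits per month comes from the definition above.

```python
from collections import Counter, defaultdict
from datetime import datetime

ACTIVE_THRESHOLD = 5  # edits per calendar month, as in the definition above


def monthly_active_editors(edits, threshold=ACTIVE_THRESHOLD):
    """Map each 'YYYY-MM' month to the set of editors with >= threshold edits.

    `edits` is an iterable of (editor_name, timestamp) pairs; this layout is
    a hypothetical simplification of a wiki revision history.
    """
    counts = defaultdict(Counter)
    for editor, ts in edits:
        counts[ts.strftime("%Y-%m")][editor] += 1
    return {
        month: {e for e, n in per_editor.items() if n >= threshold}
        for month, per_editor in counts.items()
    }


# Toy data: "ana" makes 5 edits in January 2022, "ben" only 2.
sample_edits = [("ana", datetime(2022, 1, day)) for day in range(1, 6)]
sample_edits += [("ben", datetime(2022, 1, 10)), ("ben", datetime(2022, 1, 11))]

active = monthly_active_editors(sample_edits)
print(active)
```

In this toy data only "ana" reaches the threshold, so she is the only active editor for January 2022.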

Regarding growth and renewal for the entire community of active editors, we identified the need to measure three specific aspects: (a) retention of newcomers, (b) stability (monthly variation of editors), (c) balance between generations of editors. In addition, given that the development of Wikipedia requires other tasks than editing articles, we also consider it necessary to measure growth and renewal for specific functions or subparts of the community: (d) special functions (technical editors and coordinators), (e) administrators and other flags, (f) global participation in the movement spaces.

We briefly review the definition of these six aspects given by current research and within the Wikimedia Movement organizations and communities. In order to ensure the applicability of the indicators, we propose one research question for each of them.

2.3.1. Retention

Editor retention refers to the capacity of a Wikipedia language community to retain the new registered users (Meta contributors, ’Research:Editor retention’, Meta, discussion about Wikimedia projects, 14 January 2021, 08:15 UTC, https://meta.wikimedia.org/w/index.php?title=Research:Editor_retention&oldid=20958279 [accessed 19 February 2022]). It is a common measurement among open collaboration projects and platforms overall [32]. Sometimes it is also referred to as “editor survival.” In Wikimedia, it is often measured as the percentage of new editors who edit again after 60 days from the first edit [5] (Meta contributors, ’Research:Surviving new editor’, Meta, discussion about Wikimedia projects, 14 March 2018, 22:22 UTC, https://meta.wikimedia.org/w/index.php?title=Research:Surviving_new_editor&oldid=17835590 [accessed 19 February 2022]).
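As a rough illustration of the 60-day measurement mentioned above, the sketch below computes the percentage of editors who edit again at least 60 days after their first edit. This is a simplified reading under stated assumptions: the input format and the function name `retention_rate` are hypothetical, and the sketch does not bound the observation window, which the Wikimedia "surviving new editor" definition may handle differently.

```python
from datetime import datetime, timedelta


def retention_rate(edits, days=60):
    """Percentage of editors with an edit at least `days` after their first edit.

    `edits` is an iterable of (editor_name, timestamp) pairs (assumed layout).
    """
    first, last = {}, {}
    for editor, ts in edits:
        if editor not in first or ts < first[editor]:
            first[editor] = ts
        if editor not in last or ts > last[editor]:
            last[editor] = ts
    if not first:
        return 0.0
    window = timedelta(days=days)
    retained = sum(1 for e in first if last[e] - first[e] >= window)
    return 100.0 * retained / len(first)


# Toy cohort of four new editors.
sample_edits = [
    ("ana", datetime(2022, 1, 1)), ("ana", datetime(2022, 3, 15)),  # returns after 73 days
    ("ben", datetime(2022, 1, 5)), ("ben", datetime(2022, 1, 20)),  # returns after 15 days only
    ("cat", datetime(2022, 1, 10)),                                 # never returns
    ("dan", datetime(2022, 2, 1)), ("dan", datetime(2022, 4, 20)),  # returns after 78 days
]

rate = retention_rate(sample_edits)
print(rate)
```

Two of the four editors in the toy cohort come back after the 60-day mark, so the sketch reports a retention of 50%.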

Wikimedia affiliates like Wikimedia Hungary have specifically created programs to improve retention (Meta contributors, ’Grants:Project/WM HU/Editor retention program’, Meta, discussion about Wikimedia projects, 5 September 2019, 21:31 UTC, https://meta.wikimedia.org/w/index.php?title=Grants:Project/WM_HU/Editor_retention_program&oldid=19355430 [accessed 19 February 2022]). The Wikimedia Foundation has measures of community retention, but they are not available to Wikimedians in the form of a tool or a regularly updated table.

  • RQ1 [Retention] How do Wikipedia communities retain new members over time?

2.3.2. Stability

Community stability refers to the capacity of a Wikipedia language community to maintain a core of active editors over time. While retention increases the number of active editors, stability implies that a substantial part of the active community is the same on a monthly basis. This is especially relevant in order to sustain long-term activities, such as the development of Wikiprojects focused on specific topics.

Opposite to a stable community, a very volatile one changes the active editors every month. While this can be seen as a sign of renewal, we may also see the possibility of a sudden drop in participation and incapacity to sustain activities such as covering topics related to the news. Community stability is simply the aggregation of individual measurements of editor loyalty, i.e., their recurrence over time. This is related to user loyalty, commonly measured in many platforms and websites [33].

To date, no measurement of community stability exists in prior research or is available to Wikimedians.
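One possible way to quantify the idea that "a substantial part of the active community is the same on a monthly basis" is the share of each month's active editors who were also active in the previous month. The sketch below is only an illustrative assumption and not necessarily the stability indicator defined later in the paper; the function name and the input layout (a mapping from 'YYYY-MM' strings to sets of active editors, with no missing months) are hypothetical.

```python
def month_over_month_overlap(active_by_month):
    """For each month, the fraction of active editors also active the month before.

    `active_by_month` maps 'YYYY-MM' strings to sets of editor names and is
    assumed to contain consecutive months (no gaps).
    """
    months = sorted(active_by_month)  # 'YYYY-MM' strings sort chronologically
    overlap = {}
    for prev, cur in zip(months, months[1:]):
        current = active_by_month[cur]
        if current:  # skip months without active editors to avoid dividing by zero
            overlap[cur] = len(current & active_by_month[prev]) / len(current)
    return overlap


sample_active = {
    "2022-01": {"ana", "ben", "cat"},
    "2022-02": {"ana", "ben", "dan", "eve"},  # 2 of 4 were active in January
    "2022-03": {"fay", "gil"},                # complete turnover
}

stability = month_over_month_overlap(sample_active)
print(stability)
```

A value close to 1 would indicate a stable core, while a value close to 0, as in the third toy month, would indicate a highly volatile community.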

  • RQ2 [Stability] How stable is the composition of Wikipedia communities over time?

2.3.3. Balance

Community balance refers to the capacity of a Wikipedia language community to incorporate new members and to maintain the old ones to reach a balanced composition of different generations of editors. Unlike stability, balance is measured in terms of editor tenure: we want to see whether the community is able to preserve the experience of long-term engaged editors and at the same time acquire new editors’ freshness and energy. Balance falls between retention and stability, since the former measures the capacity to attract and retain newcomers, and the latter focuses on sustained engagement. It is likely that communities that are able to retain new editors and sustain the engagement of more experienced ones will present a balanced distribution of the different generations.

With balance, we expect to see the state of renewal of the active community and therefore detect excess in volatility, with potential risks of sudden drop-out, or lack of renewal, with the risks of a closed and not growing community. Therefore, it is the most valuable aspect to measure in terms of sustainability.
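The notion of a distribution of generations can be sketched as the share of currently active editors grouped by the year of their first edit. This grouping by calendar year, the function name `generation_shares`, and the input layout are assumptions made for illustration; the paper's own indicator may define generations differently.

```python
from collections import Counter


def generation_shares(active_editors, first_edit_year):
    """Share of active editors per generation (year of first edit).

    `active_editors` is a set of editor names; `first_edit_year` maps each
    editor name to the year of their first edit (assumed layout).
    """
    counts = Counter(first_edit_year[e] for e in active_editors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {year: n / total for year, n in sorted(counts.items())}


sample_active = {"ana", "ben", "cat", "dan"}
sample_first_year = {"ana": 2006, "ben": 2006, "cat": 2015, "dan": 2021}

shares = generation_shares(sample_active, sample_first_year)
print(shares)
```

In the toy data, half of the active editors belong to the 2006 generation; a community dominated by a single old generation would suggest a lack of renewal, while one dominated by the newest generation would suggest high volatility.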

  • RQ3 [Balance] How balanced are Wikipedia communities in terms of including new members over time and maintaining the old ones?

2.3.4. Special Functions

Community special functions refer to those non-editing activities which are essential to growing the Wikipedia language edition, such as those carried out by technical editors and coordinators. Technical editors are those who create the bots and the templates, and even make some changes to the platform. The bots are automated users that have become an essential taskforce to perform massive changes and to clean undesired vandalized content [34–36].

As coordinators we consider those editors who develop tasks that do not directly relate to an article but to an entire topic or the project as a whole. They are editors who participate in the community discussions, create the Wikiprojects, and organize the lists of articles to be created or design the course of action to vote on a policy change or organize an event. They would act in Portal pages, which serve as spaces for broad subjects, or in Wikipedia pages, which tend to be dedicated to discussions or information about Wikipedia itself (e.g., policies, conventions, manual of style, etc.).

  • RQ4 [Special functions] How are Wikipedia communities renewing their technical development (Techwizard) and project specialists?

2.3.5. Administrators

Community administrators or, more generally, “users having special rights” are editors who have been granted community flags to undertake some special functions. These flags allow special actions such as blocking an article or a user, deleting a page, and using certain automated tools. There are different types of flags that can be ordered by their level of rights [37], “sysop” or “administrator” being the best known among them (Wikipedia contributors, ’Wikipedia:Administrators’, Wikipedia, The Free Encyclopedia, 26 January 2022, 19:09 UTC, https://en.wikipedia.org/w/index.php?title=Wikipedia:Administrators&oldid=1068121199 [accessed 19 February 2022]).

Administrators assume these responsibilities as volunteers and undergo a community review process based on their previous contributions and attitudes [38]. In many cases, editors obtain different flags over time and transition to higher ones [37]. Having an administrator flag implies a certain level of responsibility and commitment to some content quality patrolling activities, given that inactivity is punished with the removal of the flag. Administrators have some duties, but at the same time, they have some influence in any community matter.

  • RQ5 [Administrators] How are the Wikipedia communities granting admin user rights (e.g., roles such as sysops) to new members?

2.3.6. Global Participation

Global participation refers to the capacity of a Wikipedia language community to engage in global conversations and spaces usually held in Meta-wiki as well as to attract members from other communities while having a strong local base of contributors. To date, there is not any measurement on the composition of Meta-wiki contributors. However, it is assumed that it is representative of the movement, given that all the strategy conversations which are held on Meta-wiki pages decide on aspects that are incumbent on the whole Wikimedia movement.

As far as cross-wiki editing is concerned, Hale [39] found that multilingual editors exist in all language editions, but smaller-sized editions with fewer users have a higher percentage of multilingual people. However, what is unknown is the extent of editors in each Wikipedia language edition who do not have that language edition as their main project. This is important as it relates to the stability of the language edition, given that multilingual editors may not have the same level of commitment and could stop editing that language edition at any moment.

  • RQ6 [Global - Meta-wiki] How are Wikipedia language communities participating in global projects spaces (Meta-wiki)?
  • RQ7 [Global - Local] What is the composition of Wikipedia language communities in terms of multilingual editors?

Focusing on these six aspects, we will measure growth and renewal for the editor community and its subcommunities covering specific functions and propose some specific indicators.

2.3.7. Roadmap

In summary, community decline and stagnation pose a significant risk for project sustainability. In this section, we have identified 6 aspects that capture the general and specific functions Wikipedia communities and subcommunities undertake to ensure their growth and renewal. Even though there have been several studies and Wikimedia Movement initiatives focusing on community health, neither the academic sphere nor the Wikimedia communities can anticipate the risks of a community decline. For this reason, our Objective [O2] is to design a set of indicators to capture the degree of growth and renewal within communities.

2.4. The Role of the Wikimedia Movement Affiliates

In this section, we want to give an overview of the Wikimedia Movement and how each actor relates to community growth and renewal. There are three different actors with different rights and capacities. The first is the communities of volunteers who edit the content. The second, the Wikimedia affiliates, are local organizations with some geographical or thematic scope that aim to support one or more Wikipedia language editions or Wikimedia projects. Wikimedia affiliates are composed of editors, whose engagement does not only imply editing, but also volunteering in creating local activities (e.g., workshops known as editathons, importing catalogues from museums, etc.). Finally, the third one, the Wikimedia Foundation, is responsible for product development and general support to the movement.

On the one hand, at a community level, we explained the difficulty of setting community growth and renewal as priorities for community members. Editors’ activities are centred on creating new content and improving the existing one when they are not fixing errors or vandalism. As seen in previous research, there is a trade-off between focusing on quality control and staying open. Low editor retention can be explained by the calcification of rules, but also by negative social interactions such as edit reverts and by the lack of usability in the system. Community members decide which interface changes to accept and which policies need to be modified. Since their focus is on content quality, they may not be aware of the low retention and the impact of not enabling changes.

On the other hand, at the Wikimedia Foundation level, there has been active research on editor engagement and retention for more than a decade. Different teams have proposed some interface changes and improvements that proved to raise retention but were never implemented at large scale. Some other tools and initiatives were specifically related to ensuring trust and safety and improving general community health. Likewise, at the level of the strategy of the Wikimedia movement, community health and growth have been widely discussed and set as goals and priorities on different occasions from 2009 until 2020, in the form of recommendations and principles (e.g., being people-centered (Meta contributors, ’Strategy/Wikimedia movement/2018-20/Recommendations/Movement Strategy Principles’, Meta, discussion about Wikimedia projects, 27 June 2021, 22:34 UTC, https://meta.wikimedia.org/w/index.php?title=Strategy/Wikimedia_movement/2018-20/Recommendations/Movement_Strategy_Principles&oldid=21653121 [accessed 19 February 2022])). Movement Strategy conversations are held by Wikimedia Foundation staff, Wikimedia affiliates members and staff, as well as community members.

Nevertheless, having established these goals at the Wikimedia-movement level is not effective if the communities do not embrace growth and renewal strategies as an important matter and enable the changes that affect them. In this sense, we believe the indicators we suggest will shed light on the sustainability of the current community and will become a reference baseline. The indicators will bridge the gap between the goals and narrative provided by the Strategy and the changes that may come at Wikimedia Foundation and affiliate level. In fact, Wikimedia affiliates have the capacity to influence community discussions and facilitate these changes. By means of annual plans, they are able to create initiatives and set the necessary spaces in order to raise awareness on the need to perform some actions aimed at renewing and growing the community. For this reason, our Objective 3 [O3] is to validate the indicators and explore their potential role in affiliate planning.

3. Approach

In this section, we present the approach we employed to reach the three goals, and we introduce the definition of six “Vital Signs,” each of which is associated with one or more indicators that we propose for assessing and monitoring a different aspect of community sustainability.

3.1. Research Process

This research project originated as a Wikimedia project grant to provide valuable indicators to understand the state of community health and recommendations in order to improve it (Meta contributors, ’Grants:Project/Eurecat/Community Health Metrics: Understanding Editor Drop-off’, Meta, discussion about Wikimedia projects, 7 May 2021, 10:10 UTC, https://meta.wikimedia.org/w/index.php?title=Grants:Project/Eurecat/Community_Health_Metrics:_Understanding_Editor_Drop-off&oldid=21432221 [accessed 19 February 2022]). This differentiates the approach from other projects created in organizations with a specific target user. In this case, the project was approved thanks to the support from the Wikimedia affiliates and community members, who will be the users to benefit from it.

For this reason, we decided to follow an open research model for this project, given that it is the most convenient way to engage with Wikimedia communities. This implies sharing the results, the work-in-progress prototypes, as well as the code and data at all times, presenting preliminary work at community gatherings, and discussing the work with relevant members from communities in order to get iterative feedback [13]. This is completely in line with the Wikimedia movement ethos, which encourages being bold and improving on things incrementally.

The open research approach is iterative. We have employed three different phases: (a) Exploration, (b) Design, and (c) Validation.

3.1.1. Phase 1: Exploration

In this initial phase, we reviewed the state of the art for the Wikipedia communities’ participation and organization. Furthermore, we did not only take into account the academic literature but also the Wikimedia documentation provided in the website Meta-wiki, which is the global site for the project’s information. To understand the state of growth and renewal of the communities, we would first assess the overall trajectory in the number of monthly active editors in their entire history [O1].

To do so, we obtained the corresponding time series, and normalized the values according to the maximum number of monthly active editors for each language edition. We then grouped the time series into clusters using the k-means algorithm with dynamic time warping [40]. This algorithm allows aligning the time series and grouping those language editions with a similar temporal pattern, regardless of minor oscillations and the exact moments in which the curve changed, focusing on the general trend. Hence, by looking at the distribution of the different clusters, we would be able to identify groups of languages following different patterns and phases of growth, decline and stagnation.
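As an illustration only, the two ingredients described above (peak normalization and a dynamic time warping distance) can be sketched in plain Python. This is a minimal sketch and not the implementation used in the study: the function names `normalize_by_peak` and `dtw_distance`, the absolute-difference cost, and the toy series are all assumptions made for the example, and the clustering step itself is not shown.

```python
from math import inf


def normalize_by_peak(series):
    """Scale monthly active-editor counts by the series maximum (result in [0, 1])."""
    peak = max(series)
    if peak == 0:
        return [0.0 for _ in series]
    return [value / peak for value in series]


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with an absolute-difference cost.

    Assumption: no warping window or other constraint is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Invented toy series: two "rise, peak, stagnate" curves that peak at
# different moments, and one curve that keeps growing.
early_peak = normalize_by_peak([10, 40, 80, 60, 50, 50])
late_peak = normalize_by_peak([5, 10, 20, 40, 30, 25, 25])
steady_growth = normalize_by_peak([10, 20, 30, 40, 50, 60])

d_similar = dtw_distance(early_peak, late_peak)
d_different = dtw_distance(early_peak, steady_growth)
```

In this toy case the two peaked curves end up closer to each other than to the steadily growing one, even though their peaks fall in different months and the series have different lengths, which is the property that motivates using dynamic time warping for grouping trajectories.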

3.1.2. Phase 2: Design and Implementation

In this second phase, we designed the indicators to capture each of the aspects related to community renewal and growth we found valuable during the exploration phase (retention, stability, balance, special functions, administrators, and global) to reach [O2]. To do so, we bore in mind three different design principles: (1) following established community definitions, (2) keeping it simple and designing one or two indicators for each aspect, (3) being consistent across the different metric representations to facilitate interpretation.

Then, we proceeded with the data analysis through a visual analytics library tool. We implemented prior versions of a dashboard and compared different visualizations for each aspect. From a quick interpretation of the graphs, we estimated an initial potential “target value” for each of the indicators that we considered a good baseline for a healthy community in the process of renewing or growing. These “target values” would facilitate the interpretation of the graphs and would need to be validated by communities.

The design and analysis subphases were repeated on five different occasions for specific groups of languages. We identified the main Wikimedia conferences where we would be able to interact with Wikipedians. These are WikiIndaba (African languages) (Meta contributors, ’WikiIndaba conference 2021’, Meta, discussion about Wikimedia projects, 1 January 2022, 11:22 UTC, https://meta.wikimedia.org/w/index.php?title=WikiIndaba_conference_2021&oldid=22522029 [accessed 19 February 2022]), Wikimedia CEE (Central and Eastern European languages) (Meta contributors, ’Wikimedia CEE Online Meeting 2021’, Meta, discussion about Wikimedia projects, 7 November 2021, 15:31 UTC, https://meta.wikimedia.org/w/index.php?title=Wikimedia_CEE_Online_Meeting_2021&oldid=22306903 [accessed 19 February 2022]), WikiArabia (Arabic languages) (Meta contributors, ’WikiArabia/2021’, Meta, discussion about Wikimedia projects, 5 January 2022, 11:52 UTC, https://meta.wikimedia.org/w/index.php?title=WikiArabia/2021&oldid=22538058 [accessed 19 February 2022]), and the Viquitrobada (Catalan Wikipedia gathering) (Wikipedia collaborators, ’Viquiprojecte:Viquitrobada 2021’, Viquipèdia, 21 November 2021, 18:35 UTC, https://ca.wikipedia.org/w/index.php?title=Viquiprojecte:Viquitrobada_2021&oldid=28637677 [accessed 19 February 2022]). We also performed an in-depth analysis for Wikimedia Poland and Wikimedia Italy, which were previously interested in understanding the potential risks for community dissolution, particularly for certain profiles of editors from the active community (i.e., coordinators and technical editors).

In the analysis of these groups of languages, we would acknowledge that even though there exist 308 language editions, those language communities under 50 active editors would not show applicable results since their situation would be too irregular to consider a pattern of growth or renewal. For the sake of simplicity, in this study, we will limit the graphs to a reduced number of eight languages selected among those for which we received feedback and that reflect the different active community sizes. These can be classified into four groups: communities containing more than 10,000 active editors (English (English Wikipedia website, https://en.wikipedia.org/ [accessed 19 February 2022])), more than 5000 (German (German Wikipedia website, https://de.wikipedia.org/ [accessed 19 February 2022]), Italian (Italian Wikipedia website, https://it.wikipedia.org/ [accessed 19 February 2022]), Arabic (Arabic Wikipedia website, https://ar.wikipedia.org/ [accessed 19 February 2022]), and Polish (Polish Wikipedia website, https://pl.wikipedia.org/ [accessed 19 February 2022])), more than 500 (Catalan (Catalan Wikipedia website, https://ca.wikipedia.org/ [accessed 19 February 2022])), and more than 50 (Afrikaans (Afrikaans Wikipedia website, https://af.wikipedia.org/ [accessed 19 February 2022]) and Swahili (Swahili Wikipedia website, https://sw.wikipedia.org/ [accessed 19 February 2022])). These languages also reflect geographical diversity, and, at the same time, they have achieved a minimal critical mass in the number of active editors.

3.1.3. Phase 3: Validation

In this third phase, we presented the set of indicators and visualizations to an audience of Wikimedians. In total, we participated in the four conferences we mentioned in the previous section, where we were allocated from 15 minutes to 1 hour for presentations. These were held between August and November 2021. Each conference session had time allocated for questions, and two of them led to focus group sessions, which we facilitated following a focus group protocol [41,42]. Two additional sessions were organized for the Volunteer support network, a dedicated group of people interested in supporting Wikimedia volunteers in different communities (Meta contributors, ’Connect/Volunteer supporters network’, Meta, discussion about Wikimedia projects, 18 November 2019, 21:20 UTC, https://meta.wikimedia.org/w/index.php?title=Connect/Volunteer_supporters_network&oldid=19568945 [accessed 19 February 2022]).

These sessions are useful for getting people’s perceptions and attitudes about any particular concept. In each group discussion and focus group, rather than engaging in the discussion ourselves, we preferred to set up the debate and then remind participants that the indicators were a work in progress and that their input would be very valuable. We specifically asked community members four different things:

  1. To discuss the usefulness of each metric;
  2. To give any suggestion to improve the analysis (e.g., setting a different target value, or modifying some aspect of the data visualization);
  3. To consider the actor or actors who can take more responsibility in improving this Vital Sign for their Wikipedia language edition. Options are the affiliate, Wikimedia Foundation, and specific groups of editors;
  4. To explain which actions could help improve the current situation in relation to the target in the short or midterm horizon.

From each of the sessions, notes were taken and stored for interpretation among the researchers. Some comments were literally transcribed, while some ideas were written down in order to improve on the next iteration of metric design. During the interpretation of the notes, words, context, and duration of each part of the discussion were taken into account in order to assess their value.

3.2. Definition of the Indicators

We propose the definition of 6 indicators that we call “Vital Signs” to reach objective [O2] of exploring the degree of renewal within communities. In the following, we refer to each vital sign as an indicator, although in some cases, a vital sign may be composed of more than one indicator. In medicine, vital signs indicate the status of the body’s vital, life-sustaining functions. These measurements are taken to help assess the general physical health of a person, give clues to possible diseases, and show progress toward recovery.

In the case of Wikipedia, Vital Signs are related to the community’s capacity and function to grow or renew itself. Three of them are focused on the entire group of active editors creating content: retention, stability, and balance; the other three are related to more specific community functions: admins, specialists, and global community participation. We believe that obtaining values for the capabilities of the current active community of editors in these areas can constitute a reference point to plan to guarantee transparency and openness in these areas, to observe growth and renewal, and at the same time, to foresee and prevent future risks.

Based on our interpretation of the data for a wide array of Wikipedia language editions, we estimated a set of target values for each of the Vital Signs, describing what might constitute a healthy, renewing community with growth potential. We think that these targets are reasonable for established communities (more than 50 active editors per month) that have achieved a critical mass that allows them to be sustainable in all the necessary tasks to develop a Wikipedia language edition. As these target values may be more easily interpreted and understood in comparison to the current values, we show them in Section 4, along with the results obtained for the selected analyzed communities.

3.2.1. Retention

RQ1 [Retention] How do Wikipedia language communities retain new members over time?

The first vital sign is retention. It reflects the capacity of the community to engage new editors to continue editing after they register.

  • Retention rate: this indicator is computed, according to the state of the art [5], as the percentage of new editors who edit at least once 60 days after their first edit.
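The retention rate can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `retention_rate`, the input layout (a mapping from editor to edit timestamps), the monthly cohort window, and the reading of "at least once 60 days after their first edit" as "at least one edit made 60 or more days after the first one" are all assumptions, and the data are invented.

```python
from datetime import datetime, timedelta


def retention_rate(edits_by_editor, cohort_start, cohort_end, window_days=60):
    """Percentage of new editors who edit again `window_days` or more days later.

    A new editor is one whose first edit falls in [cohort_start, cohort_end).
    """
    window = timedelta(days=window_days)
    new_editors = 0
    retained = 0
    for timestamps in edits_by_editor.values():
        if not timestamps:
            continue
        first = min(timestamps)
        if not (cohort_start <= first < cohort_end):
            continue  # first edit outside the cohort being measured
        new_editors += 1
        if any(ts - first >= window for ts in timestamps):
            retained += 1
    return 100.0 * retained / new_editors if new_editors else 0.0


# Invented toy data: four editors start in January 2021, one starts earlier.
toy_edits = {
    "ana": [datetime(2021, 1, 3), datetime(2021, 1, 4)],    # stops after two days
    "ben": [datetime(2021, 1, 10), datetime(2021, 3, 15)],  # returns after 64 days
    "cai": [datetime(2021, 1, 20)],                         # single edit
    "dee": [datetime(2021, 1, 25), datetime(2021, 3, 20)],  # returns after 54 days
    "eli": [datetime(2020, 12, 30), datetime(2021, 6, 1)],  # not in the cohort
}

january_rate = retention_rate(toy_edits, datetime(2021, 1, 1), datetime(2021, 2, 1))
```

Here only one of the four January newcomers edits again 60 or more days after the first edit, so the toy cohort has a retention rate of 25%.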

3.2.2. Stability

RQ2 [Stability] How stable is the composition of Wikipedia language communities over time?

The second vital sign is stability. Community stability is the persistence of active editors. It is desirable to ensure not only that there are fresh editors every month who had not edited in the previous month, but also that there are others who have edited for many more months.

  • Stability: computed as the number of active editors by the number of months they have been active in a row. We have grouped the number of active months in six groups: first-month editing (regardless of whether the user had previously edited and then taken a break), two months editing in a row, three to six months, seven to twelve, thirteen to twenty-four, and more than twenty-four months in a row. Looking at the proportion of users who fall in each of these groups, it is possible to get a picture of the stability (or volatility) of a community.
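The grouping described above can be sketched as follows. This is an illustrative sketch only: the helper names, the representation of months as integer indices, and the toy activity data are assumptions, and the definition of "active in a month" is taken as given in the input rather than computed.

```python
from collections import Counter

# The six groups of consecutive active months, as (low, high, label).
BUCKETS = [
    (1, 1, "1"),
    (2, 2, "2"),
    (3, 6, "3-6"),
    (7, 12, "7-12"),
    (13, 24, "13-24"),
    (25, None, ">24"),
]


def month_index(year, month):
    """Map a calendar month to a consecutive integer so streaks are easy to count."""
    return year * 12 + (month - 1)


def streak_length(active_months, current):
    """Number of months in a row, ending at `current`, in which the editor was active."""
    length = 0
    while (current - length) in active_months:
        length += 1
    return length


def bucket_label(streak):
    for low, high, label in BUCKETS:
        if streak >= low and (high is None or streak <= high):
            return label
    raise ValueError("streak must be a positive integer")


def stability_shares(activity, current):
    """Percentage of the editors active in `current` that fall in each group."""
    counts = Counter()
    for months in activity.values():
        streak = streak_length(months, current)
        if streak > 0:  # editors not active this month are ignored
            counts[bucket_label(streak)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * counts[label] / total for _, _, label in BUCKETS}


# Invented toy data: sets of month indices in which each editor was active.
now = month_index(2021, 11)
toy_activity = {
    "a": {now},                          # first month
    "b": {now - 1, now},                 # two months in a row
    "c": set(range(now - 4, now + 1)),   # five months in a row
    "d": {now - 3, now - 2, now},        # back after a break: counts as first month
    "e": {now - 1},                      # not active this month
}

shares = stability_shares(toy_activity, now)
```

Editor "d" illustrates the rule that a month following a break is counted as a first month, regardless of earlier activity.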

3.2.3. Balance

RQ3 [Balance] How balanced are Wikipedia language communities in terms of including new members over time and maintaining the old?

The third vital sign is balance. Community balance has to do with being able to maintain an equitable proportion of old and new editors. The community should benefit from the experience of older generations of editors, and at the same time, be able to stay open to new generations. This is a key sign related to renewal. It may not be desirable that the “productivity” relies too much on an older generation, but neither that it would depend mostly on the last one.

  • Balance: this indicator is given by the number and percentage of very active editors by year and by generation, defined as the lustrum of their first edit.

According to the MediaWiki definition (MediaWiki contributors, ’Analytics/Metric definitions’, MediaWiki, 24 October 2021, 11:36 UTC, https://www.mediawiki.org/w/index.php?title=Analytics/Metric_definitions&oldid=4884329 [accessed 19 February 2022]), we consider a very active editor “a registered (and signed in) person (not known as a bot) who makes 100 or more edits per month in mainspace on countable pages.” Given the imbalance of contributions, very active editors usually account for the vast majority of the registered editors’ edits; for example, in the case of the Italian Wikipedia, 85% of the registered editors’ edits every year are performed by very active editors. Therefore, by taking the very active editors, we are observing the part of the community that is responsible for the production of most of its content.
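The balance indicator can be sketched as a count of very active editors per generation. This is a minimal sketch under stated assumptions, not the study's implementation: the function names, the alignment of lustrums to 2001 (2001-2005, 2006-2010, and so on), and the choice to count an editor as very active in a year if the 100-edit threshold is reached in at least one month of that year are all assumptions, and the data are invented.

```python
from collections import Counter


def lustrum_label(first_edit_year, origin=2001):
    """Five-year generation containing the year of an editor's first edit.

    Assumption: lustrums are aligned to `origin` (2001-2005, 2006-2010, ...).
    """
    start = origin + 5 * ((first_edit_year - origin) // 5)
    return f"{start}-{start + 4}"


def very_active_by_generation(monthly_edits, first_edit_year, year, threshold=100):
    """Count editors with at least one month of `threshold` or more edits in `year`,
    grouped by the lustrum of their first edit."""
    counts = Counter()
    for editor, per_month in monthly_edits.items():
        very_active = any(
            edits >= threshold
            for (edit_year, _month), edits in per_month.items()
            if edit_year == year
        )
        if very_active:
            counts[lustrum_label(first_edit_year[editor])] += 1
    return dict(counts)


# Invented toy data: mainspace edits per (year, month) for each editor.
toy_first_edit_year = {"a": 2003, "b": 2008, "c": 2019, "d": 2020}
toy_monthly_edits = {
    "a": {(2021, 1): 150},
    "b": {(2021, 3): 99, (2021, 4): 100},
    "c": {(2021, 5): 20},                    # never reaches the threshold
    "d": {(2020, 12): 500, (2021, 1): 300},
}

balance_2021 = very_active_by_generation(toy_monthly_edits, toy_first_edit_year, 2021)
total = sum(balance_2021.values())
balance_2021_pct = {gen: 100.0 * n / total for gen, n in balance_2021.items()}
```

Dividing each count by the yearly total, as in the last two lines, gives the percentage view of the same indicator.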

3.2.4. Special Functions

RQ4 [Special functions] How are Wikipedia language communities renewing their technical development (Techwizard) and project specialists?

The fourth vital sign is special functions. Community technical and coordination functions undertaken by editors are essential for the project. We, therefore, define two indicators for these two different functions:

  • Very active technical editors: the number of very active editors in technical namespaces, i.e., Mediawiki and Templates (Wikipedia contributors, ’Wikipedia:Namespace’, Wikipedia, The Free Encyclopedia, 25 January 2022, 02:37 UTC, https://en.wikipedia.org/w/index.php?title=Wikipedia:Namespace&oldid=1067769773 [accessed 19 February 2022]) broken down by year and by generation. Similarly, as with the previous measurement, here we focus on the very active technical editors who performed more than 100 edits in one month in technical namespaces (that is, templates and MediaWiki namespaces).
  • Very active coordinators: the number of very active editors in coordination namespaces, i.e., Wikipedia and Help, in at least one month, broken down by year and by generation. In this case, we repeat exactly the same analysis but consider only those very active editors who edit in the Help and Wikipedia namespaces (that is, Wikiprojects and Village Pump, among others). The number of editors for this second indicator tends to be higher. In the graph, we see the number of “Very active project coordinators”.

3.2.5. Administrators

RQ5 [Administrators] How are the Wikipedia language communities granting admin user rights (e.g., roles such as sysops) to new members?

The fifth vital sign is admins. Admins have rights and responsibilities in performing actions over the content and take a key function for the community. We propose three different indicators:

  • Admins by year: number of admins by year of flag granted and by generation.
  • Admins by lustrum: total number of active admins by generation at the current month.
  • Admins ratio: ratio between the number of active admins and the number of active editors in a specific month.

3.2.6. Global participation

RQ6 [Global-Meta-wiki] How are Wikipedia language communities participating in global projects spaces (Meta)?

RQ7 [Global-Local] What is the composition of Wikipedia language communities in terms of multilingual editors?

The sixth and last vital sign is global. Participating in the “global community” (i.e., the Wikimedia Movement) is key for language communities to make their voice heard and learn from others. Complementarily, communities should also be able to attract members from other communities to edit. We propose two indicators based on the concept of primary editor, i.e., considering as an editor’s primary language edition the one in which they made the most edits [39,43].

The two indicators are:

  • Meta-wiki participation: ratio between the number of active editors in Meta-wiki that have as primary a given language edition and the number of active editors in that Wikipedia language edition during the same month.
  • Primary language: this indicator aims at describing the composition of an editor community in terms of their primary language, looking at how many editors have this language edition as their primary one (primary editors), and for the remaining ones (non-primary editors), which is their primary language edition. Therefore, it consists in the distribution of the primary language edition of the editors contributing to a given language edition.
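The two indicators can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the function names, the wiki identifiers, the tie-breaking rule, and the simplification that any editor with at least one recorded edit in a wiki counts as active there are assumptions, and the data are invented.

```python
from collections import Counter


def primary_wiki(edit_counts):
    """Language edition in which the editor made the most edits.

    Assumption: ties are broken alphabetically by wiki name.
    """
    return min(edit_counts, key=lambda wiki: (-edit_counts[wiki], wiki))


def primary_language_distribution(editors, wiki):
    """Distribution of primary language editions among editors active on `wiki`."""
    counts = Counter(
        primary_wiki(edit_counts)
        for edit_counts in editors.values()
        if edit_counts.get(wiki, 0) > 0
    )
    return dict(counts)


def meta_participation(editors, active_on_meta, wiki):
    """Ratio between Meta-wiki active editors whose primary edition is `wiki`
    and the editors active on `wiki` in the same month."""
    active_here = [e for e, counts in editors.items() if counts.get(wiki, 0) > 0]
    meta_primary = [
        e for e in active_on_meta
        if e in editors and primary_wiki(editors[e]) == wiki
    ]
    return len(meta_primary) / len(active_here) if active_here else 0.0


# Invented toy data: edits per language edition for each editor.
toy_editors = {
    "u1": {"cawiki": 50, "eswiki": 5},
    "u2": {"cawiki": 3, "eswiki": 40},
    "u3": {"cawiki": 10},
    "u4": {"eswiki": 7},
}
toy_active_on_meta = {"u1", "u2"}

ca_distribution = primary_language_distribution(toy_editors, "cawiki")
ca_meta_ratio = meta_participation(toy_editors, toy_active_on_meta, "cawiki")
```

In the toy data, two of the three editors contributing to the Catalan edition have it as their primary edition, and one of the two Meta-wiki participants is a primary Catalan editor, giving a ratio of one third.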

3.3. Code and Data

In order to compute the indicators for growth and renewal, we used the available data provided by the Wikimedia Foundation in the form of dumps, in particular the MediaWiki History dump (MediaWiki History Dump, https://dumps.wikimedia.org/other/mediawiki_history [accessed 19 February 2022]); specifically, we downloaded the dumps that included all the history of interactions until November 2021. The code we deployed in Python3 is made available (Community Vital Signs GitHub repository, https://github.com/WikiCommunityHealth/community-vital-signs [accessed 19 February 2022]), as well as the resulting databases (https://vitalsigns.wmcloud.org/datasets/ [accessed 19 February 2022]). As for the interactions with the community members, they were all held at the Wikimedia international conferences through video call during 2021.

4. Results

In this section, we dedicate a different subsection to each of the three objectives of the study. In Section 4.1, we explore the state of growth and decline of Wikipedia language editions. In Section 4.2, we present the results for the indicators to measure community renewal and answer each of the research questions. In Section 4.3, we discuss the feedback received by Wikimedians on the indicators.

4.1. Community Growth, Stagnation, and Decline

To pursue the objective [O1] of assessing growth/stagnation/decline patterns in Wikipedia communities, we inspected the temporal evolution of the number of active editors over time, comparing the trends obtained for different language editions and performing clustering to identify general trends. We computed the monthly number of active editors for each of the 308 Wikipedia language editions, and we focused on communities with a minimum of 100 active editors in August 2021, which left 52 communities.

To be able to group communities exhibiting similar trends, we relied on the k-means clustering algorithm on the time series, and we used dynamic time warping to measure similarity between the temporal sequences focusing on the general pattern [40]. We ran the k-means algorithm with parameters σ = 6, and learning rate = 0.1; we obtained six clusters, shown in Figure 2.

We observe that many language editions of different size belong to the first cluster (cluster 0), characterized by a first phase of growth of about 7 years (which corresponds to 84 months) and then a stagnation and decline period until stabilization with more or less accentuated oscillations around a lower number of editors. This roughly corresponds to the decline observed for English (included in cluster 0) and other major language editions since 2007 [14,15].

The second cluster (cluster 1) also includes some of the biggest language editions, including French, Japanese, and Spanish. The trend is similar to that of the previous cluster, with the difference that after the rise and peak, instead of decline, we observe a more stable stagnation pattern. We may observe a tendency to decline in the first years of stagnation and a smooth rise again in the last years.

The remaining clusters exhibit a different pattern, with a common tendency to keep growing, although at different rates after the initial rise. Cluster 2 includes smaller European language communities, characterized by stronger oscillations around a smooth growing trend in the second phase. Cluster 3 represents some Asian language communities that interestingly exhibit a decline/stagnation period followed by a strongly growing pattern. Communities in cluster 4 see a first rapid growth period (with different duration for different communities), followed by a less skewed but still growing trend. Finally, cluster 5 groups communities of different sizes together, characterized by a more or less stable growing trend.

Figure 2. Each cluster represents a group of language editions exhibiting a similar temporal trend, according to the execution of the k-means clustering algorithm. Time is expressed on the x-axis in the number of months since the creation of a language edition. Gray lines represent the time series of the individual language editions, and red lines represent the average over each cluster. The language editions belonging to each cluster are reported below, with background indicating the size range in the number of active editors.

Figure 3 shows in more detail the evolution of these timelines (number of monthly active editors over time) for the sample of 8 selected languages that we will consider from now on for presenting the results. Apart from more or less marked seasonality patterns (with dips in activity in summer), the graphs show for English and German a clear pattern of rapid growth until 2007, and then slow decline until more or less stabilizing in recent years. These are indeed the major languages in cluster 0; Polish, also in cluster 0, presents a similar pattern, although decline starts a bit later and presents a smoother trend. We see the same rapid growth in the first years for the Italian Wikipedia, followed by a smooth decline and then a smooth growth. For the Arabic Wikipedia, we can observe in more detail the sustained growth pattern of cluster 5. Catalan exhibits a rapid growth initially, followed by a stable pattern, apart from seasonal oscillations. Finally, Afrikaans presents growth until 2009, then a mostly stable pattern until a smooth growth in recent years, while Swahili reaches a similar number of active editors, but growing primarily after 2017.

Figure 3. Evolution of the number of active editors over months per language edition for our sample of eight selected language editions.

4.2. Vital Signs

We present the results obtained for the vital signs introduced in the previous section for the eight selected language editions. First, we show and comment on the results for this set of languages for each indicator, then we propose a possible target value based on our observation of results from these and other language editions, and conversations held with several members from local communities.

4.2.1. Retention

Figure 4 is a dual-axis graph showing the number of registered editors by month (left axis, grey bars) and the retention rate (right axis, orange line). We see that the number of registered editors is stable or even increasing in some cases (e.g., Afrikaans, Swahili), while the retention rate is decreasing over time in all the cases shown. With respect to RQ1 [Retention], then, we see a generally decreasing trend in the ability of the communities to retain new editors, with more or less marked fluctuations, and with values in the last observed period that range from 0.64% of Afrikaans and 1.55% of Swahili, to 3.53% of Polish and 3.52% of Italian.

The retention rate has been declining for the eight communities considered over the last 10 years. This is a worrying trend since lower retention rates may lead to a stagnant or declining community.

[T1] Retention rate: we argue that a reasonable target could be set to a 3% retention rate to ensure there is renewal among editors, while it could be desirable to reach 5–7%. In general, communities should aim at reversing the declining trend in the retention rate.

Figure 4. Vital sign: Retention. Number of registered editors by month (left axis, grey bars) and retention rate over time (right axis, orange line) as the percentage of new editors who edit at least once 60 days after their first edit.
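As an illustration of the definition in the caption above, the retention rate for a monthly cohort can be sketched as follows. This is a minimal sketch, not the authors’ implementation: the function name `retention_rate`, the input layout (a mapping from editor to a list of edit dates), and the reading of “60 days after their first edit” as “any edit on or after day 60” are assumptions made for the example.

```python
from datetime import date, timedelta


def retention_rate(edit_dates_by_editor, cohort_year, cohort_month, threshold_days=60):
    """Percentage of editors whose first edit falls in the given month and who
    edit again at least `threshold_days` after that first edit (assumed reading)."""
    cohort = 0
    retained = 0
    for dates in edit_dates_by_editor.values():
        if not dates:
            continue
        first = min(dates)
        # Only editors who made their first edit in the cohort month are counted.
        if (first.year, first.month) != (cohort_year, cohort_month):
            continue
        cohort += 1
        cutoff = first + timedelta(days=threshold_days)
        if any(d >= cutoff for d in dates):
            retained += 1
    return 100.0 * retained / cohort if cohort else 0.0


# Hypothetical toy data: three editors start in January 2021, one comes back later.
edits = {
    "a": [date(2021, 1, 5), date(2021, 1, 6)],
    "b": [date(2021, 1, 10), date(2021, 4, 1)],
    "c": [date(2021, 1, 20)],
    "d": [date(2020, 12, 1), date(2021, 1, 3)],  # first edit outside the cohort
}
print(retention_rate(edits, 2021, 1))  # one of three new editors is retained
```

The `threshold_days` parameter makes it easy to compare the 60-day cut-off with longer windows on the same data.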

4.2.2. Stability

In Figure 5, we see the number of active editors on a monthly basis, broken down by the number of months in a row they have been editing, represented with different colors for different ranges. Grey represents editors for whom this is the first month, which can either mean that they had never edited before, i.e., it is their first month ever, or that this is the first month after a break from editing. For example, in the Polish Wikipedia, fresh editors in a given month are about 35–40% (in grey), while the percentage of editors that are active for a long period of time (>24 months) is around 20%. This is similar for other language editions, like German, where the percentages are 33–35% and 25%, respectively, or Italian with 40–45% and 15–18%. For younger, less established communities, like Arabic Wikipedia, these percentages are very different. The share of editors active for only 1 month is up to 66–72%, and the long-term engaged editors form only 3–4.5%. With respect to RQ2 [Stability], then, we can summarize that more established and mature communities are engaging a greater share of editors for several consecutive months; in younger projects, instead, we observe more volatility, and the active community is mostly composed of editors who are active for only a few months in a row.

Figure 5. Vital sign: Stability. Monthly percentage of editors who are active for a given stretch of time: 1 month (grey), 2 months (green), 3 to 6 months (yellow), 7 to 12 months (red), 1 to 2 years (purple) and more than 2 years (blue).

[T2] Fresh editors: We would argue for setting a target of 30–40% of “fresh” editors. This may be desirable in order to have an influx of new energy and ideas. If the share is higher than this, and especially over 60%, it may be an indicator of a lack of capacity to engage and stabilize the community. High percentages of fresh editors are only desirable when the number of active editors is growing.

[T3] Long-time editors: with regard to the target for editors engaged for a long time, i.e., active for more than 1 year, a target share of around 33%, given as the sum of the 13–24 and >24 bins, seems appropriate. This value is indicative of a solid community able to carry on with long-term Wikiprojects and activities.
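The stability breakdown and the two targets above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: months are encoded as consecutive integers (for instance `year * 12 + month`), the function names are hypothetical, and the bin edges 1, 2, 3–6, 7–12, 13–24 and >24 are assumed so that the bins are contiguous.

```python
from collections import Counter

# Assumed bin edges (inclusive), in months of consecutive activity.
BINS = [(1, 1, "1"), (2, 2, "2"), (3, 6, "3-6"), (7, 12, "7-12"), (13, 24, "13-24")]


def streak_length(active_months, month):
    """Number of consecutive active months ending at `month` (0 if inactive)."""
    n = 0
    while (month - n) in active_months:
        n += 1
    return n


def bin_label(streak):
    for low, high, label in BINS:
        if low <= streak <= high:
            return label
    return ">24"


def stability_shares(activity, month):
    """Percentage of the editors active in `month` that falls in each streak bin."""
    counts = Counter()
    for months in activity.values():
        streak = streak_length(months, month)
        if streak > 0:
            counts[bin_label(streak)] += 1
    total = sum(counts.values())
    return {label: 100.0 * c / total for label, c in counts.items()} if total else {}


# Hypothetical toy community observed at month index 100.
activity = {
    "a": {100},                 # fresh editor
    "b": {99, 100},             # second month in a row
    "c": set(range(70, 101)),   # 31 months in a row
    "d": set(range(85, 101)),   # 16 months in a row
    "e": {98},                  # not active in the observed month
}
shares = stability_shares(activity, 100)
fresh = shares.get("1", 0.0)                                  # compare with [T2]
long_time = shares.get("13-24", 0.0) + shares.get(">24", 0.0)  # compare with [T3]
print(fresh, long_time)
```

Note that an editor returning after a break restarts at a streak of one month, which matches the description of “fresh” editors given for Figure 5.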

4.2.3. Balance

In Figure 6, we see the composition of the “very active editors” every year by lustrum of the first edit for the selected language editions. Since the birth of Wikipedia in 2001, we have five lustra: 2001–2005, 2006–2010, 2011–2015, 2016–2020, and 2021–2025. We can consider these periods as “generations.” For the oldest and most-established projects, such as English, German, Italian or Catalan, these 5-year periods also translate relatively well into different phases of growth of these projects. Very active editors are defined as those who make at least 100 edits per month. They are a very valuable group of editors since they account for most of the registered editors’ edits (e.g., in the Italian Wikipedia, 85% of the registered editors’ edits every year are performed by very active editors).

We argue that all generations should contribute to this group of editors because this would bring a balance of perspectives to the project. The graph shows the yearly number of editors who have been very active for at least one month. As the years go by, we expect the number of very active editors from younger generations to take a bigger share of the total, while the older generations shrink gradually. The graphs show that, in most cases, the percentage of very active editors who started editing during 2001–2005 has undergone only minor variations in recent years. For example, in the Polish Wikipedia, this percentage was 11.51% in 2014 and 9.13% in 2020, which means it has only slightly decreased in six years.

The first generation seems to be especially strong in the German Wikipedia. In Figure 6, we observe the prominence of the second generation, made of users who registered between 2006 and 2010, which maintained a higher share than the following generation in all cases except Arabic. The initial period, and in particular the 2006–2010 lustrum, seems to represent a time in which a group of very active editors established themselves in the community and maintained this position until now, leaving limited space for the following generation of editors. This is particularly true for Afrikaans, Catalan, German and Polish, where this lustrum still represents over 30% of the very active editors. The last generation, made of editors registered between 2016 and 2020, reaches a share of over 80% for Swahili, 68% for Arabic, 56% for English and 50% for Italian, while for the others it has lower values; we see an especially low proportion, around 30%, for German and Catalan.

In conclusion, with respect to RQ3 [Balance], we observe a general tendency to create an established group of very active editors from older generations of editors, that leaves more or less space to newer generations in different language editions.

[T4] Last generation: We believe a growing share of the last generation, until it occupies between 30% and 40%, may be reasonable for a language edition that is not in a growth phase (and larger when it is). We considered every generation to be 5 years (a lustrum), so, as a rule of thumb, we suggest that the last generation occupies from 15 to 40% depending on the years which have passed since its beginning (1–5).

[T5] First generation: In addition, a share of editors of at least 5–15% from the first generation (typically 2001–2005) seems a desirable target as well, although they might be at the end of their lifecycle and the growth may have occurred with the following generation (2006–2010). The share of every previous generation will inevitably decrease over time.

Figure 6. Vital sign: Balance. Evolution of the number of very active editors over the years, broken down by the lustrum of their first edit. An editor is considered very active in a given month if they have made at least 100 edits in that month; in the graph, data are accumulated by year so that each year includes all the editors who have been very active in at least a month of that year. The colors and percentages show the fraction of editors belonging to each generation (lustrum of the first edit) over the total number of very active editors each year.
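The grouping into generations can be sketched as follows. This is a hedged illustration rather than the authors’ code: the helper names `lustrum` and `generation_shares` are hypothetical, and the input is assumed to be the year of first edit of each editor who was very active in at least one month of the year under study.

```python
from collections import Counter


def lustrum(first_edit_year):
    """Map a first-edit year to its 5-year generation, counting from 2001."""
    start = 2001 + 5 * ((first_edit_year - 2001) // 5)
    return f"{start}-{start + 4}"


def generation_shares(first_edit_years):
    """Percentage of very active editors belonging to each generation."""
    counts = Counter(lustrum(year) for year in first_edit_years)
    total = sum(counts.values())
    return {gen: round(100.0 * n / total, 2) for gen, n in sorted(counts.items())}


# Hypothetical example: four very active editors in a given year.
print(generation_shares([2003, 2007, 2008, 2018]))
```

The resulting shares can then be compared with the ranges suggested in [T4] and [T5].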

4.2.4. Special Functions

In this section, we describe indicators for users with a special role in the community. We focus on technical editors and community coordinators. Each role will be described in the following sections. Most of the time, these roles are not formally defined or recognized within Wikipedia, although we argue that they serve very important functions within the de facto governance of the project and are thus worth considering. A shortage of users that devote themselves to these special tasks poses a threat to the normal functioning of Wikipedia.

Very Active Technical Editors

Figure 7 shows the evolution of very active technical contributors, i.e., editors that contribute to namespaces Template and MediaWiki. Editors belonging to this group perform technical activities such as the creation and maintenance of templates, which are special pages that allow one to automate and standardize the way content is presented in Wikipedia articles, such as with infoboxes. It needs to be noted that this group makes up a much smaller share of the general population of very active editors; in fact, in most cases, even for established Wikipedia editions, this group comprises a few dozen editors. For example (in parentheses the number of very active technical editors versus the total very active editors): Catalan (15/268), German (59/3215), Italian (72/1556), and Polish (26/863). Only the English Wikipedia has a few hundred editors belonging to this group (657/17,341). In smaller projects the number of very active technical editors can be very low; for the Afrikaans Wikipedia it is less than 5 and in the Swahili Wikipedia there have not been very active technical editors since 2014.

With respect to RQ4 [Special functions - technical editors], we see a general prominence of older generations, particularly the second generation (editors registered in 2006–2010). This pattern, already visible for active editors with RQ3, is even more marked for editors having special technical functions, pointing to a lack of renewal for this kind of profile. Community building on this group of contributors is highly encouraged. Given the scarcity of editors who engage in technical contributions, renewal and balance among generations are even more important. For this reason, it can be concerning when a community over-relies on older editors and is not able to attract technical editors from the younger generations, as in the case of Catalan.

[T6] Very active technical editors: We believe that having at least 20 users as a minimum number of “very active technical editors” seems desirable considering the different tasks (bots, templates, etc.).

[T7] Very active technical editors from the last generation: Given the usually low number of very active technical editors and the remarkable effort this role requires, it would be preferable that at least a consistent part of them were from newer generations; renewal is key. Therefore, we see a reasonable target of at least 30% very active technical editors from the last generation.
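As a worked example, the counts quoted earlier in this subsection can be turned into shares and compared with the minimum of 20 suggested in [T6]. The figures are taken from the text; the dictionary layout and the function name `technical_summary` are assumptions made for illustration.

```python
# (very active technical editors, total very active editors), as quoted in the text.
COUNTS = {
    "Catalan": (15, 268),
    "German": (59, 3215),
    "Italian": (72, 1556),
    "Polish": (26, 863),
    "English": (657, 17341),
}
MIN_TECHNICAL = 20  # minimum suggested in target [T6]


def technical_summary(counts, minimum=MIN_TECHNICAL):
    """Share of technical editors among very active editors, and the [T6] check."""
    return {
        lang: {
            "share_pct": round(100.0 * tech / total, 1),
            "meets_T6": tech >= minimum,
        }
        for lang, (tech, total) in counts.items()
    }


for lang, row in technical_summary(COUNTS).items():
    print(lang, row)
# Catalan: 5.6% (below the minimum of 20), German: 1.8%, Italian: 4.6%,
# Polish: 3.0%, English: 3.8%
```

In every case the technical group stays below 6% of the very active editors, and among the five editions listed only Catalan falls short of the suggested minimum.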

Very Active Coordinators

Figure 8 shows the evolution of the number of very active coordinators over the years. The metrics and the figure are similar to the previous one, with the difference that here the focus is on editors who are very active in the “Wikipedia” or “Help” namespace, instead of technical namespaces. Therefore, here we account for users who have an active role in coordinating projects and initiatives and can be essential for dynamizing the community.

We can see for most language editions a peak in the number of this kind of very active editors in rough correspondence with the growth peak in the initial phase, and then a general decrease until stabilizing around lower values. Furthermore, if we look at the composition of this group of users in terms of generations, we observe that again most of them tend to be from older generations; in particular, for most language editions, from the generation of users who registered between 2006 and 2010. An interesting exception is Arabic, a community that, as already observed above (e.g., in Figure 2), seems to be in a different phase with respect to the other major language editions and is still characterized by a growing pattern; therefore, it is not surprising that such a somewhat younger and growing community has a growing number of very active coordinators, especially from the last generation.

Figure 7. Vital sign: Special functions-very active technical editors. Evolution of the number of very active technical editors over the years, broken down by the lustrum of their first edit. A user is considered a very active technical editor in a given month if they have made at least 100 edits in namespaces Template and MediaWiki in that month; in the graph, data are accumulated by year so that each year includes all the users who have been very active technical editors for at least a month of that year. The colors show the number of very active technical editors belonging to each generation (lustrum of the first edit) each year.

With respect to RQ4 [Special functions-coordinators], we confirm for very active coordinators the same tendency observed for very active technical editors, with a remarkable dominance of older generations for this kind of specialized profile; in this case, we also observe a general decrease in the number of editors taking this role in most of the observed communities, after a peak in the initial years.

Figure 8. Vital sign: Special functions - very active coordinators. Evolution of the number of very active coordinators over the years, broken down by the lustrum of their first edit. An editor is considered a very active coordinator in a given month if they have made at least 100 edits in namespaces “Wikipedia” or “Help” in that month; in the graph, data are accumulated by year so that each year includes all the users who have been very active coordinators in at least a month of that year. The colors show the number of very active coordinators belonging to each generation (lustrum of the first edit) each year.

[T8] Very active coordinators: we believe that the number of coordinators (“very active editors” in Wikipedia namespace) should be at least 20, and always larger than the number of very active technical editors, since taking coordination activities is key to engaging editors into contests, Wikiprojects, among others.

[T9] Very active coordinators proportion: Furthermore, the proportion of coordinators should be a minimum of 5–15% of the very active editors to guarantee there are common initiatives that involve everyone. A low proportion of coordinators implies that very active editors are working in silos.
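The role definitions used in Figures 7 and 8 and the coordinator targets [T8] and [T9] can be sketched as below. This is a minimal sketch under assumptions: the 100-edit threshold is applied to the combined edits in the two namespaces of each role, the input layout (one dictionary of per-namespace edit counts per editor and month) is hypothetical, and so are the function names.

```python
TECHNICAL = {"Template", "MediaWiki"}
COORDINATION = {"Wikipedia", "Help"}
THRESHOLD = 100  # edits per month in the relevant namespaces


def monthly_roles(edits_by_namespace):
    """Roles earned by one editor in one month, given {namespace: edit count}."""
    roles = set()
    if sum(n for ns, n in edits_by_namespace.items() if ns in TECHNICAL) >= THRESHOLD:
        roles.add("technical")
    if sum(n for ns, n in edits_by_namespace.items() if ns in COORDINATION) >= THRESHOLD:
        roles.add("coordinator")
    return roles


def yearly_counts(monthly_edits):
    """Count editors holding each role in at least one month of the year.

    monthly_edits maps editor -> list of {namespace: edit count}, one per month.
    """
    counts = {"technical": 0, "coordinator": 0}
    for months in monthly_edits.values():
        roles = set().union(*(monthly_roles(m) for m in months))
        for role in roles:
            counts[role] += 1
    return counts


def check_coordinator_targets(coordinators, technical, very_active):
    """Compare yearly counts with the thresholds suggested in [T8] and [T9]."""
    return {
        "T8_at_least_20": coordinators >= 20,
        "T8_more_than_technical": coordinators > technical,
        "T9_share_at_least_5pct": 100.0 * coordinators / very_active >= 5.0,
    }


# Hypothetical community: 25 coordinators, 15 technical editors, 268 very active editors.
print(check_coordinator_targets(25, 15, 268))
```

Accumulating roles over the months of a year mirrors the yearly aggregation described in the figure captions.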

4.2.5. Administrators

In Figure 9, we can see data visualizations implementing the indicators related to administrators. On the larger subgraph on the left, we see the admins’ flags granted by year, and the colors represent the generation they belong to. On the right, two smaller subgraphs show the current picture as of August 2021: the number of admins by generation (middle subgraph), and the number of active administrators, with their proportion with respect to the overall number of active editors (right subgraph). This percentage is indicative in some way of the “load” each admin is carrying, given that their task is to patrol the production and act when necessary. The lower the percentage, the higher the load they take.

In the English Wikipedia, most administrators were nominated before 2011, and indeed more than half of the current administrators (as of August 2021) belong to the very first generation of editors, and if we consider also the second generation, we find that over 90% of the administrators registered before 2011. This points out a concentration of power [20,22] in the hands of older generations of editors, that are quite closed towards newer generations.

In the other language editions, we see different patterns, with a larger share of administrator flags being assigned in more recent years. However, if we look at consolidated communities like German, Italian, and Polish, we see that three out of four administrators are from the generations of editors registered before 2011, while in Catalan the percentage is even higher, with only one administrator from the 2011–2015 generation.

With respect to RQ5 [Administrators], we observe in general a strong predominance of older generations for this fundamental role, even more marked than for special functions; in some cases, we also observe that administrator rights were mostly assigned in past years, and only to very few, or to no users, in recent years.

[T10] New admins by year: we believe that to guarantee openness to positions of responsibility and privilege, there should be new admins every year (e.g., at least 5% of the total number of admins). This may imply setting an expiry date for the role or a voluntary request to lower-activity admins to renounce the role.

[T11] Admins’ generation balance: the overall group of admins should be balanced in terms of the different generations in which they started editing; we suggest that the last generation occupies at least 10% at the end of a lustrum.

[T12] Admins over active editors: The proportion of admins among active editors should be from 1% to 5% to guarantee that admins do not carry an excessive workload, since, in the end, they revise other editors’ edits.

[T13] Number of admins: Any community should have a minimum of three admins, regardless of its size in active editors.
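The four administrator targets can be combined into a single check, sketched here under assumptions: the function and key names are hypothetical, [T12] is computed as active admins over active editors (as in the right-hand panel of Figure 9), and the 5% in [T10] is read as a minimum share of the total number of admins.

```python
def check_admin_targets(total_admins, new_admins_this_year, last_generation_admins,
                        active_admins, active_editors):
    """Compare admin counts with the thresholds suggested in [T10]-[T13]."""
    new_share = 100.0 * new_admins_this_year / total_admins if total_admins else 0.0
    last_gen_share = 100.0 * last_generation_admins / total_admins if total_admins else 0.0
    admin_ratio = 100.0 * active_admins / active_editors if active_editors else 0.0
    return {
        "T10_new_admins_at_least_5pct": new_share >= 5.0,
        "T11_last_generation_at_least_10pct": last_gen_share >= 10.0,
        "T12_admins_between_1_and_5pct": 1.0 <= admin_ratio <= 5.0,
        "T13_at_least_3_admins": total_admins >= 3,
    }


# Hypothetical community: 40 admins (1 new this year, 5 from the last generation),
# 30 of them active, among 1500 active editors.
print(check_admin_targets(40, 1, 5, 30, 1500))
```

In this made-up example only [T10] fails, since one new admin out of 40 is 2.5% of the total.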

Figure 9. Vital sign: Administrators. Left: number of administrator rights (flags) granted by year, broken down by editor generation (lustrum of the first edit) represented with different colors. Middle: total number of admins by generation on the last month of available data (August 2021). Right: number of active admins, and percentage with respect to the number of active editors (in August 2021).

4.2.6. Global Participation

In this section, we focus on indicators about cross-project participation. We argue that cross-project participation is needed to foster a global community, and it has several added benefits: from content growth, because an article missing in one language edition can be translated from other languages, to community growth, because training or recruiting initiatives and events that work for a community can be transferred to others.

The first indicator describes participation in Meta-wiki (also known in short as Meta), the wiki used within the Wikimedia movement for coordination, documentation, planning and analysis. Whilst the main language is English, Meta pages are also available in other languages. Community members, in this context called Wikimedians, are usually editors of local Wikipedia communities. Thus, we characterize the participation in Meta-wiki of each Wikipedia language community. Even though participating in Meta-wiki may not have a direct impact on a community’s language edition, we argue that it serves an important function in the global Movement, and it is strategic in terms of knowledge sharing and governance.

The second indicator of cross-wiki participation is given by the proportion of editors from each language community whose primary project is another Wikipedia language edition. As said, the primary language of an editor is simply the one in which they have made the most edits. Receiving contributions from editors whose primary project is another Wikipedia language edition is an opportunity to grow content. These contributions can be significant for very small Wikipedia language editions.

Meta-Wiki Active Editors

In Figure 10, we can see the number of editors in the selected Wikipedia language editions in September 2021 that are also active on Meta-wiki; the graph shows the absolute number of editors and the corresponding percentage with respect to the total active editors of the wiki of origin.

Figure 10. Vital sign: Meta-wiki participation. Absolute numbers and percentage of editors of each Wikipedia language that are also active editors on Meta-wiki.

With respect to RQ6 [Global-Meta-wiki], we generally see little participation in the global project spaces (Meta-wiki). The proportion of editors who edit Meta-wiki ranges between 0.5 and 10% of the total size of the active community of editors. For large communities like English, German, and Italian, the percentage tends to be around 1%, while for very small communities like Swahili and Afrikaans, whose number of active editors is not consolidated, the percentage may rise up to 5 or even 10%. This may be explained by the fact that, in these communities, the most engaged editors are more connected to the Movement than the average editor and, given the small community size, their proportion is higher. In general terms, we see that the proportions tend to remain stable over time, which means that contributing to Meta-wiki is usually not an occasional activity, but a stable one.

[T14] Meta-wiki participation: As for the target, we believe the number of editors from a Wikipedia language edition community active in Meta-wiki should be around 1% of the active editors. A much lower value would imply that the language edition community is not participating enough in the global movement. This should be calibrated on the size of the existing language community, for which we suggest a higher percentage, even if it can prove more challenging. Each community should have some users who act as ambassadors on Meta-wiki.

Primary and Non-Primary Editors

Figure 11 reports for each Wikipedia language edition the percentage of primary and non-primary editors of that language. Arabic, English, German, and Polish have very high percentages of primary editors (>90%), whereas for Swahili the percentage is only 54%. This lower percentage could be attributed to the fact that Swahili Wikipedia is a smaller and less established project. Lower percentages of primary editors are sometimes an indicator that an important part of the speakers of that language are editors of another Wikipedia edition, in a language that is also spoken in the area or that they perceive as higher-status (e.g., this could be the case for English, which is spoken in several of the territories where Afrikaans and Swahili are spoken).

Figure 11. Vital sign: Primary and non-primary language editors. Percentage of editors of each Wikipedia language edition that are primary editors for that language (in grey) and that are primary editors in other languages. We do not report the full legend as it would cover a high number of languages. The six largest visible non-primary languages by order and color are: English (blue), German (purple), Spanish (green), French (yellow), Russian (pink), and Chinese (turquoise).

With respect to RQ7 [Global-Local], we see that Wikipedia language communities tend to be mainly composed of their primary editors, with a proportion ranging from 60% to 95%. The percentage of non-primary editors is usually higher when the language community is smaller. The primary language editions of non-primary editors tend to correspond to languages that are geographically or culturally close. There is also the special case of non-primary editors whose primary language is English; this tends to happen in all the other language editions. We assume that some editors prefer editing in the English Wikipedia due to its global scope, to the point of making it their primary language, even though their native language is another one.

[T15] Primary language editors: we believe that the proportion of primary language editors among the active editors in a language community should be at least 55% to guarantee that there are dedicated editors whose main project is that Wikipedia language edition. A percentage higher than 95% might imply that the community is not attracting collaborators from other communities. The rationale for both percentages is that we see both 50% and 100% as extremes, and we suggest some margin around these values.
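The primary-language definition and the [T15] range can be illustrated with a short sketch. The names `primary_language`, `primary_share` and `meets_t15`, the input layout (per-editor edit counts by language edition), and the tie-breaking rule (the first language listed wins) are assumptions for this example.

```python
def primary_language(edits_by_wiki):
    """Language edition in which the editor has made the most edits.

    Ties are resolved in favour of the first language listed (an assumption).
    """
    return max(edits_by_wiki, key=edits_by_wiki.get)


def primary_share(editors, wiki):
    """Percentage of the editors of `wiki` whose primary language is `wiki`."""
    active = [counts for counts in editors.values() if counts.get(wiki, 0) > 0]
    if not active:
        return 0.0
    primary = sum(1 for counts in active if primary_language(counts) == wiki)
    return 100.0 * primary / len(active)


def meets_t15(share_pct):
    """Range suggested in [T15]: at least 55% and no more than 95%."""
    return 55.0 <= share_pct <= 95.0


# Hypothetical editors with edit counts per language edition.
editors = {
    "u1": {"sw": 50, "en": 10},
    "u2": {"sw": 5, "en": 200},   # edits Swahili, but English is primary
    "u3": {"sw": 30},
    "u4": {"en": 40},             # not an editor of the Swahili edition
}
share = primary_share(editors, "sw")
print(round(share, 1), meets_t15(share))
```

Here two of the three editors of the Swahili edition have it as their primary language, so the share is about 66.7% and falls within the suggested range.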

4.3. Validation and Affiliates’ Feedback

We present the results obtained from the focus group sessions we held at five Wikimedia conferences and at two affiliate gatherings. Over the sessions, we placed special emphasis on understanding aspects such as the importance of the metric, what participants would change about the metric, which actor holds responsibility for improving its results, and the potential actions that would lead to such improvements. For privacy reasons, we will not mention who specifically expressed each suggestion or comment.

Regarding retention, many of the participants in the different sessions considered the metric of great importance, not only because it directly relates to the entire community’s potential for growth, but also because of its simple interpretation. The only caveat they found is in the period considered for an editor to be retained (60 days after the first edit).

Several affiliate members said they knew of several scenarios in which the metric could be gamed, and they gave the example of a university course using Wikipedia, where the professor usually enrolls all the students to create content as an exercise for two to three months, even though this does not necessarily mean they have been retained. Even though this period was computed as a cut-off threshold based on the distribution of surviving editors, we decided to listen to the participants and also consider longer periods (3 months, 6 months, and 1 year); the overall decline did not change. In any case, for the sake of clarity, we will include different thresholds in a future website dashboard.

Retention is possibly the most discussed metric prior to the creation of these Vital Signs, even though there are no public results. Generally, session participants considered that the editors in the community have an important role in directly or indirectly improving it, followed by affiliates and the Wikimedia Foundation. They also recognized the importance of the features designed by the WMF Growth Team, and overall, they believed that improvements in the user experience would have an impact. In addition, they recognized that a more organized approach to mentorship would also be a valuable tool to improve newcomer retention, given the complexity of tools, policies, and tasks a new editor has to learn about.

Regarding stability, participants were convinced of its importance, even though they thought it would not be decisive enough to change the focus of the affiliate activities. In fact, guaranteeing the engagement of long-term editors, as well as increasing that of those who have been editing for a few months, is something affiliates already work on through the organization of activities such as contests and Wikiprojects. Not surprisingly, they considered the affiliates the main actors who could take responsibility for ensuring results.

Some affiliates considered it very helpful when it comes to understanding the state of consolidation of a community, given that a high percentage of first-month editors can be a sign of instability but could also be a sign of growth. At the other end of the spectrum, affiliates recognized the importance of a consolidated percentage of long-term engaged editors. In this sense, they speculated about and discussed at length the effects of rewards or recognition in preventing editor drop-off, even though these could also compel editors to retire after a “mission accomplished”.

Regarding balance, there was unanimous recognition of the importance of seeing the community composition in terms of tenure. Unlike retention, where the declining rates were somehow expected, in this case they saw the growing percentages of the latest generation (2016–2020) and renewal as a relief. As recognized by several affiliates, this metric provided a different angle to understand the risk of becoming a closed community with only old-time editors. However, being balanced is a midterm goal. For this reason, given that the implicit goals are equivalent to the previous indicators, the actions for a balanced, productive community were not different from those planned for retention and stability.

Regarding the two special community functions, technical contributors and coordinators, both were regarded as very important. The metric was designed to be consistent with balance, given that they could see the number of very active editors performing certain tasks on a yearly time-frame. Many affiliates expressed their concerns and lack of success in renewing the technical contributors, who seemed to be from much earlier generations than the overall community. They thought that planning “recruiting spaces” such as hackathons and leadership programs, and events in general, could be a way to attract new or short-term involved editors and provide them with the necessary skills to take on technical or coordination responsibilities. On this vital sign, the affiliates totally agreed on their role and responsibility in taking on this task.

Regarding administrators, the sessions’ participants were sometimes divided on their importance. Most of them considered that seeing indicators such as the number of administrators by their generation was a valuable way to understand the renewal, and that the number of administrators could grow in some cases. However, some participants considered that this was a sensitive debate, given the great commitment administrators take on. In the debate, they considered that this is an affair in which the Wikimedia Foundation has no role, and that affiliates need to be cautious and limit themselves to encouraging the debate within the community. Even though participants admitted that the lack of renewal in administrators could be seen as power concentration, they considered that limits on the time served and other ways to encourage renewal were not easy to implement.

Lastly, as for the global participation indicators, participants found it interesting to see the participation in Meta-wiki as well as the proportion of non-primary editors in their languages; however, they did not consider them as relevant as the other vital signs. In the first case, they acknowledged the importance of being present in the global conversations and contributing knowledge to Meta-wiki. However, they considered that the Wikimedia Foundation is the actor who needs to take a more active role in dynamizing these exchanges rather than the editors or the affiliate members. To improve the participation in Meta-wiki, affiliates recognized the importance of communication through the Wikimedia channels. The fact that English is mostly the de facto language of Meta-wiki was seen as a barrier to mitigate. In the second case, the proportion of non-primary editors among the active editors indicated a capacity to attract editors from other languages, but also showed that some very small communities are even smaller than they appear, since a considerable part of their active editors is not primarily committed to them.

Tables 1 and 2 present a summary of all Vital Signs indicators together with the main findings of our assessment of the eight language communities we considered (Afrikaans, Arabic, Catalan, English, German, Italian, Polish, and Swahili) and the values we propose as targets to maintain a healthy community.

Table 1. A summary of RQ1 through RQ4 together with the Vital Signs we have proposed: Retention, Stability, Balance, and Special Functions. The table reports the definition of each indicator, together with a summary of the main findings from the eight communities we considered and the target values we propose to maintain a healthy community.

Research question: How do Wikipedia communities retain new members over time?
Vital Sign: Retention
Indicator: Retention rate: percentage of new editors who edit at least once 60 days after their first edit.
Summary of findings: The retention rate is declining in every community considered, regardless of size or growth pattern.
Targets:
[T1] Retention rate: at least 3% to ensure renewal among editors; desirable to reach 5–7%.

Research question: How stable is the composition of Wikipedia communities over time?
Vital Sign: Stability
Indicator: Stability: number of active editors by the number of months they have been active in a row.
Summary of findings: More established communities are able to engage a larger share of their editors for a long period of time (one year or more), while smaller communities show more volatility.
Targets:
[T2] Fresh editors: 30–40% of editors who were not active in the previous month.
[T3] Long-time editors: 33% of long-time editors (>12 months).

Research question: How balanced are Wikipedia communities in terms of including new members over time and maintaining the old ones?
Vital Sign: Balance
Indicator: Balance: number and percentage of very active editors by year and by generation (lustrum of the first edit).
Summary of findings: General tendency to create an established group of very active editors from older generations of editors, which leaves more or less space to newer generations in different language editions.
Targets:
[T4] Last generation: a growing share of the last generation until occupying between 30–40% at the end of a lustrum.
[T5] First generation: at least 5–15% from the first generation (typically 2001–2005).

Research question: How are Wikipedia communities renewing their technical editors and project specialists?
Vital Sign: Special functions
Indicators:
Technical editors: number of very active editors in technical namespaces (i.e., editors who performed more than 100 edits in one month in namespaces Mediawiki and Templates), broken down by year and by generation.
Coordinators: number of very active editors in coordination namespaces (i.e., editors who performed more than 100 edits in one month in namespaces Wikipedia and Help), broken down by year and by generation.
Summary of findings:
Technical editors: General prominence of older generations, particularly the second generation (editors registered in 2006–2010). A community should not over-rely on older editors, and should attract technical editors from the younger generations.
Coordinators: As for very active technical editors, remarkable dominance of older generations for this kind of specialized profile.
Targets:
[T6] Very active technical editors: at least 20 users as a minimum number of very active technical editors.
[T7] Very active technical editors from the last generation: at least 30% very active technical editors from the last generation.
[T8] Very active coordinators: the group of very active coordinators should always be larger than the number of very active technical editors.
[T9] Very active coordinators proportion: there should be a minimum of 5–15% of the very active editors to guarantee there are common initiatives that involve everyone.
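The retention rate defined in Table 1 can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name `retention_rate`, the input format (pairs of editor name and timestamp), and the reading of "retained" as having at least one edit made 60 days or more after the first edit are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def retention_rate(edits, window_days=60):
    """Percentage of editors with an edit at least `window_days` after their first.

    edits: iterable of (editor_name, timestamp) pairs, in any order.
    This interpretation of the indicator is an assumption (see lead-in).
    """
    first_edit = {}
    last_edit = {}
    for editor, ts in edits:
        if editor not in first_edit or ts < first_edit[editor]:
            first_edit[editor] = ts
        if editor not in last_edit or ts > last_edit[editor]:
            last_edit[editor] = ts
    if not first_edit:
        return 0.0
    window = timedelta(days=window_days)
    # An editor counts as retained if their latest edit falls at or beyond the window.
    retained = sum(
        1 for editor in first_edit
        if last_edit[editor] - first_edit[editor] >= window
    )
    return 100.0 * retained / len(first_edit)


# Hypothetical sample data, invented for this example.
sample_edits = [
    ("alice", datetime(2021, 1, 1)),
    ("alice", datetime(2021, 3, 15)),  # 73 days after first edit: retained
    ("bob", datetime(2021, 1, 10)),
    ("bob", datetime(2021, 1, 20)),    # only 10 days later: not retained
    ("carol", datetime(2021, 2, 1)),   # single edit: not retained
    ("dave", datetime(2021, 2, 5)),
    ("dave", datetime(2021, 4, 6)),    # exactly 60 days later: retained
]
rate = retention_rate(sample_edits)
print(f"Retention rate: {rate:.1f}%")
```

In practice the rate would be computed per monthly cohort of newly registered editors, so that it can be compared against target [T1] over time.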
Table 2. A summary of RQ5 through RQ7 together with the Vital Signs we have proposed: Admins and Global Participation. The table reports the definition of each indicator, together with a summary of the main finding from the eight communities we considered and the target values we propose to maintain a healthy community.
Research question: How are the Wikipedia communities granting admin user rights to new members?
Vital Sign: Admins
Indicators:
Admins by year: number of admins by year of flag granted and by generation.
Admins by lustrum: total number of active admins by generation at the current month.
Summary of findings: General strong predominance of older generations for this fundamental role, even more marked than for special functions. In some cases, administrator rights were mostly assigned in past years, and only to very few, or to no users, in recent years.
Targets:
[T10] New admins by year: there should be at least 5% of new admins every year.
[T11] Admins' generation balance: the last generation occupies at least 10% at the end of a lustrum.
[T12] Admins over active editors: the proportion of admins among active editors should be from 1% to 5% to avoid excessive workload.
[T13] Number of admins: any community should have a minimum of 3 admins, regardless of its size in active editors.

Research question: How are Wikipedia language communities participating in global projects spaces (Meta-wiki)?
Vital Sign: Global Participation
Indicator: Meta-wiki participation: ratio between the number of active editors in Meta-wiki whose primary language edition is a given Wikipedia and the number of active editors in that Wikipedia language edition during the same month.
Summary of findings: The proportion of editors who edit Meta-wiki ranges between 0.5% and 10% of the active editors from each community. For large communities it tends to be around 1%.
Targets:
[T14] Meta-wiki participation: the proportion of editors from a Wikipedia language edition community active in Meta-wiki should be around 1% of the active editors.

Research question: What is the composition of Wikipedia language communities in terms of multilingual editors?
Vital Sign: Global Participation
Indicator: Primary language: distribution of the primary language edition of the editors contributing to a given language edition.
Summary of findings: The percentage of primary editors ranges between 60% and 95%. The percentage of non-primary editors is usually higher when the language community is smaller and non-primary editors correspond to languages that are geographically or culturally close.
Targets:
[T15] Primary language editors: the proportion of primary language editors should be at least 55%, and not higher than 95%.

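To show how the numeric targets in Table 2 could be checked for a single month, the following Python sketch computes the three proportions behind [T12], [T14] and [T15] and tests the thresholds stated in the table. The function name, argument names, and the example figures are hypothetical; [T14] is reported as a value only, because the table describes it as "around 1%" without fixed bounds.

```python
def admin_and_global_checks(active_editors, admins, meta_active, primary_editors):
    """Compare one month's counts with targets T12, T13 and T15 from Table 2.

    All argument names are illustrative assumptions:
      active_editors  - active editors in the language edition this month
      admins          - active admins in the language edition
      meta_active     - editors active on Meta-wiki whose primary edition is this one
      primary_editors - active editors whose primary edition is this one
    """
    if active_editors <= 0:
        raise ValueError("active_editors must be positive")
    admin_share = 100.0 * admins / active_editors
    meta_share = 100.0 * meta_active / active_editors
    primary_share = 100.0 * primary_editors / active_editors
    return {
        "T12_admin_share_pct": admin_share,
        "T12_ok": 1.0 <= admin_share <= 5.0,      # admins are 1% to 5% of active editors
        "T13_ok": admins >= 3,                    # at least 3 admins in any community
        "T14_meta_share_pct": meta_share,         # target is "around 1%", so no hard check
        "T15_primary_share_pct": primary_share,
        "T15_ok": 55.0 <= primary_share <= 95.0,  # primary editors between 55% and 95%
    }


# Invented figures for a mid-sized community.
report = admin_and_global_checks(
    active_editors=400, admins=12, meta_active=4, primary_editors=300
)
for name, value in report.items():
    print(name, value)
```

The thresholds are kept as literals here for readability; a community that revises its own targets would pass them in as parameters instead.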
5. Discussion

5.1. Main Results across Communities

The first objective [O1] of this paper was to assess the growth, stagnation, and decline patterns in Wikipedia language communities. Contrary to what was generally assumed from previous research, which was mainly focused on English and other major language editions [5,14,16], we found that not all Wikipedia language editions are in stagnation or decline. We have seen notable exceptions: communities that are still growing, such as Arabic or Chinese among the large language editions, and Hebrew, Greek, and Turkish, among the middle ones. Taking into account the 50 largest communities, results show that only half of them stagnate or decline; major language communities mostly tend to stagnation or decline, while middle and smaller ones are mostly still in the process of growing larger.

We can hypothesize a variety of factors influencing these different patterns. Some are external, e.g., Internet access, geopolitical context, number and demographic composition of a language's speakers, and language status (official language or not); others are internal, e.g., calcification of policies [7], community dynamics [22,44], conflict [45,46], community identification [47], and platform usability [26], among others. While in this study we focused on investigating the state of the active community, developing simple indicators for capturing these internal and external factors would be an interesting approach to explain how they may specifically affect the growth, stagnation and decline patterns of different language communities.

The second objective [O2] was to create a set of indicators to capture different aspects of growth and renewal within communities. We introduced the Vital Signs as a framework to measure and monitor the community composition, both in terms of the entire group of editors and of those dedicated to special functions (technical aspects and coordination, beyond adminship). Thus, indicators such as retention let us study how successfully new editors settle in the community, while others, like stability or balance, tell us about the capacity of the community to engage its members over time. These indicators were designed while taking into account usability and ease of interpretation. By being consistent in the definition and selection of certain variables (e.g., time-frames, very active editor, etc.), we tried to ease their interpretation.

For example, choosing “very active editors” for the study of community balance in terms of different generations was a practical choice aimed at understanding the most productive part of the community (in fact, responsible for about 85% of the registered editors’ edits). While this is necessarily a limited view of the community, at the same time it gives the idea of the state of renewal among the most productive group. In an analogous way, studying the time in which flags were granted to administrators is also an exercise of simplicity; even though there exist other user flags, administrators constitute the largest group of editors with considerable rights and responsibilities [37].
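As an illustration of the balance indicator discussed above, the following Python sketch groups very active editors by the lustrum of their first edit and reports each generation's share. It is a sketch under stated assumptions, not the study's code: the helper names, the input dictionaries, and the threshold of more than 100 edits in a month for a "very active" editor (borrowed from the wording used for special functions in Table 1) are all illustrative.

```python
from collections import Counter


def lustrum_label(first_edit_year, start_year=2001):
    """Map a first-edit year to its five-year generation, e.g. 2008 -> '2006-2010'."""
    offset = (first_edit_year - start_year) // 5
    begin = start_year + 5 * offset
    return f"{begin}-{begin + 4}"


def balance_by_generation(monthly_edits, first_edit_year, threshold=100):
    """Percentage of very active editors per generation for one month.

    monthly_edits:   dict editor -> number of edits in the month
    first_edit_year: dict editor -> year of first edit
    threshold:       assumed cut-off; editors with MORE than this many edits count
    """
    counts = Counter(
        lustrum_label(first_edit_year[editor])
        for editor, n_edits in monthly_edits.items()
        if n_edits > threshold
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {generation: 100.0 * n / total for generation, n in counts.items()}


# Invented sample: four editors exceed the threshold, two do not.
monthly_edits = {"a": 150, "b": 120, "c": 300, "d": 101, "e": 50, "f": 100}
first_edit_year = {"a": 2003, "b": 2008, "c": 2017, "d": 2019, "e": 2018, "f": 2016}
shares = balance_by_generation(monthly_edits, first_edit_year)
print(shares)
```

Shares computed this way can be read directly against targets [T4] and [T5], which are expressed as percentages of the very active group.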

Vital Signs depict a similar situation across communities following the same patterns (between decline and stagnation: Catalan, English, German, Italian, and Polish; growing: Arabic, Swahili and Afrikaans). Among the former group, we have seen that communities show signs of balance in terms of the different generations, thus community renewal is occurring. At the same time, on a monthly basis, 15–30% of the active editors have been editing for more than 24 months in a row. In contrast, in the communities from the latter group, the group of very active editors is composed mainly (over 60%) of members of the generation who started editing between 2016 and 2020. This is consistent with the stability indicator, which shows that, for 40–75% of the monthly active editors, this was their first month of editing.
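The stability figures quoted above rest on counting how many months in a row each editor has been active. A minimal Python sketch of that computation follows; the month encoding, the function names, and the bucket labels are assumptions introduced for this example and do not come from the study's code.

```python
def month_index(year, month):
    """Encode a calendar month as a single integer so consecutive months differ by 1."""
    return year * 12 + (month - 1)


def current_streak(active_months, current):
    """Number of consecutive active months ending at `current` (0 if inactive now)."""
    streak = 0
    while (current - streak) in active_months:
        streak += 1
    return streak


def stability_shares(activity, current):
    """Percentage of this month's active editors in a few streak buckets.

    activity: dict editor -> set of month indices in which the editor was active.
    """
    streaks = [current_streak(months, current) for months in activity.values()]
    streaks = [s for s in streaks if s > 0]  # keep only editors active this month
    if not streaks:
        return {}
    total = len(streaks)
    return {
        "fresh": 100.0 * sum(s == 1 for s in streaks) / total,    # not active last month
        "over_12": 100.0 * sum(s > 12 for s in streaks) / total,  # long-time editors
        "over_24": 100.0 * sum(s > 24 for s in streaks) / total,
    }


# Invented activity histories.
now = month_index(2021, 12)
activity = {
    "a": {now},                          # first active month: streak 1
    "b": set(range(now - 12, now + 1)),  # 13 months in a row
    "c": set(range(now - 29, now + 1)),  # 30 months in a row
    "d": {now - 3, now},                 # returned after a gap: streak 1
    "e": {now - 1},                      # inactive this month: excluded
}
stability = stability_shares(activity, now)
print(stability)
```

Note that an editor returning after a gap (editor "d") counts as fresh in this sketch, which matches the wording of target [T2] (not active in the previous month) rather than a strict "first month ever" reading.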

The contrast between the two types of communities is clear, especially with respect to these two indicators, balance and stability. The three communities from our sample that are growing have difficulty engaging their editors in the long term. At the same time, their retention rate is also declining, which means that they may not be entirely unaffected by the factors that matter for helping editors survive their first weeks. Moreover, we see that for all the communities analyzed the proportion of newer generations in specialized functions (technical editors and coordinators) is smaller than in the overall group of very active editors. This means that these core functions of the community present a higher risk in terms of lack of renewal.

Likewise, admins, who have a strong voice and an implicit influence on the community discussions, showed signs of lack of renewal. Even though some languages continue granting flags, they tend to give them to editors who started editing in the first ten years from the creation of Wikipedia, and the overall distribution of generations is very skewed towards old-time editors. We believe that this delivers a message of closure to any potential new candidate. In the open debate on power concentration in the Wikipedia community [20,22], we believe that ensuring pathways are open to new editors to become administrators might make the project less oligarchic and help tear down the barriers that block newcomers.

The third and last objective [O3] of this paper was to validate the indicators with results from a sample of language communities and explore their potential role in affiliate planning. We approached the communities in three conferences and two dedicated sessions in which we asked them the same questions. While there were some discrepancies on the value of indicators for administrators and global community, the rest of them received full support. In particular, Wikimedia Poland (with whom we actually started the Vital Signs, after a conversation on their needs for monitoring the state of the community) received them enthusiastically and have already started thinking about how to develop some community actions to improve on retention and special functions, based on our results. We employed a focus group approach because at the moment there is no prototype of a dashboard in which to allow a constant interaction with the Vital Signs. The sessions were useful in order to collect new ideas and to perceive the general attitude towards the metrics.

We validated the specific configurations in terms of the key variables for the metrics (e.g., the definition of "very active editors" and time-frames) and graph types. Some requests for more granularity on some metrics were collected for the dashboard implementation. These indicators have been computed locally on a server for the purpose of this study, but we foresee their implementation in a website architecture in the form of an interactive dashboard: https://vitalsigns.wmcloud.org/ [accessed 22 February 2022]. This will allow us to make our results available in real time for all the existing communities, and to collect their feedback in a more structured way.

5.2. Setting Growth-Based Goals (Affiliates and WMF)

Once Vital Signs indicators have been validated by the affiliates, we should discuss the possibility of using them to set growth-based goals and use them as a baseline to progress towards specific targets. In the focus group sessions with relevant members from the communities, we found out that they considered that improving on these indicators was mostly a task for affiliates, especially in the current moment, when the Wikimedia Strategy 2030 (Meta contributors, ’Strategy/Wikimedia movement/2018-20/Recommendations’, Meta, discussion about Wikimedia projects, 16 August 2021, 09:29 UTC, https://meta.wikimedia.org/w/index.php?title=Strategy/Wikimedia_movement/2018-20/Recommendations&oldid=21885801 [accessed 19 February 2022]) discourse has set community growth and inclusiveness as a priority, and the Wikimedia Foundation is developing many interface improvements in the Growth team.

Affiliates can play a key role in leading the discussions on the changes needed to drive the growth of the communities, mainly for two reasons: they are organized structures with strong communication capacities, and they are formed by community members who have influence in the on-wiki decision-making spaces. Not only can they send the message to their members through their communication channels (e.g., mailing lists), but they can also coordinate with other affiliates to do the same. Affiliates are organized and set plans with objectives; they have the advantage of being able to focus some resources, but can also count on the energy of their members who volunteer in their activities.

In their annual plans, affiliates usually set indicators to measure the amount of content (e.g., number of articles, their size, number of images), rather than the number of newcomers retained or number of active editors. To start, this may require a cultural shift: focusing on people and not only on content. Additionally, there is a problem of accountability, e.g., improving on retention requires changes in the usability of the technology as well as in the community's behaviour towards newcomers. Since the former depends on the developments proposed by the Wikimedia Foundation and the latter on the actual editors, this could dissuade affiliates from taking on such a commitment. However, rather than leaving each stakeholder to act alone, affiliates are in the best position to coordinate these efforts and create joint strategies.

In some cases, this may appear an even more daunting task for affiliates, especially for languages spread across multiple countries (e.g., the German Wikipedia is supported by Wikimedia affiliates located in Germany, Austria and Switzerland), which would require stronger efforts on planning and impact assessment. The actions could range from creating mentorship programs aimed at accompanying newcomers through their first edits to giving rewards to experienced editors or starting a general update of the policies to improve their readability.

However, to date, we believe the most important barrier is that of not being able to understand how to measure success and set reasonable targets. This study makes available some fine-grained indicators on active editors, and enables affiliates to set growth-based goals and KPI targets. We have suggested some specific targets in order to facilitate the task for affiliates. Even though we acknowledge that growth should be coordinated across all stakeholders and supported for every community, we must recognize that each community has its own idiosyncrasy, and therefore affiliates should revise the targets based on their specific situation, capacities and aspirations.

5.3. Limitations of This Study and Future Steps

We see several rich directions for future research. First, although we have demonstrated that decline and stagnation are not generalized among Wikipedia language communities, we have seen that a majority of the large ones follow this trend. Previous literature already showed that peer production projects often reach a peak and then decline in number of members due to a process of calcification of their inner structures [5,7]. Languages like Arabic or Chinese present the characteristics of large language editions, but show patterns of thriving communities, and at some point they might reach a peak. We argue that it is necessary to further analyze growth, stagnation and decline patterns along with internal factors such as community dynamics, and external factors such as the social, political, economic, geographical and demographic context of each language.

With the Vital Signs, we presented a set of simple metrics aimed at easy use and dissemination among community members in order to encourage the deployment of strategies to grow. Future studies could go deeper into the analysis of the relationship between the active community size, the registered editors, the retained editors, and the editors who abandoned the project in order to properly explain growth. We have seen that the number of newly registered editors tends to be stable on a monthly basis. Therefore, we can consider that unless there are spikes of editors leaving the community, increasing the retention rate will lead to community growth. We believe it is important to keep in mind the distinction between retention and growth, and at the same time to encourage further study of the causes of retention and to constantly evaluate strategies that might increase its rate.

Other dimensions of the editing experience that should also be analyzed in relation to a lack of retention are discussion patterns, toxic language, mentorship, among others. We believe that having an indicator for each of them could also be valuable in order to complement the Vital Signs. However, we must also acknowledge that we purposely made an effort in reducing the number of indicators to six Vital Signs, so that they can be easily managed and remembered. While there could be more Vital Signs, each of them could also include more in-depth indicators. Additional Vital Signs could include a dropoff rate, to measure the number of consolidated editors who stopped editing in a given year [48]. The number of active administrators could be complemented by the number of administrative actions. In addition, it would also be interesting to show whether the editors that are participating across multiple projects are also those who are very active, take special functions or have an admin flag. In this sense, the concentration of functions and ultimately skills has not been measured.

Finally, we must acknowledge the limitations of the approach followed to collect feedback from stakeholders. While the validation of the Vital Signs would have benefited from a more extensive survey, for example using quantitative methods, we decided to present at multiple Wikimedia conferences and gather qualitative insights, as we believed this fitted the context. In Wikimedia, every development needs to happen according to the values of transparency and inclusivity, and thus it requires multiple presentations so that the heterogeneous set of communities can learn about it and give their opinion. The feedback we collected was rich, and enabled us to improve the metrics and their visualizations.

Interactive Dashboards Website

The positive and in several cases enthusiastic feedback received also convinced us to start the design of an interactive dashboard and convert the metrics into a tool. As this is still work in progress, we show in Figure 12 two screenshots of the implementation of this tool for the indicators retention and balance, obtained using Plotly (Plotly, https://plotly.com [accessed 19 February 2022]) for data visualization. In addition to the website, we are also working on the creation of a bot to update data tables on Meta-wiki, one for each Vital Sign. These tables will allow editors to follow specific indicators and use them to sort and compare Wikipedia language editions, in the same way they do, for example, in the List of Wikipedias by number of articles (Meta contributors, ’List of Wikipedias’, Meta, discussion about Wikimedia projects, 1 January 2022, 12:46 UTC, https://meta.wikimedia.org/w/index.php?title=List_of_Wikipedias&oldid=22522344 [accessed 19 February 2022]). Making the Vital Signs present in Meta-wiki closes the loop and ensures that Wikipedians can find the indicators at the places where they usually search for Movement-related information. The graphs and tables available on these two spaces will be updated on a monthly basis as new data become available.

Figure 12. Prototype dashboards for the Vital Signs website. Top: Retention in the English Wikipedia, including the number of registered editors by month (grey bars) and the retention rate (orange line). Bottom: Balance in the English Wikipedia, showing the number of very active editors by lustrum of their first edit (stacked bars).

6. Conclusions

We found that even when Wikipedia language communities are in stagnation or a slight decline, they renew a significant part of their most productive editors. This implies that communities are more sustainable than expected, and robust to a situation of sudden departure of tenured members. At the same time, the results show that, among the largest 50 communities in number of monthly active editors, only half of them exhibit a pattern of decline or stagnation. This is also a significant finding, given that it was widely assumed that communities were in decline for not being able to maintain their number of active editors [16], possibly because of a focus on the English Wikipedia and other major language editions and a lack of analysis of smaller or younger communities.

Some authors found that when peer projects mature, their policy structure and tools are usually optimized for content quality control, and this tends to imply the rejection of newcomers’ contributions [5,7,16]. However, other causes such as reverts [29] as well as the usability of the site [26] are also relevant to editor retention. While these factors are tackled by some projects supported by the Wikimedia Foundation, there is no specific coordination between the different Wikimedia Movement organizations (i.e., WMF and affiliates) to ensure that these actions reverse the declining retention rate. At the same time, at community level, there are no processes set to revise the help pages and simplify their language, encourage interface improvement, or make any other change that would demonstrably help newcomers survive the first days.

We believe that coordination between the organized and non-organized structures of the Wikimedia movement, i.e., the Wikimedia Foundation and Wikimedia affiliates on the one hand, and communities on the other, is essential to growing larger and more diverse communities. However, we acknowledge its difficulty because of the myriad of factors to be addressed, as well as the lack of accountability for this purpose. This work provides a set of indicators to raise awareness and set a common baseline. We have also provided several target values to ease the indicators’ interpretation and set measurable goals on a mid-term and long-term basis. We encourage monitoring these indicators and those factors that may cause their improvement. The six Vital Signs should be seen as the final outcome in terms of community renewal and growth. Still, the quality of the atmosphere in terms of non-toxic language, editor satisfaction with the tools, among many other indicators, can be complementary to reach the same goals.

The indicators we have introduced are grounded on academic literature and community practices and based on the input and feedback received from Wikimedia communities. While in this paper we have only shown results for a sample of eight communities, used to validate the indicators with affiliates, we are working to create a dashboard that provides regular updates for all the more than three hundred existing Wikipedia language editions. We believe this will represent a significant improvement in the possibilities for community self-awareness and monitoring, which until now could only rely on basic statistics on the overall number of active users; with respect to these simple metrics, the Vital Signs allow for a much richer inspection of community composition and evolution.

Affiliates recognize the role they have in pushing the necessary changes for renewal/growth, as they are constituted by community members. Having detailed community-oriented indicators can encourage them to set growth targets and complement the content-creation existing ones that are typical in the affiliate annual plans.

Renewal may be considered a positive aspect conducive to a sustainable community. We believe that communities can grow larger given the number of new monthly registered editors. While some communities are in the process of renewing, others are still struggling to achieve initial growth: more than 250 Wikipedia language editions have fewer than 100 active editors per month, and some language projects have not yet been consolidated. We believe that by addressing the factors that affect the retention rate, it may be possible to head towards the path of community renewal and growth.

Setting the goal of renewing and growing communities and tailoring different actions among Wikimedia actors, while tracking the indicators, can make Wikipedia a self-regulating community, larger and more vibrant than it is now. The “wisdom of the crowds” that allowed Wikipedia to be so effective in creating free content, to the surprise of everyone, may also show how Wikipedia can change its structures to stay inclusive to newcomers, and lead the way for other online communities to do so.

Author Contributions: Conceptualization, M.M.-R.; Data curation, M.M.-R.; Formal analysis, M.M.-R.; Funding acquisition, M.M.-R., C.C. and D.L.; Investigation, M.M.-R., C.C. and D.L.; Methodology, M.M.-R.; Project administration, M.M.-R., C.C. and D.L.; Resources, M.M.-R. and C.C.; Software, M.M.-R.; Supervision, D.L.; Validation, M.M.-R., C.C. and D.L.; Writing (original draft), M.M.-R.; Writing (review & editing), M.M.-R., C.C. and D.L. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the Wikimedia Foundation grant “Community Health Metrics: Understanding Editor Drop-off.” (https://meta.wikimedia.org/wiki/Grants:Project/Eurecat/Community_Health_Metrics:_Understanding_Editor_Drop-off) [accessed 22 February 2022].

Data Availability Statement: All code is available through GitHub at: https://github.com/WikiCommunityHealth/ (accessed on 22 February 2022). Data are publicly available at: https://dumps.wikimedia.org/ (accessed on 22 February 2022).

Acknowledgments: We want to thank Paolo Aliprandi for his precious work on the development of the dashboards, that we hope will help to take this work one step forward and make of it a useful tool available online to all the Wikipedia communities. We also acknowledge Wikimedia Poland, Wikimedia Italy and Amical Wikimedia for the rich conversations that guided us to shape this work, and the organizers of Wikiindaba, Wikimedia CEE and WikiArabia for giving us the opportunity to present our work and receive feedback from the community members. In particular, thank you Natalia Szafran-Kozakowska, Wojciech Pedzich, Claudi Balaguer, Mohammed Bachounda, Kiril Simeonovski, Marta Arosio, and Anisa Kuci.

Conflicts of Interest: The authors declare no conflict of interest.

References

  1. Preece, J.; Maloney-Krichmar, D.; Abras, C. History of online communities. Encycl. Community 2003, 3, 86.
  2. Heckathorn, D.D. Collective action and group heterogeneity: Voluntary provision versus selective incentives. Am. Sociol. Rev. 1993, 58, 329–350. [CrossRef]
  3. Marwell, G.; Oliver, P. The Critical Mass in Collective Action; Cambridge University Press: Cambridge, UK, 1993.
  4. Prasarnphanich, P.; Wagner, C. Explaining the sustainability of digital ecosystems based on the wiki model through critical-mass theory. IEEE Trans. Ind. Electron. 2009, 58, 2065–2072. [CrossRef]
  5. Halfaker, A.; Geiger, R.S.; Morgan, J.T.; Riedl, J. The rise and decline of an open collaboration system: How Wikipedia’s reaction to popularity is causing its decline. Am. Behav. Sci. 2013, 57, 664–688. [CrossRef]
  6. Jemielniak, D. Common Knowledge? An Ethnography of Wikipedia; Stanford University Press: Stanford, CA, USA, 2014. [CrossRef]
  7. TeBlunthuis, N.; Shaw, A.; Hill, B.M. Revisiting “The rise and decline” in a population of peer production projects. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–7.
  8. McMahon, C.; Johnson, I.; Hecht, B. The substantial interdependence of Wikipedia and Google: A case study on the relationship between peer production communities and information technologies. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media, Montreal, QC, Canada, 15–18 May 2017.
  9. Miquel Ribé, M.; Vaidla, K.; Fort, F.; Torres, A. Estratègia del moviment Wikimedia 2030: Com un procés d’estratègia oberta inclusiva ha situat la gent al centre. BiD 2021, 2, 47.
  10. Flöck, F.; Laniado, D.; Stadthaus, F.; Acosta, M. Towards Better Visual Tools for Exploring Wikipedia Article Development—The Use Case of “Gamergate Controversy”. In Proceedings of the Ninth International AAAI Conference on Web and Social Media, Oxford, UK, 26–29 May 2015.
  11. Pentzold, C.; Weltevrede, E.; Mauri, M.; Laniado, D.; Kaltenbrunner, A.; Borra, E. Digging Wikipedia: The online encyclopedia as a digital cultural heritage gateway and site. J. Comput. Cult. Herit. 2017, 10, 1–19. [CrossRef]
  12. Bao, P.; Hecht, B.; Carton, S.; Quaderi, M.; Horn, M.; Gergle, D. Omnipedia: bridging the wikipedia language gap. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Austin, TX, USA, 5–10 May 2012; pp. 1075–1084.
  13. Miquel-Ribé, M.; Laniado, D. The Wikipedia Diversity Observatory: helping communities to bridge content gaps through interactive interfaces. J. Int. Serv. Appl. 2021, 12, 1–25. [CrossRef]
  14. Suh, B.; Convertino, G.; Chi, E.H.; Pirolli, P. The singularity is not near: slowing growth of Wikipedia. In Proceedings of the 5th International Symposium on Wikis and Open Collaboration, Orlando, FL, USA, 25–27 October 2009; pp. 1–10.
  15. Ortega Soto, J.F. Wikipedia: A Quantitative Analysis. Doctoral dissertation, Universidad Rey Juan Carlos, Madrid, 2009. Available online: https://burjcdigital.urjc.es/bitstream/handle/10115/11239/thesis-jfelipe.pdf (accessed on 4 April 2022).
  16. Hill, B.M.; Shaw, A. Wikipedia and the End of Open Collaboration. Wikipedia 2020, 3, 20.
  17. Asatani, K.; Toriumi, F.; Ohashi, H. Rise and decline process of online communities: modeling social balance of participants. In Proceedings of the Social Simulation Conference, Barcelona, Spain, 1–5 September 2014.
  18. Gorbatai, A.D. The Paradox of Novice Contributions to Collective Production: Evidence from Wikipedia. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1949327 (accessed on 10 February 2014).
  19. Butler, B.; Joyce, E.; Pike, J. Don’t look now, but we’ve created a bureaucracy: The nature and roles of policies and rules in Wikipedia. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, New York, NY, USA, 5–10 April 2008; pp. 1101–1110.
  20. Shaw, A.; Hill, B.M. Laboratories of oligarchy? How the iron law extends to peer production. J. Commun. 2014, 64, 215–238. [CrossRef]
  21. Konieczny, P. Governance, organization, and democracy on the Internet: The iron law and the evolution of Wikipedia. In Sociological Forum; Wiley Online Library: Hoboken, NJ, USA, 2009; Volume 24, pp. 162–192.
  22. Rijshouwer, E.; Uitermark, J.; de Koster, W. Wikipedia: A self-organizing bureaucracy. Inf. Commun. Soc. 2021, 33, 1–18. [CrossRef]
  23. Heaberlin, B.; DeDeo, S. The evolution of Wikipedia’s norm network. Future Internet 2016, 8, 14. [CrossRef]
  24. Raitman, R.; Augar, N.; Zhou, W. Employing wikis for online collaboration in the e-learning environment: Case study. In Proceedings of the Third International Conference on Information Technology and Applications (ICITA’05), Sydney, Australia, 4–7 July 2005; Volume 2, pp. 142–146.
  25. Ebner, M.; Kickmeier-Rust, M.; Holzinger, A. Utilizing Wiki-Systems in higher education classes: a chance for universal access? Univers. Access Inf. Soc. 2008, 7, 199–207. [CrossRef]
  26. Gluza, W.; Turaj, I.; Meier, F. Wikipedia Edit-a-thons and Editor Experience: Lessons from a Participatory Observation. In Proceedings of the 17th International Symposium on Open Collaboration, Online, 15–17 September 2021; pp. 1–9.
  27. Hargittai, E.; Shaw, A. Mind the skills gap: the role of Internet know-how and gender in differentiated contributions to Wikipedia. Inf. Commun. Soc. 2015, 18, 424–442. [CrossRef]
  28. Cowan, B.R. Causal Effects of Wiki Site Design on Anxiety and Usability. Doctoral dissertation, University of Edinburgh, Edinburgh, Scotland, 2011. Available online: https://era.ed.ac.uk/handle/1842/9703 (accessed on 4 April 2022).
  29. Halfaker, A.; Kittur, A.; Riedl, J. Don’t bite the newbies: how reverts affect the quantity and quality of Wikipedia work. In Proceedings of the 7th International Symposium on Wikis and Open Collaboration, Mountain View, CA, USA, 3–5 October 2011; pp. 163–172.
  30. Morgan, J.T.; Bouterse, S.; Walls, H.; Stierch, S. Tea and sympathy: Crafting positive new user experiences on Wikipedia. In Proceedings of the 2013 Conference on Computer Supported Cooperative Work, San Antonio, CA, USA, 23–27 February 2013; pp. 839–848.
  31. Morgan, J.T.; Halfaker, A. Evaluating the Impact of the Wikipedia Teahouse on Newcomer Retention. In Proceedings of the 14th International Symposium on Open Collaboration, Paris, France, 19 April 2018; pp. 1–7.
  32. Ducheneaut, N. Socialization in an open source software community: A socio-technical analysis. Comput. Support. Coop. Work 2005, 14, 323–368. [CrossRef]
  33. Lehmann, J.; Lalmas, M.; Yom-Tov, E.; Dupret, G. Models of user engagement. In Proceedings of the International Conference on User Modeling, Adaptation, and Personalization, Montreal, QC, Canada, 16–20 July 2012; Springer: Berlin, Germany, 2012; pp. 164–175.
  34. Halfaker, A.; Riedl, J. Bots and cyborgs: Wikipedia’s immune system. Computer 2012, 45, 79–82. [CrossRef]
  35. Geiger, R.S. The lives of bots. arXiv 2018, arXiv:1810.09590.
  36. Zheng, L.; Albano, C.M.; Vora, N.M.; Mai, F.; Nickerson, J.V. The roles bots play in Wikipedia. Proc. ACM Hum. Comput. Interact. 2019, 3, 1–20. [CrossRef]
  37. Arazy, O.; Ortega, F.; Nov, O.; Yeo, L.; Balila, A. Functional roles and career paths in Wikipedia. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, Vancouver, BC, Canada, 14–18 March 2015; pp. 1092–1105.
  38. Kreider, C.; Kordzadeh, N. Request for Adminship (RFA) within Wikipedia: How Do User Contributions Instill Community Trust? SAIS 2015 Proc. 2015, 1, 4.
  39. Hale, S.A. Multilinguals and Wikipedia editing. In Proceedings of the 2014 ACM Conference on Web Science, Bloomington, IN, USA, 23–26 June 2014; pp. 99–108.
  40. Petitjean, F.; Ketterlin, A.; Gançarski, P. A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognit. 2011, 44, 678–693. [CrossRef]
  41. Rabiee, F. Focus-group interview and data analysis. Proc. Nutr. Soc. 2004, 63, 655–660. [CrossRef]
  42. Albert, B.; Tullis, T. Measuring the User Experience: Collecting, Analyzing, and Presenting Usability Metrics; Newnes: Oxford, UK, 2013.
  43. Miquel Ribé, M. Identity-Based Motivation in Digital Engagement: The Influence of Community and Cultural Identity on Participation in Wikipedia. Ph.D. Thesis, Universitat Pompeu Fabra, Barcelona, Spain, 2017.
  44. Kittur, A.; Chi, E.; Pendleton, B.A.; Suh, B.; Mytkowicz, T. Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie. World Wide Web 2007, 1, 19.
  45. Collier, B.; Bear, J. Conflict, confidence, or criticism: An empirical examination of the gender gap in Wikipedia. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work, Seattle, WA, USA, 11–15 February 2012.
  46. Iosub, D.; Laniado, D.; Castillo, C.; Fuster Morell, M.; Kaltenbrunner, A. Emotions under discussion: Gender, status and communication in online collaboration. PLoS ONE 2014, 9, e104880. [CrossRef] [PubMed]
  47. Neff, J.J.; Laniado, D.; Kappler, K.E.; Volkovich, Y.; Aragón, P.; Kaltenbrunner, A. Jointly they edit: Examining the impact of community identification on political interaction in Wikipedia. PLoS ONE 2013, 8, e60584. [CrossRef] [PubMed]
  48. Miquel-Ribé, M.; Consonni, C.; Laniado, D. Wikipedia Editor Drop-Off: A Framework to Characterize Editors’ Inactivity. In Proceedings of the Wiki Workshop 2021 at The Web Conference 2021 (Wiki Workshop 2021), Online, 14 April 2021.

This work is released under the Creative Commons Attribution 4.0 International license, which allows free use, distribution, and creation of derivatives, so long as the license is unchanged and clearly noted, and the original author is attributed.
