Assessing the accuracy and quality of Wikipedia entries compared to popular online encyclopaedias/Section 1

1. Introduction

The popularity of online encyclopaedias as a source of information has increased tremendously in the past two decades. However, the quality and accuracy of the information they contain remain a matter of debate, particularly for encyclopaedias that do not charge users for access. Much of the discussion concerns 'free' online encyclopaedias that do not pay contributors and editors but rely instead on voluntary contributions from people who regard themselves as experts, without formal verification of their qualifications or a stringent process of peer review or editing. While this model facilitates the rapid and free transfer of knowledge, critics argue that 'opening the editing process to all regardless of expertise means that reliability can never be ensured'[1].

According to the leading global provider of web metrics, Alexa.com, Wikipedia is the most popular online encyclopaedia and the sixth most popular website in the world[1]. It has more than 19 million articles in 270 languages. All content is freely available and approximately 13-15% of global internet users visit Wikipedia each day. Wikipedia is a collaboratively compiled and edited encyclopaedia with contributions in the form of text, pictures, formatting, citations and lists from multiple, unpaid editors and professionals. The process is regulated by means of an explanation of changes made between editors, notability guidelines and a tutorial process for new editors. Disputes about content are usually resolved by discussions between 'Wikipedians', i.e. users, contributors and editors.

In December 2005 the scientific journal Nature reported on a study it had undertaken to compare the accuracy of science entries on Wikipedia with those in the online version of Encyclopaedia Britannica[2]. Unlike Wikipedia, which relies on voluntary contributors regardless of proven mastery or qualifications, Encyclopaedia Britannica uses selected, paid expert advisors and editors. At the time of the Nature study, Wikipedia comprised 3.7 million articles in 200 languages and was ranked the 37th most visited website on the internet[2].

Nature invited independent academic scientists to peer review English-language entries in their particular areas of scientific expertise, taken from both Wikipedia and Encyclopaedia Britannica. Each scientist was asked to identify any inaccuracies and to comment on the articles' quality and readability, without being aware of the source of the article. Forty-two reviews were submitted to Nature, revealing on average four inaccuracies per Wikipedia article, in contrast to three per Encyclopaedia Britannica article. The general response was one of surprise, with levels of accuracy in Wikipedia being better than expected. Wikipedia articles were more often rated 'poorly structured and confusing' than articles from Encyclopaedia Britannica, with 'undue prominence being given to controversial scientific theories'[2]. Nevertheless, for Encyclopaedia Britannica, the oldest continuously published reference work in the English language, the results were worse than expected[2]. While Jimmy Wales, the co-founder and promoter of Wikipedia, expressed delight, he also added: "Our goal is to get to Britannica quality or better"[1].

In a rebuttal published in 2006, Encyclopaedia Britannica rejected Nature's findings, stating: 'Almost everything about the journal's investigation, from the criteria for identifying inaccuracies to the discrepancy between the article text and its headline, was wrong and misleading'[3]. The rebuttal argued that the conclusion of Nature's report was false because the journal's research was invalid. It stated that its own purpose was to 'reassure Britannica's readers about the quality of our (Britannica's) content, and to urge that Nature issue a full and public retraction of the article'[3]. The document highlighted a number of concerns about Nature's research methodology[3], including:

  1. The lack of availability of the reviewers' reports.
  2. The selection of Britannica articles in an unstandardised manner from productions of the encyclopaedia (such as Britannica Student Encyclopaedia and Britannica Book of the Year) rather than solely from Encyclopaedia Britannica.
  3. The selection of only parts and sections of Britannica articles rather than entire entries.
  4. Rearrangement and re-editing of Britannica articles for the purpose of the study, including the merging of passages from two separate articles.
  5. Failure to clarify the factual assertions of the reviewers.
  6. Lack of distinction between minor inaccuracies and major errors.
  7. Lack of clarification as to whether the reviewers' comments were based on facts rather than opinions.
  8. Misinterpretation and misleading presentation of the results.

Nature responded by rejecting Encyclopaedia Britannica's criticisms, affirming its confidence in the study and refusing to retract it[4]. Numerous other academic and non-academic publications have since followed Nature's example, yielding interesting results. In 2007, Stern magazine[5] compared 50 articles from the German Wikipedia with the corresponding entries in Brockhaus Enzyklopädie[6], the largest German-language printed encyclopaedia in the 21st century. The articles, from disciplines spanning politics, business, sports, entertainment, geography, science, medicine, history, culture and religion, were rated by experts for accuracy, completeness, timeliness and clarity. Wikipedia achieved a mean overall score of 1.7 across disciplines on a scale from 1 (best) to 6 (worst), while entries for the same keywords from the paid online edition of the 15-volume Brockhaus achieved a mean overall score of 2.7. Wikipedia articles scored better on timeliness and accuracy than those from Brockhaus Enzyklopädie, although they were judged too complicated for a lay audience.

The accuracy of Wikipedia entries in the sciences has also been scrutinised. In a study published in the Annals of Pharmacotherapy in 2008, Clauson and colleagues found the scope, completeness and accuracy of drug information in Wikipedia to be statistically lower than that in a free, online, traditionally edited database (Medscape Drug Reference [MDR])[7]. In a report assessing the internal validity of Wikipedia entries for 39 of the most commonly performed inpatient surgical procedures in the U.S., all of the entries presented accurate content, while 85% contained appropriate information for patients[8]. Interestingly, an entry's quality correlated with how often it had been edited. In another case study, medical experts compared 35 Wikipedia articles on conjunctivitis, multiple sclerosis and otitis media with entries on similar topics from other popular online resources frequented by medical students[9]. Wikipedia was found to be the easiest resource in which to find information. However, although its entries were reasonably concise and current, they failed to cover key aspects of two of the topics and contained some factual errors, and the report concluded that Wikipedia entries were therefore unsuitable for medical students. Nevertheless, in a recent report published in Psychological Medicine, ten researchers from the University of Melbourne concluded that 'the quality of information on depression and schizophrenia on Wikipedia is generally as good as, or better than, that provided by centrally controlled websites, Encyclopaedia Britannica and a psychiatry textbook'[10]. For schizophrenia and depression, two commonly encountered psychiatric conditions, Wikipedia scored highest in the accuracy, timeliness and references categories, surpassing all other resources, including WebMD, NIMH, the Mayo Clinic and Britannica Online.

Studies in the humanities and the social sciences have reached varied conclusions. One study found that Wikipedia was not a reliable source for historical articles, with an overall accuracy rate of 80% compared to 95–96% among the other sources, which included Encyclopaedia Britannica, The Dictionary of American History and American National Biography Online[11]. Another found Wikipedia's performance in articles on philosophy to be mixed, with high rates of coverage and accuracy but also high rates of omission[12]. In a review of thousands of Wikipedia articles in political science, covering every major-party gubernatorial candidate who ran between 1998 and 2008, the author found that Wikipedia was almost always accurate when relevant articles on the topic existed[13]. Coverage was often very good, especially for recent or prominent topics, but weaker for older topics, and omissions were frequent.

Prior to Nature's seminal study in 2005, Wikipedia assessed the quality of its entries through its 'featured article' and 'good article' peer review process[14]. More recently it has also done so through an ongoing pilot study to collect feedback[15], in which readers and editors rate articles for trustworthiness, neutrality, completeness and readability, and also rate their own self-perceived qualification to comment. Wikipedia has continued to develop and refine its quality review processes, in part as a result of the findings of the Nature study and of other similar studies. However, there has never been any attempt to replicate, improve on or extend Nature's study across disciplines and languages. Such a study would not only allow a greater understanding of the accuracy and quality issues pertaining to Wikipedia entries but would also provide information on how such issues might be addressed and/or resolved.

Recently, Wikipedia's co-founder Jimmy Wales highlighted the importance of such a task: a study inspired by the Nature study but more rigorous, assessing articles across languages and across a range of disciplines spanning the humanities and sciences, and having the following characteristics:

  1. Assessments carried out by academics and scholars.
  2. Assessments on each pair of articles carried out by multiple expert reviewers to establish inter-rater reliability and eliminate biases.
  3. Reviewers to be blind to the source of the article.
  4. Ratings covering a variety of constructs and dimensions relating to quality, accuracy, style, references and overall judgment.
  5. Use of both quantitative and qualitative rating techniques.
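
As an illustration of the inter-rater reliability mentioned in item 2, the following is a minimal sketch in Python of one widely used agreement statistic, Cohen's kappa, for two reviewers who rate the same articles on a categorical scale. It is not taken from the study design: the choice of statistic, the function name and the example ratings are all assumptions made purely for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same items.

    Kappa compares the observed proportion of agreement with the
    agreement expected by chance from each rater's own distribution
    of ratings. 1.0 means perfect agreement; 0.0 means agreement no
    better than chance.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")

    n = len(ratings_a)

    # Proportion of items on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each category, the product of the two
    # raters' marginal proportions, summed over categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (1 = accurate, 2 = inaccurate) given by two blind
# reviewers to the same four articles; these are not data from any study.
reviewer_1 = [1, 1, 2, 2]
reviewer_2 = [1, 1, 2, 1]

if __name__ == "__main__":
    print(cohens_kappa(reviewer_1, reviewer_2))
```

With more than two reviewers per article pair, or with ordered rating scales, a different statistic (for example a weighted or multi-rater variant) may be more appropriate; the sketch only shows the basic idea of correcting raw agreement for chance.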

The importance of such a study would lie in the examination of articles in more than just the English language and in subjects other than solely science. This would allow differences in levels of accuracy and quality across languages and subject domains to be identified, which would inform decisions in the future, e.g. for editor recruitment efforts and the design of expert feedback mechanisms.

The size, scope and complexity of such a study mean that considerable preliminary information is required on its methodology and design, the compilation and functioning of rating scales, the recruitment and location of experts, and the analysis and interpretation of results. It was therefore decided that, before such a study began, a small-scale preliminary project drawing on empirical evidence would be essential to determine a sound research methodology; this is the reason the present study was undertaken.

This pilot study has therefore been carried out to collect and review preliminary evidence to inform the design of a larger, future study. The intention is that the results of this preliminary report will establish the best possible research approach, begin to hypothesise the best way for Wikipedia to measure and communicate the accuracy and quality of articles and provide a well-founded justification for seeking funding for a comprehensive study. This pilot study has been carried out for the Wikimedia Foundation by Epic, in partnership with the Department of Education at the University of Oxford, UK. The methodology, analysis and results of the study are presented in this report, followed by a discussion of the findings and the conclusion of the report.


  1. http://www.alexa.com (April 2012) Top Sites, [Online], Available at: http://www.alexa.com/topsites [Accessed 12/04/12].
  2. Giles, J. (2005) 'Internet encyclopaedias go head to head', Nature, vol.438, 15 December 2005, pp. 900-901.
  3. Encyclopædia Britannica, Inc. (March 2006), Fatally flawed: refuting the recent study on encyclopaedic accuracy by the journal Nature, [Online], Available at: http://corporate.britannica.com/britannica_nature_response.pdf [Accessed 11/03/11].
  4. Nature (23 March 2006), Encyclopaedia Britannica and Nature: a response, [Online], Available at http://www.nature.com/press_releases/Britannica_response.pdf [Accessed 11/03/11].
  5. http://www.stern.de/digital/online/stern-test-wikipedia-schlaegt-brockhaus-604423.html
  6. http://www.brockhaus.de/enzyklopaedie/30baende/index.php
  7. Clauson KA, Polen HH, Kamel Boulos MN, Dzenowagis JH. Scope, Completeness, and Accuracy of Drug Information in Wikipedia. Ann. Pharmacother. December 2008 vol. 42 no. 12 1814-1821
  8. Devgan L, Powe N, Blakey B, Makary M. Wiki-Surgery? Internal validity of Wikipedia as a medical and surgical reference. Journal of the American College of Surgeons 205:3, September 2007, Pages S76–S77
  9. Pender M, Lasserre L, Kruesi L, Del Mar C, and Anaradha S. 2008. Putting Wikipedia to the Test: A Case Study. Paper presented at the Special Libraries Association Annual Conference, Seattle, June 16.
  10. Reavley NJ, Mackinnon AJ, Morgan AJ, Alvarez-Jimenez M, Hetrick SE, Killackey E, Nelson B, Purcell R, Yap MBH and Jorm AF. Quality of information sources about mental disorders: a comparison of Wikipedia with centrally controlled web and printed sources. Psychological Medicine, Available on CJO 2011 doi:10.1017/S003329171100287X
  11. Rector LH. 2008. "Comparison of Wikipedia and Other Encyclopaedias for Accuracy, Breadth, and Depth in Historical Articles." Reference Services Review 36 (1): 7–22.
  12. Bragues G. 2007. "Wiki-Philosophizing in a Marketplace of Ideas: Evaluating Wikipedia's Entries on Seven Great Minds." Working paper. http://ssrn.com/abstract=978177.
  13. Brown A. Wikipedia as a Data Source for Political Scientists: Accuracy and Completeness of Coverage. World Politics 63:1, 2011.
  14. Wikipedia (2011) Featured articles, [Online], Available at http://en.wikipedia.org/wiki/Wikipedia:Featured_articles [Accessed 11/03/11].
  15. Wikipedia (2011) Article feedback, [Online], Available at http://www.mediawiki.org/wiki/Article_feedback [Accessed 01/07/11].