2007 Testimony by Jimmy Wales to United States Senate
Mr. Wales, thanks for being here. We welcome your testimony now.
TESTIMONY OF JIMMY WALES, FOUNDER, WIKIPEDIA
Mr. Wales. Thank you. My name is Jimmy Wales and I am the founder of Wikipedia, as well as founder of the non-profit charity The Wikimedia Foundation, which hosts the Wikipedia project and several other related projects.
I am grateful to be here today to testify about the potential for the Wikipedia model of collaboration and information sharing which may be helpful to government operations and homeland security.
To introduce this potential, I would like to first talk about our experience with Wikipedia. The original vision statement for Wikipedia was for all of us to imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That is what we are doing.
Wikipedia currently consists of more than nine million encyclopedia articles in more than 150 languages. While the English project is the largest, with over 2 million articles, this represents less than one-fourth of the total work.
Wikipedia is increasingly important around the world, with more than half a million articles each in German and French, and more than 250,000 articles in several additional European languages, as well as more than 400,000 articles in the Japanese language.
Despite being blocked in the People's Republic of China for the past 2 years, the Chinese language Wikipedia, which is primarily written by Chinese speakers in Hong Kong, Taiwan, and around the world, is a healthy community project with more than 150,000 articles and a strong growth rate.
At a time when the United States has been increasingly criticized around the world, I believe that Wikipedia is an incredible carrier of traditional American values of generosity, hard work, and freedom of speech.
Now I would like to talk a little bit about how open, collaborative media like wikis enable more efficient gathering and dissemination of useful information. Although it may be counterintuitive that opening up a wiki project leads to a more useful compendium of information, that is what our experience has been with Wikipedia. And I believe that experience can be the same for government agencies and operations, as well.
The method of production for Wikipedia is highly innovative. And in keeping with the old adage, necessity is the mother of invention, the story of how Wikipedia came to be is, I hope, both instructive and entertaining.
Wikipedia was born of the famous dot-com crash. In the early days of the project, we worked together as a community with only a shoestring budget. If the financial climate had been better, then I would have likely turned to hiring employees to fill some critical functions. But because investment money and advertising revenue had completely dried up, we were pushed to find new solutions, solutions of community institutions to manage processes that would have been traditionally handled in a top-down manner.
As a result, we pushed the limits of the new Internet medium to create a new kind of community and a new kind of encyclopedia, one controlled by volunteer administrators and editors working together in a grand global conversation to create something new.
According to firms that measure Internet usage, Wikipedia is now the eighth most popular website in the world. And yet despite competing in some sense with companies with billions of dollars to invest, Wikipedia survives on an incredibly modest budget. Last year we spent around $1 million and although this year we are spending a bit more, our budget is still minuscule compared to that of most other tech enterprises, even if you limit the comparison to other top websites.
The First Amendment plays an important role in this project, as do traditional American ideas of individual responsibility. Under U.S. law, everyone writing in Wikipedia takes responsibility for his or her own actions, just as is true of everyone speaking in any public forum. The maintainer of this forum, the Wikimedia Foundation, has set down some fundamental codes of conduct including, but not limited to, what constitutional scholars call time, place, and manner restrictions. And I have personally imposed policies which strive toward respect for others, quality writing and the citing of sources.
It is counterintuitive to some that an open discussion with virtually no top-down command and control structures can generate a high quality encyclopedia. Nevertheless, it does.
To illustrate our success improving the quality of Wikipedia, we are currently celebrating a study published in the German weekly news magazine, Stern. According to this study, which just came out last week, Wikipedia scored higher in all but one category than the standard German encyclopedia Brockhaus. The one standard we fell a little bit short on was readability. I promise, we are working on that one every day.
Now given that Wikipedia is a public enterprise open to the entire public for collaboration and contribution, you may be wondering how wikis or the Wikimedia model may be useful to government. First of all, I want to note generally that there are other ways in which a wiki can be set up usefully, including set ups that do not involve opening the wiki to the general public. You can control access, and a wiki might be useful to an agency that wants to facilitate information sharing up and down the hierarchy for increased vertical sharing. And controlled access wikis can be set up to share inter-agency information, so increased horizontal sharing, as well.
The main point here is there is no requirement of necessity for the tool of a wiki to be open to the general public in order for it to be useful. The word wiki comes from a Hawaiian word wiki wiki, meaning quick. The concept of a wiki was originally created by a famous programmer named Ward Cunningham, who lives in Portland, Oregon. The basic idea of a wiki is quick collaboration. When people need to work together to produce some document, the only option in the old days would be to email around a text file or word processing document. The wiki represents a crucial innovation allowing for much greater speed. The most basic idea of a wiki is a website that can be easily edited by the readers, but modern wikis contain simple yet powerful features that allow for the users to control and improve the quality of the work.
Wikipedia represents the power of a wiki open to the general public, but I believe the same wiki technology that powers Wikipedia is also being widely adopted inside many enterprises. And I will note here in passing a couple of examples of this innovative use, one in private enterprise and one in the U.S. Government.
First, consider Best Buy. Recently great companies such as Best Buy have been using wiki technology across the enterprise to foster faster information sharing and collaboration inside the enterprise. To give a hypothetical example of how this works, imagine the car stereo installer in a Best Buy store in Florida who discovers a faster or easier way to install a particular brand of stereo. This information can now be shared directly peer-to-peer to other stereo installers within the company across the entire store network. In the past, this kind of local information discovery was lost or isolated.
One Harvard professor's research suggests that one key to successful use of new technologies is adoption. The tools must be easy to use and valuable in the day-to-day life of those using them.
Now I will take a quick look at Intellipedia. I am not an expert on intelligence gathering, so I will simply quote a useful resource, Wikipedia, regarding Intellipedia. The Intellipedia consists of three wikis. They are used by individuals with appropriate clearances from the 16 agencies of the U.S. intelligence community and other national security related organizations including combatant commands and Federal departments. These wikis are not open to the general public.
The Intellipedia uses MediaWiki, which is the same software used by Wikipedia, and the officials who have set up the project say that it will change the culture of the U.S. intelligence community, which has been widely blamed for failing to connect the dots before the attacks of September 11, 2001.
Tom Fingar has gone on record describing one of Intellipedia's intelligence successes. Mr. Fingar told DefenseNews.com that a worldwide group of intelligence collectors and analysts used Intellipedia to describe how Iraqi insurgents are using chlorine in IEDs, improvised explosive devices. They developed it in a couple days interacting in Intellipedia, Mr. Fingar said: No bureaucracy, no "mother may I," no convening meetings. They did it and it came out pretty good. That is going to grow.
As you can see, just as the dot-com crash forced private industry to think about more efficient and effective ways to use digital technology, the attacks on the United States forced our intelligence community to explore innovative ways to share intelligence among agencies.
This brings us back to what might be called the lesson of Wikipedia, that an open flat forum allowing many stakeholders to participate can facilitate information sharing in an extremely cost-efficient manner and it can take advantage of a wider range of knowledgeable people than traditional information sharing processes do.
Good democratic governments strive to be responsive to citizens' needs. In order to do so, it is important that governments use technology wisely to communicate with the public and also to allow the public to communicate with the government.
It is my belief that the government of the United States should be using wiki technology for both internal and public facing projects. As with any large enterprise, internal communications problems are the cause of many inefficiencies and failures. Just as top corporations are finding wiki usage exploding because the tool brings about new efficiencies, government agencies should be exploring these tools, as well.
The U.S. Government has always been premised on responsiveness to citizens, and I think we all believe good government comes from broad, open public dialogue. I therefore also recommend that U.S. agencies consider the use of wikis for public facing projects to gather information from citizens and to seek new ways of effectively collaborating with the public to generate solutions to the problems that citizens face.
Thank you for inviting me to testify about the potential for the Wikipedia model to improve our government's ability to share and gather information for increased security, for increased governmental responsiveness in our open society, and for the preservation of democratic values. Thank you.
Chairman Lieberman. Thanks, Mr. Wales. That was great, and necessity mothered a great invention.
I love the story of the founding of Wikipedia. And if I may say so, and I appreciate your saying that in some ways, it is classically American. And this is a part of American history, part of the American experience that goes right back to the beginning. It always struck me as instructive, that among the founding generation of Americans were some remarkable inventors, beginning with Franklin and Jefferson. And obviously this continued, in many ways, throughout our history with the extension of the American frontier and all of the advances that have occurred since.
But you have really done that, along with your colleagues, in this age. And so I thank you for it.
And I thank you for the suggestions that you have made about how this collaborative technology can help us, in government, do our job better. I want to come back to that in a few minutes. But let me start with this question that we were focused on in the reauthorization, which is the problem of access through search engines.
Mr. Needham, you testified that in whole or part, there are 2,000 Federal Government websites that are not included in search engine results. And I wanted to ask you why, and to expand a bit on what you said. I know you had a reference to EPA and NIH. But is it accidental? Is it that they are not going the extra mile to make this happen? Or is there some policy? Or is it plain laziness that is bringing us to a point where we should not be?
Mr. Needham. Right, happy to speak to that.
So I think the principal factor that we can look to is that governments produce lots of information and have a mandate to disseminate that information. And to do that, agencies rely on large databases to hold public records and present government programs to citizens. So that is one factor, lots of information, hard to disseminate it efficiently.
But these databases, the EPA example I gave and many others that I could point to at the Federal, State, and local level, typically present to the user a search form by which that user then types in key words to find the report they are seeking, or the record or what have you. These search forms cannot be navigated by search engine crawlers. We cannot reach behind to see what is there and add those records to our index.
And because, as we have experienced in our communication with agencies, they tend not to think as much about how citizens are going about finding information but rather about how their website is presented to citizens, they have not taken, by and large, this step of providing us a means of finding those records behind that search form. And that is what the Sitemap Protocol technology enables. It provides the agency a simple mechanism for pointing out to a search engine crawler, these are all of the records in this database, come here and crawl them.
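As a minimal sketch of the mechanism Mr. Needham describes, an agency could list every record in a database as a Sitemap Protocol file that a crawler reads directly instead of trying to fill in the search form. The host `records.example.gov`, the `report?id=` URL pattern, and the helper name `build_sitemap` are assumptions made up for illustration; only the `urlset`/`url`/`loc` elements and the namespace come from the published protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap Protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(record_urls):
    """Return a Sitemap Protocol XML string with one <url><loc> entry per record URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in record_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical record IDs that would otherwise sit behind a search form.
record_ids = [101, 102, 103]
urls = [f"https://records.example.gov/report?id={rid}" for rid in record_ids]

sitemap_xml = build_sitemap(urls)
print(sitemap_xml)
```

In practice the resulting file would be published on the agency's site and made known to search engines; how that is done is outside this sketch.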
Chairman Lieberman. I appreciate the answer. So you have now created a tool that should make it much easier for Federal Government websites to be included in search engine results.
Let me go back to you, Ms. Evans, and ask you if you want to add at all to Mr. Needham's answer about why some agencies have not made their web pages available to commercial search engines?
Ms. Evans. Well, I think Mr. Needham hit on the first issue about it is a lot of information and therefore we try to figure out the best way to efficiently deliver it, which is through databases and organization that way.
The other part, which Mr. Needham has also highlighted, is the partnership that we need to work out with commercial search capabilities, because many times when we start delivering this next level of services, what we have also done is streamlined the support services associated with those.
So in talking with Google and other search companies, we try to present the information in a context. We may not be doing it in the most efficient way to provide a context around it. For example, on GovBenefits.gov, when we initially talked, we have a whole support mechanism behind that. So we try to filter so that we do not create frustration in the citizen as well and present a whole series of results to them. And what they do is they see what they are basically eligible for.
Now what we need to do is work in partnership, which is what your reauthorization allows us to do, is that there is a balance between us trying to streamline our backline and making sure that the citizen really knows what they are eligible for and also making the information commercially available to the search engines, because it is all out there. We just want to make sure that we are providing it in a context so that we do not create more frustration.
Chairman Lieberman. So leaving aside matters that we might not want to have easy access to, because of privacy concerns or classification, is there any other policy reason that would justify limiting access to otherwise publicly available information on Federal websites through search engines?
Ms. Evans. No, sir. The way that we have put together the policies, but we would need to go back and relook at that to see if any agencies may be interpreting it that way. But the way that the policy is based on the current E-Government Act, it is for greater dissemination of publicly available information. And then also, the Administration has passed an Executive Order, again supporting the Freedom of Information Act, saying look even further at your information and make this available before it is asked for.
So that information is out there. But we do have to do it in a way that it is easily accessible through the means that citizens research and look for the information.
Chairman Lieberman. I appreciate that answer because that is certainly our intention, that except for privacy and classification reasons, everything else should be maximally and as easily as possible available to the public including through search engines.
Mr. Schwartz, do you want to add anything to this discussion?
Mr. Schwartz. I think Ms. Evans made a good point about the context issue. I think it is an important one. But I also think that the American people are smart and they know how to use search engines. While it may be frustrating a few people, we do not want to block information from the vast majority of people who would be able to figure out the context and use that information. For the minority of people that cannot figure it out, that might be frustrating, but at least they have an opportunity for access.
So I think that there is sort of a balance there of how do you give the right context but make the information as maximally available as possible and maximally searchable as possible.
Mr. Needham. If I may add a comment on this, as well.
Chairman Lieberman. Please.
Mr. Needham. We worked with the State of Arizona earlier this year to open up eight of their major databases, which were not initially designed to be crawled by search engines. And the pages that they present to users, indeed may not be utterly clear to the user on first blush, but they did so. They opened these databases. And as we indicated in a report we published or a case study we published yesterday on this website I referenced, Google.com/publicsector, the administrators of these agencies, whose databases were opened, are very pleased with the results of citizens for the first time learning about, for example, license record of contractors and of real estate developers, and so forth. So we know it can work because we have seen it work.
Chairman Lieberman. Good. That is a good example. Mr. Wales, do you want to get into this?
Mr. Wales. Yes. Actually, in many cases we have heard about the difficulty when the web crawler comes to a form and they have to, instead of clicking on something that says Alabama, someone would have to type in Alabama and the search crawler is not able to figure out what to type there.
But we actually find even in the Wikipedia context, which is written by human beings, that there are some websites that even when you type in the right thing and you submit and you get the information you want, you cannot link directly to that. And so someone who is writing something, trying to explain something, and they want to link to a particular statute or a particular regulation or a particular piece of information that has been published by the government, if they are not able to cut and paste that URL and put it into Wikipedia, then even with a human involved it is very frustrating. The only thing you can do is give someone instructions. Go to this page, type in this, select the third link. It can be very frustrating.
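The linking problem Mr. Wales describes goes away when the form's fields are carried in the URL itself, so that each result page has an address that can be copied and pasted. The following is a minimal sketch under stated assumptions: the host `regs.example.gov`, the field names `state` and `section`, and the helper `permalink` are all hypothetical and do not describe any actual government site.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def permalink(base_url, **fields):
    """Encode the search-form fields into the query string so the result has its own URL."""
    # Sorting the fields gives the same link for the same search every time.
    return f"{base_url}?{urlencode(sorted(fields.items()))}"


# Hypothetical search: what a user would otherwise type into the form.
link = permalink("https://regs.example.gov/search", state="Alabama", section="12.3")
print(link)

# The server side can recover the same fields from the link.
recovered = parse_qs(urlsplit(link).query)
print(recovered)
```

A link built this way can be cited directly, in place of instructions such as "go to this page, type in this, select the third link."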
Chairman Lieberman. That is a good point. Ms. Evans, do you want to respond?
Ms. Evans. Well, all I can say is that we are very open to making this more collaborative. We have examples, and I would like to actually share one, that we are embracing this technology and we are using it more ourselves. The EPA was raised as an example here of not making information available. But they recently held what they called the Puget Sound Partnership where they went and for 36 hours they worked directly out there trying to figure out how to do the information, using the technology. What parts of their information were not easily accessible? Could they set up these pages? Could they do all that?
We are taking those lessons learned there. Molly O'Neill from EPA, the CIO from EPA is sharing that now with the other CIOs. So that we can take these types of things and the frustration of the information that we are putting out there and then try to fix it so that we can make sure that it is readily available.
Chairman Lieberman. OK. That is an encouraging example.
As you know, Section 204 of the E-Government Act required the development of an official Internet portal that would be organized by function or topic instead of the boundaries of agency jurisdiction. That is USA.gov which now receives, I gather, almost 1.9 million visits per week.
I wanted to ask you if you have done any work that would enable you to tell us how you think a user's experience is enhanced by using USA.gov instead of attempting to find their information through search engines?
Ms. Evans. Well, the way that USA.gov is set up--GSA really manages this very well, at least we think they manage it very well. They do hold user focus groups constantly throughout the year to really measure the customer experience, the citizen experience and how to reorganize it.
This is a good example of us and our interpretation of putting the context around the Federal Government information and then trying to give the citizen an enriched experience when they come and that it is the authoritative source for the Federal Government launching off of there, that you are going to authoritative sites.
They did a lot of market research, it used to be called FirstGov.gov. They did research just in the name itself and changed it based on their market research and changed it to USA.gov. And just the simple name change of that increased the usage by 67 percent.
So they are constantly looking at customer satisfaction. All of the E-Government initiatives are measuring customer satisfaction usage. And then as well as how we can go back and really deploy it and improve it and we make all of those metrics publicly available.
So we set targets. Several of the initiatives may not necessarily be meeting their targets. But we do set the targets and the metrics and we do publish our actual performance against those metrics.
Chairman Lieberman. Do you think enough people know? The numbers are pretty good obviously, but do enough people know and take advantage of the service that USA.gov provides?
Ms. Evans. I think that if I had--I do it myself. So I will be honest--I go to Google and then I go to USA.gov when I am looking for specific things. But I launch into Google or Yahoo! or Ask.com, just like anyone else does. Because I want to see how my services come up.
But I will tell you that the other benefit to having the government initiatives such as the Federal Internet portal and USA.gov was those services were available when crisis and things happen within the Federal Government that we have to mount an immediate response because the infrastructure is already there. USA.gov, because of its integration of the services, was able to provide support services to the State Department like answering passport questions. They can build out that and complement what the State Department is doing.
As a matter of fact, they actually answer all of their calls now. We get a common set of answers because USA.gov is tied into every agency, so they handle all of the misdirected e-mail. So if anybody does write directly to a department or an agency, it is automatically routed to their set of agents so that they can answer the questions on a consistent basis.
So there is a lot of integration of back end office types of services that we have done through these government-wide initiatives that when something happens like the VA situation where we lost that data, USA.gov and those services built up within 48 hours. They had the capability and they put all of that information out on their website. They had RSS feeds set up, which are automatic sign ups so that people can get the update of the information as we changed it. And we also had 1-800 service so that we could answer 240,000 calls a day for the veterans.
So we tried to put all of that together as an integrated channel so that we are providing the solutions to the citizens. So it is a more complicated question than does everyone know USA.gov?
Chairman Lieberman. I will tell you that in preparing for the hearing we went to Google and typed in Federal Government, and USA.gov came up first in a number of listings.
Let me go to another provision of the E-Government Act and see if I can start a discussion with the four of you, but I will start with you, Ms. Evans. In one provision of that Act, we require the development of a system for finding, viewing, and commenting on Federal regulations. This was really a step forward, obviously. The goal was not just transparency, but real accessibility to give individual citizens the opportunity that they would find very difficult under the previous technology to both see proposed regulations, gain access to them easily, but actually then to comment on them.
From what I can see, while there has been progress, I have been disappointed that the development of Regulations.gov has not opened up the rulemaking process to a greater degree. CRS, which we referred to earlier, recently reported, "It still appears that relatively few comments have been coming to the agencies via Regulations.gov compared to other methods of comment."
Further, in relation to what we have been discussing today, the data in Regulations.gov cannot be found by outside search engines.
So give us your status report on how Regulations.gov is doing and tell us whether you agree that more needs to be done to facilitate public access to tracking and ability to comment on regulations.
Ms. Evans. Sir, the short answer is yes, sir, more needs to be done on Regulations.gov.
The other part of it is about searching and doing the docket systems that are back within the agencies and making that information available. Again, this would be one where we would have to partner with commercial search providers about the best way to make that information available because we know that is a limitation right now within it.
Agencies do have to post all of the regulations, proposed rules at Regulations.gov, but what we wanted to do was make sure that the public had the availability to comment through multiple channels. So the comments can go directly to an agency, not necessarily all comments have to come through Regulations.gov. And that was flexibility that the agencies still wanted to maintain.
Some of the things that are being looked at with Regulations.gov because this is really not a technology issue. This is really looking at how do we want the business of rulemaking to evolve? Some of the basic things that I have asked as the technology is going forward is do more comments make a better rule?
Those are things that I think the way the technology is working and that you see through the development of functions like Wikipedia that there are arguments on both sides of that. And that is what needs to be looked at. We are in partnership, we jointly manage that with the OIRA Administrator, Susan Dudley. And these are things that she is embracing because she does want more transparency, she does want more openness in the regulation process. And so we are working with that.
There is an ongoing study right now with the American Bar Association that we have been meeting with them of improvements and requirements, some things that we can do to Regulations.gov that would just make it easier to use so that more people would want to put comments in there, as well.
Chairman Lieberman. Mr. Wales, would you say, based on the Wikipedia experience, that as a matter of policy or that we could conclude that more comments would make better rules?
Mr. Wales. I think so, yes. But I think one of the interesting things about Wikipedia, what is innovative about the wiki technology is rather than just commenting, people are collaborating and finding ways to compromise.
Chairman Lieberman. Yes, good point.
Mr. Wales. And so, there are some very practical problems, of course, that are faced with open commenting, spammers, crazy people, and all kinds of bad behaviors. And you have to think how do you balance the desire for allowing the general public to comment and not to censor their remarks because maybe somebody does not agree with them versus well it is not censorship to say, links to Viagra advertisements is not really a comment on most regulations, anyway.
And so, I think that these things do take very careful study. People can be very simplistic and say, well, they should allow the public to comment on regulations. Well, sure. But how are we going to help the public to come in as a part of a responsible community and do that in a way that everyone finds useful?
Chairman Lieberman. Good, thoughtful answer. Mr. Needham, any thoughts about this question about how we can improve basically Regulations.gov?
Mr. Needham. Well, you are correct, that this is an example of an E-Government program website, among the many that I have referred to, that are not visible to search engine users. And this, I think, is more of a comment on the USA.gov discussion earlier, that let us say that someone is a farmer that grows tomatoes in Florida is not too plugged into the regulatory process that governs that industry and searches on Google for "tomatoes transport." If this resource were crawled and indexed and integrated in search engines, including USA.gov, this grower might be more engaged in that regulatory process, learn that there is, in fact, a rule that is under comment.
And the point being made here is that not every citizen realizes when they are looking into their health, their business, education, or housing, that government provides a service that is relevant. And that is why it is critical that all of the information of the Federal Government that is public be in all search engines possible and not simply through USA.gov, where a user is consciously looking for information from its government.
Chairman Lieberman. Well said. Mr. Schwartz.
Mr. Schwartz. I want to take on two different points there, the first one that Mr. Needham just raised. In terms of when we first did our report looking at what kinds of searches were not coming up, it was during the polar bear comment period that the Interior Department was having, that had more comments than any other commentary.
Chairman Lieberman. Whether the polar bear was going to be listed as an endangered species?
Mr. Schwartz. Exactly. And we did some searches on that and you could not find that on any search engine at all. It was one of the first things that really got us interested in this issue. This was one of the best known comment periods in the history of the Federal Government. I mean, the most activity in terms of comments and you could not find it on a search engine, except for going through secondary parties. And part of that was because it was not on Regulations.gov.
Eventually GPO sitemapped their site and then you could at least find it through GPO. But that is just one example. You know there are people that are searching for this in a way that, where they hear on the news that there is a comment period on whether the polar bear should be an endangered species and they want to comment. They go to search on Google, they do not get the result that they expect.
Regulations.gov shows that concern very acutely.
I want to follow a little bit on Mr. Wales' comments and Ms. Evans' comments about Regulations.gov. We were hoping by this point that we could be at the point where we were trying out new technologies for regulations and public comment periods.
Chairman Lieberman. What were you thinking of?
Mr. Schwartz. I mean, using the wiki model. A lot of people would think of it as, oh you just put up a rule and then people go and attack it and you get both sides. But the really interesting thing about what happens on Wikipedia is the commentary pages and the notes pages, which are much more similar to a traditional rulemaking than you would think.
If you go through and look through how they go about making determinations and people giving justifications based on facts and what the rules are for how that is done, I think we could learn a lot from just trying out new technologies. Not saying that it should supplant the old ways of rulemaking. But perhaps we can, in certain kinds of rulemakings, we can come up with a more collaborative discussion rather than the traditional conflict policy that kind of governs public comment periods today.
Chairman Lieberman. That is very interesting.
You know there is another institution around Washington that needs more collaboration to be effective, Congress. Maybe we should all form a Congressipedia.
Another thing we do not do here, if I may continue this particular flight, gaining now with the welcoming of our colleague from Hawaii: you told us the word wiki is Hawaiian for quickly, and one thing that we do not do enough around here is to legislate wikily. So anyway, I welcome Senator Akaka.
I am going to ask one more series of questions and then I am going to yield to you, Senator Akaka. Thanks for being here.
I want to go directly to you, Mr. Wales, and thank you again for being here, to take up one of the--I guess it is a criticism, a skepticism about Wikipedia, which is that inaccurate content can result when larger numbers of participants outweigh the contribution of a few experts.
In your testimony, you said that controls or kind of management devices can be put in to provide--I like the term--fine grain control to access and edit information. And I wanted to ask you to elaborate on that, particularly, but generally with regard to Wikipedia but also as it may affect collaborative technologies to be used by the Federal Government.
Mr. Wales. Absolutely. So within Wikipedia, the software, the MediaWiki software that we use puts several tools into the hands of the community so that they can manage the quality of the content. Within the community, there are administrators who are elected from the community and they are generally chosen after they have proven their worth over a period of time in terms of being good writers, thoughtful editors, kind and helpful to others, the kinds of values that we look for in an administrator.
And the administrators have the ability to do things like temporarily lock pages. We can do that in a couple of different ways. One of the ways that has been very successful is what we call to semi-protect a page, which means anyone can edit that page as long as they have been around and had an account at Wikipedia for 4 days, a very low threshold for entry into participation. But this really helps us in cases where a particular article has been highlighted in a news story or something like this and there are a lot of newcomers coming in and things like that.
Certain articles on very controversial topics tend to be semi-protected pretty much all of the time. An example would be George W. Bush, for example.
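[Editorial illustration, not part of the testimony.] The semi-protection rule described above can be sketched as a simple account-age check. This is only a minimal sketch, assuming the 4-day threshold mentioned in the testimony; the names `Account`, `Page`, and `can_edit` are hypothetical and are not taken from the MediaWiki software.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# The 4-day figure comes from the testimony; all names here are hypothetical.
MIN_ACCOUNT_AGE = timedelta(days=4)


@dataclass
class Account:
    name: str
    created: datetime


@dataclass
class Page:
    title: str
    semi_protected: bool = False


def can_edit(page: Page, account: Optional[Account], now: datetime) -> bool:
    """Return True if this editor may change the page at time `now`."""
    if not page.semi_protected:
        return True  # open pages accept edits from anyone, logged in or not
    if account is None:
        return False  # no account, so the age requirement cannot be met
    return now - account.created >= MIN_ACCOUNT_AGE
```

Under these assumptions, an account created a month ago can edit a semi-protected page, while a one-day-old account or an anonymous visitor cannot.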
Chairman Lieberman. Right. Let me understand, this is really interesting. The administrator is empowered to essentially make a judgment call if the administrator thinks that a page may be subject to piling on or anything else, because it is controversial?
Mr. Wales. That is right. And a lot of times we try to keep this to be something of a cooling off period. In other words, something has been in the news, we will semi-protect it for a few days until everybody relaxes a little bit. And there are over 1,000 active administrators in the English Wikipedia. And of course, they have conversations and discussions and disputes amongst themselves over whether things should be protected or unprotected.
Occasionally some brave soul will say I think we should unprotect the George W. Bush article, they unprotect it, and say I will watch and make sure there is no vandalism. And usually about 6 hours later they are exhausted and protect it again and go to sleep.
So there are some areas of high potential for pranksters and people like that, that end up semi-protected most of the time.
We also have the ability to block IP numbers. So if there is some form of misbehavior and where it is coming from--the typical case would be a high school, a parliament building, this sort of thing. That's a joke, actually, although it has happened.
We will see some sort of juvenile behavior. And normally what we do in a case like that, is we just simply block that IP number from editing Wikipedia for 24 hours or so. Hopefully that will just calm them down. So that is another sort of tool in our pack.
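[Editorial illustration, not part of the testimony.] A temporary IP block of the kind described above could be modelled as a table of expiry times. This is a sketch under stated assumptions: the 24-hour duration comes from the testimony ("24 hours or so"), while the `BlockList` class and its methods are hypothetical and do not describe how MediaWiki actually stores blocks.

```python
from datetime import datetime, timedelta
from typing import Dict

# Duration taken from the testimony; the class itself is a hypothetical sketch.
BLOCK_DURATION = timedelta(hours=24)


class BlockList:
    """Tracks which IP addresses are temporarily barred from editing."""

    def __init__(self) -> None:
        self._expires: Dict[str, datetime] = {}

    def block(self, ip: str, now: datetime) -> None:
        # Record when the block lapses; nothing is permanent here.
        self._expires[ip] = now + BLOCK_DURATION

    def is_blocked(self, ip: str, now: datetime) -> bool:
        expiry = self._expires.get(ip)
        return expiry is not None and now < expiry
```

The block expires on its own, which matches the "calm them down" intent: no one has to remember to lift it.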
We have things like recent changes, so there are people who monitor every change that is coming in. Individual users have personalized watch lists. So if you are a particular expert on birds, for example. I met a scientist at Cornell University who is an ornithologist. He monitors a lot of the bird articles. He does not have time to do it personally every day, but about once a week he said he comes in and checks out a lot of the bird articles. And he can quickly look at the change, just the change in the article. Rather than having to reread the whole thing from scratch, he can quickly see what has changed since the last time he has been there to make sure it seems suitable to him.
So all of those kinds of tools are important, but probably one of the most important tools of all is that the entire history of every article is kept in the database with very rare exceptions. Occasionally, we completely delete things from the database, privacy violations or other legal reasons. But typically if it is simply a bad version of an article or something like that, the old versions are there. And so if somebody comes in and begins to damage an article, it is typically one click for anyone to go back in and save the previous version as the current version. And so it is hard to do any damage at Wikipedia. Whenever you come in and make a change, you are actually just creating a new version. And if you have done some harm, someone can quickly come behind you and fix it.
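[Editorial illustration, not part of the testimony.] The two ideas above, seeing only what changed and restoring an earlier version, can be sketched with an append-only list of revisions. This is a minimal sketch, assuming whole-text revisions held in memory; the `Article` class is hypothetical, and the diff uses Python's standard `difflib` module rather than anything from the MediaWiki software.

```python
import difflib
from typing import List


class Article:
    """An article whose every saved version is kept (hypothetical sketch)."""

    def __init__(self, title: str, text: str) -> None:
        self.title = title
        self.revisions: List[str] = [text]

    @property
    def current(self) -> str:
        return self.revisions[-1]

    def edit(self, new_text: str) -> None:
        # An edit never overwrites; it only appends a new version.
        self.revisions.append(new_text)

    def revert_to(self, index: int) -> None:
        # Reverting re-saves an old version as the newest one,
        # so the damaged version also stays in the history.
        self.revisions.append(self.revisions[index])

    def diff(self, old: int, new: int) -> List[str]:
        # Show only the lines that differ between two revisions.
        return list(difflib.unified_diff(
            self.revisions[old].splitlines(),
            self.revisions[new].splitlines(),
            lineterm="",
        ))
```

Because a revert is itself just another revision, the history stays complete: anyone can later see both the damage and the repair.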
Chairman Lieberman. Very interesting. I presume though, it is a different kind of activity that you would say some of those methods you have for protecting the integrity of the system are also relevant for collaborative technologies used by the Federal Government?
Mr. Wales. Absolutely. Some of these techniques are not necessarily as useful in internal facing wikis. If you have an internal wiki and everybody who is editing it is logged in and they are an employee, typically you do not need to block them from editing. You fire them or whatever you need to do to tell them to stop misbehaving.
But other of the tools, for example, the history. You can easily have people who disagree and someone will say you made these edits to this article, but I do not feel that it really improved it. I am going to go back to the previous version and then let us go to the talk page and hash this out.
So these kinds of tools are applicable for internal wikis and external, but a lot of the concepts may be valuable outside even the wiki framework. The idea of understanding that if you can generate a thoughtful community, you can have that community do a lot of the policing that otherwise it would not be cost effective to do.
A similar example would be Craigslist. People post advertisements there, free advertisements. And the staff at Craigslist is too small to really supervise and monitor everything. But their community can simply, if you see something that is spam or is somehow inappropriate, they can simply flag it and if it gets flagged a certain number of times it just disappears. Overall, this does a pretty good job. And those are the kinds of techniques that I think we are going to be exploring in the industry over the next few years.
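[Editorial illustration, not part of the testimony.] The community flagging approach described above can be sketched as a threshold on flags per posting. This is a sketch under stated assumptions: the testimony says only "a certain number of times", so the threshold of 3 is invented for illustration, counting each user at most once is an added assumption, and the `Board` class is hypothetical rather than a description of any real site's system.

```python
from typing import Dict, Set

# Hypothetical value; the testimony does not give the actual number.
FLAG_THRESHOLD = 3


class Board:
    """A listing board where the community can flag postings."""

    def __init__(self) -> None:
        self._flags: Dict[str, Set[str]] = {}

    def post(self, post_id: str) -> None:
        self._flags[post_id] = set()

    def flag(self, post_id: str, user: str) -> None:
        # A set ignores repeat flags from the same user (an assumption).
        self._flags[post_id].add(user)

    def is_visible(self, post_id: str) -> bool:
        # The posting disappears once enough distinct users have flagged it.
        return len(self._flags[post_id]) < FLAG_THRESHOLD
```

No staff member appears anywhere in this sketch, which is the point being made: the policing is done by the readers themselves.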
Chairman Lieberman. That is fascinating and encouraging because there is a kind of confidence there based on some experience you have had that in the end the better part of human nature prevails.
Mr. Wales. Well, one of the classic examples I always give is to imagine that you are going to design a restaurant. And you think to yourself in this restaurant we are going to be serving steak. And since we are serving steak, the customers will have access to knives. And when people have access to knives, they might stab each other. So to design our restaurant, we are going to put everybody inside a cage.
Well, this makes a bad society. That is not the kind of open society we want to live in. But unfortunately, when people are engaging in web design, this is often exactly the kind of thinking that they have. They think of all of the bad things that people might do and design everything around those worst case scenarios rather than saying, oh you know what, let us keep things as open as we can and wait until we see the bad behavior and then think about what to do about it. We call the police. We get an ambulance. Or in a digital context we simply change it back to the old version.
Chairman Lieberman. I am going to stop now and yield to Senator Akaka. Thanks again for being here.
The prepared statement of Mr. Wales appears in the Appendix on page 85.
This work is in the public domain in the United States because it is a work of the United States federal government (see 17 U.S.C. 105).