Wikidata deployment

 
Wikidata, the knowledge base project run by Wikimedia Deutschland (WMDE), began integrating with Wikisource on 14 January. This was the first phase of the deployment, in which Wikidata takes over inter-project links and centralises them within its own database. The deployment was not universally accepted on Wikisource and a vote to allow blocking of any bots making Wikidata edits was supported. However, in practice, the deployment went ahead with few problems and Lydia Pintscher, Product Manager for Wikidata at WMDE, remarked that the first phase had "gone rather smoothly". The second phase of the deployment, in which Wikisource will be able to access information held by Wikidata, is scheduled for 25th February.

The following sections are excerpted from the full report.

Introduction

When we started devising the plan for the IEG project, the only certainty was that we were starting an uncertain endeavor, with many hidden variables and unknown dangers, which didn’t let us make as many specific promises as we would have liked. However, 10 months after these initial steps, we can state confidently that our aim of generating community momentum has been successful. Future successes will depend on how much the community decides to get involved on a common Wikisource path.

Summary

  • Community contact through several means: hundreds of emails, discussions, Skype calls, IRC and IRL chats.
  • Outreach to the wider Wikimedia community through our blog posts on the Wikimedia blog.
  • Outreach to librarians, scholars, open knowledge activists and developers (also in conferences like: Wikimania, Amsterdam Hackathon, LODLAM, OAI8).
  • Networking people who worked on the same or related subject, but didn't know each other.
  • Establishing initial contacts with potential global partners (Internet Archive, Open Library, Google Books and others)
  • Leading book metadata implementation in Wikidata, now with a fixed date for deployment: January 2014.
  • Successful revitalization of the Wikisource mailing list.
  • Identification of issues that need a technical approach, plus proposals for and assistance to Wikisource-related GSoC projects.
  • Foundation of the Wikisource Community User Group, which led to a community survey and an international proofreading contest.

Outcomes and impact

Some of the results are:
  • Wikidata Books task force: the community has been discussing all the book-related properties on Wikidata, as this will affect several Wikimedia projects (Wikipedia, Wikisource, Commons). Many properties (a core set) have now been created. Integration with Wikidata will happen in January 2014. A study of the integration has been done.
  • Wikisource User Group: the group has been accepted by the Affiliations Committee, and we have a list of 40 participants so far. Via the User Group, two main accomplishments have been made:
    • Wikisource community survey: in order to learn what the community considers a priority for Wikisource and its user group, everybody was invited to take this survey. The idea of the survey started on the Wikisource mailing list on Sept 19 (see initial message by Aubrey) and, after a lengthy discussion and translation into 11 languages by volunteers, it went public on Oct 14 2013 for 2 weeks (see message). For the results see: Wikisource survey report
    • Wikisource 10th anniversary proofreading contest, run by volunteers on the Italian, Catalan and English Wikisources, from 24th Nov 2013. More information in this blog post. The results of the contest still have to be validated, but for the Italian project this was a huge and unexpected success: over 4000 pages were validated in only 7 days. The Catalan edition also exceeded all expectations. Being centered on proofreading rather than validating, its participants proofread in one week almost as much as communities 10 times larger.
  • The Google Summer of Code grantees worked on their respective projects with a clear view of how each project relates to the Wikisource workflow.
  • Reviving the number of messages on the Wikisource Mailing list (see the archives): February: 2, March: 11, April: 7, May: 47, June: 112, July: 23, August: 18, September: 32, October: 50, November: 95.

Comments on stated measures of success

These are our comments on the initially stated measures of success:
  • Regarding “Engage at least 10-20 people across all Wikisources during vision drafting”: in the end, instead of working towards a document, we decided to work towards action. More than 20 people have been engaged in talks, discussions, etc. and around 40 people joined the application process for the WCUG.
  • Regarding “Support of the vision document by at least 50 volunteers”: since we shifted our initial goal, we consider participation in the survey to be equivalent to this. Around 250 people answered it.
  • Regarding: “Finish all documents on time”, well, we hope that we are on time :)
  • Regarding: “The drive generates at least X projects…” there have been 4 GSoC projects plus 2 IEG applications related to Wikisource; additionally, volunteers have shared new scripts on the mailing list (thanks to Alex Brollo and to Joan Creus).
  • Regarding: “The vision is presented to the Wikimedia movement…” this is a task for the newly founded WCUG and we are sure that it will be done since it is the wish of the community too.
Strategic impact
Option A: How did you increase participation in one or more Wikimedia projects?

Option B: How did you improve quality on one or more Wikimedia projects?

Option C: How did you increase the reach (readership) of one or more Wikimedia projects?

We can say we worked mainly on a necessary condition for A, B, and C to happen, which is the coordination of Wikisource language communities, with the aim of fostering the creation of an active group of international Wikisourcers. This has been a success: we took the first steps towards the recognition of the community-supported Wikisource Community User Group and, thanks to this, communication on the official Wikisource mailing list also increased. Only time will tell whether actual participation, quality or readership on Wikisource will increase. Certainly, all the actions undertaken by this IEG project were aimed at all three of these objectives.

  • Google Summer of Code, Wikisource survey and anniversary contest were directed to ease the work on Wikisource, thus increasing participation. For example, the Italian Wikisource contest brought in more than 160 new editors.
  • Outreach activities and discussions were aimed at increasing the quality of the content (via GLAM partnerships).
  • The Wikisource anniversary competitions, the use of Facebook and Twitter, and the blog posts on the Wikimedia blog were done with the intention of increasing readership (and some of them also participation). On the Italian Wikisource, readership increased by thousands of visitors, as it did with the Catalan contest.
Additional impact
We did not anticipate having an impact on software development. The Google Summer of Code participants started a few very interesting projects, and we hope these can be finalized and will provide a better infrastructure for Wikisource. The refactoring of the Proofread Page extension will be deployed soon.

Key Learnings

What worked well
As stated during the midpoint report:
  • GSoC proposed projects, although only one is going to be deployed so far
  • Blog posts announcing Wikisource progress
  • Wikidata involvement, a key partner for many future plans

Also, during the second term, we think that the survey, started by the WCUG and supported by us, will help reach a better understanding of who Wikisourcerors are and what they want. Whoever wants to “do something for the community” can now take a look at their answers and gain legitimate insights into what is acceptable and what is really wished for. Also, by asking the participants about some topics, we raised awareness of what is coming and what could come.

However, we are also aware that this “excitement-building” is a two-edged sword. If the WCUG and the wikisource mailing list keep being used as a tool to empower its members and learn from each other, it will gain traction up to the level that keeps its activity over time independently of our participation, which is what we hope.

It is critical that we communicate to the users effectively about how to use these “invisible” tools and, if possible, extend the trust and reliance on each other that is already happening at a small scale, to a larger extended family of deeply involved Wikisourcerors.

Yet to improve
One of our pending issues is building familiarity between communities that have no links with each other, or that have had no opportunity to develop them. We believe that one way of improving this would be to advocate for involved users of each community to travel and meet at Wikimania 2014. Health and diversity in our community can only be reached if we make an extra effort and take care of big and small communities that have not been involved at the international level, perhaps because of this lack of trust and because they do not know the people on the other side. As the Wikisource vision unfolds, there are more and more topics to talk about, and to learn about from each other. We discovered fun activities, like holding contests or nagging for advertisement, and that knowledge can be shared with each other and improved upon.
What didn’t work
In the midpoint report, we explained in detail what the major challenges were for us. Those remained, even if, truth be told, the Survey and the Contest helped a lot in gathering people around and getting things done.

Still:

  • Lack of communication between Wikisource language communities is still one of the major problems. This is intertwined with the fact that the global Wikisource userbase is small, especially compared to the Wikipedia one. This is not an easy issue to solve.

Things are slowly changing, we'll see how much.

  • Lack of communication between cross-project communities is again something not easy to fix, but Wikidata is helping in that regard. At least in the Catalan and Italian cases, Wikipedia communities agreed to use the sitenotice to promote the proofreading contest, which led to a huge increase in readers and participants. Things like this can help boost Wikisource events and, we hope, also collaboration and trust between projects.

Next steps and opportunities

This project accomplished some of its purposes, such as setting up a team of dedicated Wikisourcerors and getting them to start collaborating. Now there is a formal Wikisource Community User Group, there is an active mailing list, there is a date for the integration with Wikidata, and there are data from which to start developing a strategy.
  • A new IEG could take everything from here, and directly work for major improvements, backed up by the results of the survey.

We stepped up as "promoters" (meaning: those who signed the contract) of the Wikisource Community UG, and could facilitate the conversations about the direction of the group.

Other things are:

  • Organize community meet up for next Wikimania, bring at least one internationally-involved member from each community
  • Help to coordinate the migration of 20th-century based Wikisource-tech into 21st-century Wikidata tech. Identify weaknesses, help build new processes
  • Involve the community with the interconnection with external libraries, start an effort to add identifiers to book editions that help to cross-match with other important systems (Internet Archive, Open Library, etc)
  • Refurbish the Wikipedia "Template:Infobox book", so it can transition to use the new data model, work with Wikipedia book community to find the best suited solution.
  • Support the Wikidata endeavor of centralizing Wikipedia’s bibliographic information
  • Find organizations that wish to tackle any of the biggest identified problems by the community
  • Keep proposing GSoC projects, find mentors

Grantee reflection

Aubrey: I very much enjoyed doing this project, because I was perfectly aware that it was needed.

I deeply believe in Wikisource, and in its potential to be the best digital library in the world. I do want to make it happen. Wikisource is just in the initial phase of its life: we can accomplish much, much more. This is why a project like this had to happen: we need community coordination and collaboration, we need software development and maintenance, and we need to be recognized and engaged by the wider Wikimedia (and Wikipedia) community. Many things in this IEG project could have been done differently (and probably better), but I am satisfied with the results we accomplished. Probably the best thing, for me, has been to meet Micru: we didn't know each other before starting this, and it was a big leap of faith to begin a funded project with a "stranger". We found out very soon that our competences and skills were complementary, and I enjoyed very much working with him. We collaborated almost seamlessly, and this is not trivial, nor to be taken for granted.

We just expressed our opinion on the "WMF Seal of Approval", and I confirm that, especially in the first part of the project, it was very useful. But in the second part, we had the Wikisource Community User Group, and I think that that was even better. We had a group that was collective, and other people felt more comfortable collaborating within that group (at least, this was my feeling). I find this to be a delicate but valuable balance: being able to serve the community by doing what the community doesn't want to do, the dirty work (be it crunching the survey data, or just drafting the survey, or writing a blog post every month, or dealing with the bureaucracy of creating a user group).

Finally, I really enjoyed the fact that the WMF helped us with the Qualtrics software. And it struck me as a very good way to help the wider community: provide software and tools for evaluation, surveys, analytics, whatever. I feel that this could be scaled further: single users or chapters can't afford software or tools like that, or may not even know that they exist. Maybe the WMF could make them available to more people and more widely: for example, if they bought a version of ABBYY Finereader OCR for enterprise (we should check the prices, of course) and integrated it within Wikisource, that would be very much appreciated (a request for OCR software was one outcome of the Wikisource survey).

Micru: I share Aubrey's satisfaction about how effective and pleasant the teamwork between us two was. From not knowing each other, we started to work together and to communicate almost every day, and I'm really thrilled about how well it went. I also want to thank Tpt for all the support that he offered us with his technical knowledge and his readiness to evaluate new ideas. And of course, if this project advanced at all, it was because the community at large reacted positively to this initiative.

Like many others, before this grant I had never been involved in any similar international effort. It was a great revelation to step out of those constraints and to collaborate at a larger scale than I was used to. I also found my own answers to questions like "how is it that something so simple hasn't been fixed yet?", as things that look easy on the surface are in fact complex problems that require not only technical solutions, but also social approaches to build consensus, and to foster discussion up to a point where adequate action might be triggered. The grant was great for being committed to a well-defined goal over a short span of time, but of course some of the problems being addressed require longer time-scales.

We see Wikisource not only as a continuation of what it is today; it might take a larger role in the ecosystem of existing free knowledge harbors around the world, if the community is up to the task of self-improvement and of communicating effectively how Wikisource efforts contribute to building the sum of all human knowledge. And there are other challenges too, like attracting diversity, and giving a voice to under-represented groups that share a common goal with us.

I appreciate the support given by the IEG team, from the volunteers who evaluated our initial application and helped us to shape it, to the logistic, economic and emotional support, since from our own frame of reference it was not always clear whether we were advancing in the right direction. That Siko was always there with helpful advice is something that we appreciate greatly, and we hope that both the WMF people involved and the stakeholders are satisfied with what was accomplished thanks to this grant, which we are aware is complex to evaluate.

Web fonts and language selector disabled

The Universal Language Selector, a piece of software that controls the display of fonts, has been partially disabled across Wikimedia projects following complaints that it caused pages to load slowly. It can be re-enabled by registered users via their preferences but is unavailable to unregistered users. As a result, some text will not display properly, the {{blackletter}} template no longer functions, and special dyslexia-friendly fonts cannot be used. The software will be re-enabled by default once the page-loading problem is resolved.

Featured text for February 2014

 
Illustration of East Indiamen (1720) from The Clipper Ship Era.

February's featured text is "The Clipper Ship Era", a 1911 history book by former sea captain Arthur Hamilton Clark.

Clark's history represents the pinnacle of wooden sailing ships in the mid-nineteenth century (1843–1869). The period is illustrated through "biographies" of some of the greatest ships of the day and accounts of their races across oceans to get their cargo, from tea to opium, to market.

The Clipper Ship Era began in 1843 as a result of the growing demand for a more rapid delivery of tea from China; continued under the stimulating influence of the discovery of gold in California and Australia in 1849 and 1851, and ended with the opening of the Suez Canal in 1869. These memorable years form one of the most important and interesting periods of maritime history. They stand between the centuries during which man navigated the sea with sail and oar—a slave to unknown winds and currents, helpless alike in calm and in storm—and the successful introduction of steam navigation, by which man has obtained mastery upon the ocean.

After countless generations of evolution, this era witnessed the highest development of the wooden sailing ship in construction, speed, and beauty. Nearly all the clipper ships made records which were not equalled by the steamships of their day; and more than a quarter of a century elapsed, devoted to discovery and invention in perfecting the marine engine and boiler, before the best clipper ship records for speed were broken by steam vessels. During this era, too, important discoveries were made in regard to the laws governing the winds and currents of the ocean; and this knowledge, together with improvements in model and rig, enabled sailing ships to reduce by forty days the average time formerly required for the outward and homeward voyage from England and America to Australia.

Collaborations for February 2014

After a very successful Proofread of the Month in January, during which 39 pamphlets from Victorian Britain of lengths ranging from 4 to 150 pages were completed, we turn our focus to fiction. The first work is Virginia Woolf's novel The Voyage Out (1920 edition). This, the first of Woolf's nine novels, uses the literary devices that were to make her a leading novelist of the 20th century. Its setting on a boat going out to South America gave her the opportunity to satirise Edwardian life and mores.

Connecting author pages with Wikidata items, which was last month's Maintenance of the Month task, continues in February. Wikidata gives us the possibility to connect our pages to their items and, from February 25 onward, to use data in our pages. Author pages could use Wikidata to obtain the author's given name, surname, birth and death years, description, alias(es), and image, as well as several authority control identifiers.
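As a rough illustration of the kind of lookup an author page might perform, the Python sketch below pulls a few of these fields out of a Wikidata entity in its standard JSON form. It is not the mechanism Wikisource itself will use, which is not described here. The function names and the sample entity are hypothetical and hand-written for the example; only the property IDs (P569 date of birth, P570 date of death, P18 image) are real Wikidata identifiers.

```python
from typing import Any, Optional

# Real Wikidata property IDs; everything else in this sketch is illustrative.
P_BIRTH = "P569"   # date of birth
P_DEATH = "P570"   # date of death
P_IMAGE = "P18"    # image (a Wikimedia Commons file name)


def first_value(entity: dict, prop: str) -> Optional[Any]:
    """Return the raw value of the first statement for `prop`, or None."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            return snak["datavalue"]["value"]
    return None


def year_of(time_value: Optional[dict]) -> Optional[int]:
    """Extract the year from a time value such as '+1882-01-25T00:00:00Z'."""
    if not time_value:
        return None
    text = time_value["time"]
    sign = -1 if text.startswith("-") else 1
    return sign * int(text.lstrip("+-").split("-")[0])


def author_summary(entity: dict, lang: str = "en") -> dict:
    """Collect the fields an author page header might display."""
    return {
        "name": entity.get("labels", {}).get(lang, {}).get("value"),
        "description": entity.get("descriptions", {}).get(lang, {}).get("value"),
        "aliases": [a["value"] for a in entity.get("aliases", {}).get(lang, [])],
        "birth_year": year_of(first_value(entity, P_BIRTH)),
        "death_year": year_of(first_value(entity, P_DEATH)),
        "image": first_value(entity, P_IMAGE),
    }


def _claim(value: Any) -> list:
    """Helper to build a one-statement claim list for the sample below."""
    return [{"mainsnak": {"snaktype": "value", "datavalue": {"value": value}}}]


# Hand-written, heavily trimmed sample entity (not fetched from Wikidata).
SAMPLE_ENTITY = {
    "labels": {"en": {"language": "en", "value": "Virginia Woolf"}},
    "descriptions": {"en": {"language": "en", "value": "English writer"}},
    "aliases": {"en": [{"language": "en", "value": "Adeline Virginia Stephen"}]},
    "claims": {
        P_BIRTH: _claim({"time": "+1882-01-25T00:00:00Z"}),
        P_DEATH: _claim({"time": "+1941-03-28T00:00:00Z"}),
        P_IMAGE: _claim("Example.jpg"),  # placeholder file name
    },
}

print(author_summary(SAMPLE_ENTITY))
```

Given name and surname are left out of the sketch because, on Wikidata, those statements point to other items rather than holding plain text, so resolving them would need a second lookup.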

Two administrators were confirmed in January 2014:

Two administrators are having their confirmation discussions in February 2014:

One user was appointed as an administrator in January 2014:

One user was appointed as a checkuser in January 2014:

Milestones

Over 1–2 January, the Portuguese Wikisource deleted about three quarters of its pages of text, reducing its total to just 26,180 text units (content pages, or pages of text in the mainspace). This was a clean-up exercise made necessary by two things. The first is Dictionary Candido de Figueiredo (1913), a Brazilian Portuguese dictionary, which is in scope of Wikisource but was both incomplete and flawed, while also representing over 64,000 pages of text and ten categories. The second is a collection of fragments from early use of the Proofread Page extension. All of these pages may be restored in time.

The Malayalam Wikisource reached several milestones in January. It reached the 5,000 text unit mark on 3 January and then the 10,000 text unit mark on 28 January, adding 5,000 pages in a month. 17th January also marked the project's 100,000th page edit.

On 13 January, the Japanese Wikisource reached 10,000 total pages.

On 31 January, English Wikisource reached 150,000 validated pages. This occurred 13 months after reaching 100,000, which equates to an average of over 120 pages per day during that time.
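
The "over 120 pages per day" figure can be checked with a short calculation. The sketch below assumes that "13 months" is approximated using the mean length of a calendar month, since the exact date on which 100,000 pages was reached is not given here; the function name is illustrative.

```python
# Assumption: a month is approximated by the mean Gregorian month length.
DAYS_PER_MONTH = 365.2425 / 12


def average_pages_per_day(start_pages: int, end_pages: int, months: float) -> float:
    """Average number of pages validated per day over the given period."""
    days = months * DAYS_PER_MONTH
    return (end_pages - start_pages) / days


rate = average_pages_per_day(100_000, 150_000, 13)
print(f"{rate:.1f} validated pages per day")
```

Under that assumption, 50,000 pages over roughly 396 days comes to a little more than 126 pages per day, consistent with the figure quoted above.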

Notes

  • Wikimania 2014, which will be held on 6-10 August 2014 in London, United Kingdom, is now accepting submissions for presentations and requests for scholarships to pay for attendance.
  • A modification to the main page is being discussed. The new version currently features a split between fiction and non-fiction in new texts, rearranges the page, and may include a new "poem of the day" element.
  • The Wikimedia Foundation is considering allowing the use of the non-free MP4 video format on its projects. A request for comment is in progress on Wikimedia Commons.