Welcome to Wikisource

Hello, Stout256, and welcome to Wikisource! Thank you for joining the project. I hope you like the place and decide to stay. Here are a few good links for newcomers:

 

You may be interested in participating in

Add the code {{active projects}}, {{PotM}} or {{Collaboration/MC}} to your page for current Wikisource projects.

You can put a brief description of your interests on your user page and contributions to another Wikimedia project, such as Wikipedia and Commons.

Have questions? Then please ask them at either

I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your username if you're logged in (or IP address if you are not) and the date. If you need help, ask me on my talk page, or ask your question here (click edit) and place {{helpme}} before your question.

Again, welcome! Beeswaxcandle (talk) 05:24, 16 April 2021 (UTC)

Art of War

Hi, I just wanted to thank you for your work on The Art of War, picking up where I left off and, by the looks of things, doing a massive amount of transcription work and correcting a bunch of my mistakes. I think you're doing a great job!

I'm curious about how our transcription processes compare, because the diacritic and Hanzi-laden scans didn't make it easy for me.

I've been back at work on the text myself, marking pages that used to have missing characters as needing validation. Once I've cleared those, I'll move on to the pages that still need transcribing fully. That is, of course, assuming there are any left by the time I'm done! --EnronEvolved (talk) 22:12, 6 October 2021 (UTC)

I used ABBYY Finereader 15 to OCR a PDF text I downloaded from the Internet Archive and did a quick visual proofread. I then copied one paragraph at a time from the Wikisource text and used my editor (Notepad++) to compare it to my independently transcribed text. If there were any differences, I referred back to the PDF to get the proper value and corrected the texts. Of course this method fails if both transcribed texts have the same error.
Whenever I saw a word missing a diacritical mark (usually a Chinese name), I would do a find and replace across my entire text. Hanzi was a bit tougher. Finereader did a good job, but I still had to do a visual comparison. If I saw a problem I could usually just type in the proper character using the Microsoft Chinese IME, but sometimes I needed to look it up online by radical/phonetic.
I've finished transcribing the entire text and completed a cursory proofread. I'll add all those pages to Wikisource this week. They will need another proofread... Stout256 (talk) 11:33, 7 October 2021 (UTC)
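For anyone who would rather script the paragraph-by-paragraph comparison described above than run it through an editor's compare feature, here is a rough sketch in Python. It is not the Notepad++ workflow itself, only an illustration of the same idea using the standard library's difflib. The function names (normalize, word_differences) and the sample sentences are made up for the example. It shares the limitation already mentioned: if both transcriptions contain the same error, nothing is flagged.

```python
import difflib
import unicodedata


def normalize(text: str) -> str:
    """Put text in Unicode NFC form and collapse runs of whitespace.

    This keeps line-wrapping and composed-vs-decomposed diacritics
    from showing up as spurious differences.
    """
    return " ".join(unicodedata.normalize("NFC", text).split())


def word_differences(wikisource_text: str, ocr_text: str) -> list[tuple[str, str, str]]:
    """Compare two transcriptions of the same paragraph word by word.

    Returns one (operation, wikisource_words, ocr_words) tuple for each
    run of words that differs; operation is 'replace', 'delete' or 'insert'.
    """
    a = normalize(wikisource_text).split()
    b = normalize(ocr_text).split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sample paragraphs: one copy is missing a diacritic.
wikisource_paragraph = "Sun Tzu said: The art of war is of vital importance to the State."
ocr_paragraph = "Sun Tzŭ said: The art of war is of vital importance to the State."

for operation, left, right in word_differences(wikisource_paragraph, ocr_paragraph):
    print(f"{operation}: {left!r} vs {right!r}")
```

Each reported difference still has to be checked against the scan to decide which reading is correct; the script only narrows down where to look.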
Oh, cool. I've just been working from the scans uploaded to Wikisource. I learned how to use Compose key sequences pretty quickly, to type all the diacritics, but as I'm not familiar with Han characters, particularly in the book's typeface, I just let them be. Suzukaze-c pounced on every page I marked with chinese-missing templates, so I'm sure they could pitch in to proofread the Hanzi. That being said, any tips on understanding the sort of typeface the text is using would be helpful. EnronEvolved (talk) 20:37, 7 October 2021 (UTC)