Wikisource:Portal classification system adaptation

Portal classification system adaptation

A guide to the adaptation of the Library of Congress Classification system (LCCS) to Wikisource, for use in the Portal namespace, and a subsequent log of changes.

This page describes the method by which the Library of Congress Classification system (LCCS) was adapted to Wikisource and the subsequent changes to that system.

In addition to casual interest, this page can be used to:

  • Explain the differences between the two systems and solve related problems with the classification of works.
  • Solve any problems that may occur when cross referencing the original and adapted systems.
  • Provide a basis for future versions of the system, if necessary, or for incidental changes.
  • Provide a blueprint if it is necessary to revert or repair parts of the system.

Original

The outline of the Library of Congress Classification system can be read in full on Wikisource at Library of Congress Classification, which includes a search function for finding topics and their classifications. It can also be read on the Library of Congress's website. More information can be found in the Wikipedia article.

The LCCS uses 21 classes, each represented by a letter of the alphabet. Each class is broken down into subclasses, represented by one or two further letters of the alphabet. More specific classification then uses a number from 1 to 9999, which is itself followed by a Cutter number.

Adaptation

Background

Wikisource's portal space was not initially organised in any way. Most indices were, at that time, in the Wikisource namespace, and also not organised (leading to duplication in some cases). It was desired that any subject index for Wikisource, which is essentially the purpose of portal space on the project, should use a published and authoritative system rather than something homegrown. This also has the advantage of being complete and having been already tested in practice.

The Dewey Decimal system qualifies under these terms but it is largely under copyright (the original version is in the public domain but it has gaps where reference to the modern world would be needed; adding these will either cause confusion or risk copyright infringement). LCCS is a work of the United States government and therefore in the public domain.

As this classification system is for use with portals and not individual books, the level of detail in the original system is unnecessary. The numeric portions of the call numbers were therefore dropped, leaving just the subclass.

For example, the official Library of Congress call number for On the origin of species by means of natural selection by Charles Darwin is QH365.O2. On Wikisource, any portals corresponding to this book, and those like it, would be classified by just the subclass, the initial alphabetic portion of this call number, or QH.
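The reduction from a full call number to a portal subclass can be sketched in a few lines of Python. This is only an illustration: the function name `portal_subclass` is hypothetical, and it assumes the subclass is simply the run of up to three letters at the start of the call number.

```python
import re

# Assumption: the subclass is the leading run of one to three letters.
_SUBCLASS_RE = re.compile(r"^\s*([A-Z]{1,3})")


def portal_subclass(call_number: str) -> str:
    """Return the alphabetic prefix (the subclass) of an LCC call number."""
    match = _SUBCLASS_RE.match(call_number.upper())
    if match is None:
        raise ValueError(f"no alphabetic prefix in {call_number!r}")
    return match.group(1)


print(portal_subclass("QH365.O2"))  # QH
```

Everything after the letters (the number and the Cutter number) is discarded, which matches the level of detail a portal needs.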

Further, as explained below, the system had to be adapted during implementation to fit Wikisource's specific needs. Two more classes (I and X) and several more subclasses were added, while one class (E) was slightly reinterpreted to fit pre-existing material. This follows existing practice when adapting the LCC system to local or specific use. Wikipedia notes that "The National Library of Medicine classification system (NLM) uses the classification scheme's unused letters W and QS–QZ. Some libraries use NLM in conjunction with LCC, eschewing LCC's R (Medicine). Others prefer to use the LCC scheme's QP-QR schedules and include Medicine R."

Split subclasses

UPDATE: In a later re-examination of this particular example, subclass PT, it was obvious that all of these languages were actually in the Germanic language family. So they could all be conflated together as simply "Germanic literature" without the need for any subclasses. This was done to simplify the list of subclasses in Class P and remove minor subclasses that were unlikely to be used. The point stands, however, that some other official subclasses needed to be split to meet Wikisource's needs.

The subclasses as they exist are not all directly usable as portals on Wikisource. The most extreme example is Subclass PT: German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature. The directly equivalent portal, Portal:German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature, is impractical for use on Wikisource. Therefore, where this problem occurs, the Library of Congress Classification system subclasses have been divided into new subclasses. The first term of each retains the old classification; subsequent terms add a third letter to create the new classification.

For example, Subclass PT: German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature -Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature becomes:

  • Subclass PT: German literature
  • Subclass PTA: Dutch literature
  • Subclass PTB: Flemish literature
  • Subclass PTC: Afrikaans literature
  • Subclass PTD: Scandinavian Literature
  • Subclass PTE: Old Norse Literature
  • Subclass PTF: Modern Icelandic Literature
  • Subclass PTG: Faroese Literature
  • Subclass PTH: Danish Literature
  • Subclass PTI: Norwegian Literature
  • Subclass PTJ: Swedish Literature
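The splitting rule described above (the first term keeps the old code, each later term gains a third letter) could be expressed roughly as follows. The helper name `split_subclass` is hypothetical, and the sketch deliberately ignores the exceptions mentioned elsewhere on this page, such as Subclass AC and Class K.

```python
from string import ascii_uppercase


def split_subclass(code: str, titles: list[str]) -> list[tuple[str, str]]:
    """Pair each title with a code: the first keeps `code`, later ones get code + A, B, ..."""
    if not titles:
        raise ValueError("at least one title is required")
    if len(titles) - 1 > len(ascii_uppercase):
        raise ValueError("more titles than available third letters")
    result = [(code, titles[0])]
    for letter, title in zip(ascii_uppercase, titles[1:]):
        result.append((code + letter, title))
    return result


for new_code, title in split_subclass(
    "PT", ["German literature", "Dutch literature", "Flemish literature"]
):
    print(new_code, title)
```

Run on the first three terms of Subclass PT, this yields PT, PTA and PTB, in line with the list above.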

Note: Subclass AC is an exception to this pattern. There was an existing Collective works index when this system was implemented, so this was used as the first term instead of Collections, which became the second term.

This is more complicated within Class K, Law, which is explained separately (below).

Some classes contain subclasses with the same code as the class itself. For example, subclass P (Philology and Linguistics) within Class P (Language and Literature). These are represented by a non-alphabet symbol as the second letter of the classification. The examples of this here use an asterisk; however, this causes problems in practice as the wikicode interprets this as a bullet point in some cases. Any symbol or number can be used in its place instead (for example, a hyphen).
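As a small illustration of the placeholder convention, the following sketch pads a class-level subclass with a symbol. The function name `subclass_code` and its parameters are assumptions made for this example only.

```python
def subclass_code(class_letter: str, subclass: str, placeholder: str = "*") -> str:
    """Return the code for a subclass, padding class-level subclasses with a placeholder."""
    if subclass == class_letter:
        # Subclass shares its code with the class, e.g. subclass P within Class P.
        return class_letter + placeholder
    return subclass


print(subclass_code("P", "P"))        # P*
print(subclass_code("P", "P", "-"))   # P-
print(subclass_code("P", "PA"))       # PA
```

Passing a hyphen as the placeholder avoids the bullet-point problem that the asterisk causes in wikicode.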

New Subclasses
| Original code | Original title | New code | New title |
| AC | Collections. Series. Collected works | ACA | Collections |
| | | ACB | Series |
| AG | Dictionaries and other general reference works | AGA | Reference Works |
| AM | Museums. Collectors and collecting | AMA | Collectors and Collecting |
| AY | Yearbooks. Almanacs. Directories | AYA | Almanacs |
| | | AYB | Directories |
| AZ | History of scholarship and learning. The humanities | AZA | The Humanities |
| BL | Religions. Mythology. Rationalism | BLA | Mythology |
| | | BLB | Rationalism |
| BP | Islam. Bahaism. Theosophy, etc. | BPA | Bahaism |
| | | BPB | Theosophy |
| CD | Diplomatics. Archives. Seals | CDA | Archives |
| | | CDB | Seals |
| DB | Austria - Liechtenstein - Hungary - Czechoslovakia | DBA | History of Liechtenstein |
| | | DBB | History of Hungary |
| | | DBC | History of Czechoslovakia |
| DC | France - Andorra - Monaco | DCA | History of Andorra |
| | | DCB | History of Monaco |
| DG | Italy - Malta | DGA | History of Malta |
| DK | Russia. Soviet Union. Former Soviet Republics - Poland | DKA | History of the Soviet Union[1] |
| | | DKB | History of the Former Soviet Republics |
| | | DKC | History of Poland |
| DL | Northern Europe. Scandinavia | DLA | History of Scandinavia[1] |
| DP | Spain - Portugal | DPA | History of Portugal |
| G* | Geography (General). Atlases. Maps | G*A | Atlases and Maps |
| GA | Mathematical geography. Cartography | GAA | Cartography |
| GF | Human ecology. Anthropogeography | GFA | Anthropogeography |
| HB | Economic theory. Demography | HBA | Demography |
| HD | Industries. Land use. Labor | HDA | Land Use |
| | | HDB | Labor |
| HN | Social history and conditions. Social problems. Social reform | HNA | Social Problems |
| | | HNB | Social Reform |
| HQ | The family. Marriage. Women | HQA | Marriage |
| | | HQB | Women |
| HT | Communities. Classes. Races | HTA | Classes |
| | | HTB | Races |
| HV | Social pathology. Social and public welfare. Criminology | HVA | Social and Public Welfare |
| | | HVB | Criminology |
| HX | Socialism. Communism. Anarchism | HXA | Communism |
| | | HXB | Anarchism |
| JS | Local government. Municipal government | JSA | Municipal Government |
| JV | Colonies and colonization. Emigration and immigration. International migration | JVA | Emigration and Immigration |
| | | JVB | International Migration[1] |
| PA | Greek language and literature. Latin language and literature | PAA | Latin Language And Literature |
| PB | Modern languages. Celtic languages | PBA | Celtic Languages |
| PD | Germanic languages. Scandinavian languages | PDA | Scandinavian Languages |
| PG | Slavic languages and literatures. Baltic languages. Albanian language | PGA | Baltic Languages |
| | | PGB | Albanian Language |
| PH | Uralic languages. Basque language | PHA | Basque Language |
| PL | Languages and literatures of Eastern Asia, Africa, Oceania | PLA | Languages and Literatures of Africa |
| | | PLB | Languages and Literatures of Oceania |
| PM | Hyperborean, Indian, and artificial languages | PMA | Indian Languages |
| | | PMB | Artificial Languages |
| PQ | French literature - Italian literature - Spanish literature - Portuguese literature | PQA | Italian Literature[1] |
| | | PQB | Spanish Literature[1] |
| | | PQC | Portuguese Literature[1] |
| PT | German literature - Dutch literature - Flemish literature since 1830 - Afrikaans literature - Scandinavian literature - Old Norse literature: Old Icelandic and Old Norwegian - Modern Icelandic literature - Faroese literature - Danish literature - Norwegian literature - Swedish literature | PTA | Dutch Literature[1] |
| | | PTB | Flemish Literature[1] |
| | | PTC | Afrikaans Literature[1] |
| | | PTD | Scandinavian Literature[1] |
| | | PTE | Old Norse Literature[1] |
| | | PTF | Modern Icelandic Literature[1] |
| | | PTG | Faroese Literature[1] |
| | | PTH | Danish Literature[1] |
| | | PTI | Norwegian Literature[1] |
| | | PTJ | Swedish Literature[1] |
| RM | Therapeutics. Pharmacology | RMA | Pharmacology |
| SH | Aquaculture. Fisheries. Angling | SHA | Fisheries[1] |
| | | SHB | Angling |
| TC | Hydraulic engineering. Ocean engineering | TCA | Ocean Engineering |
| TD | Environmental technology. Sanitary engineering | TDA | Sanitary Engineering |
| TE | Highway engineering. Roads and pavements | TEA | Roads and Pavements[1] |
| TK | Electrical engineering. Electronics. Nuclear engineering | TKA | Electronics |
| | | TKB | Nuclear Engineering |
| TL | Motor vehicles. Aeronautics. Astronautics | TLA | Aeronautics |
| | | TLB | Astronautics |
| TN | Mining engineering. Metallurgy | TNA | Metallurgy |
| UE | Cavalry. Armor | UEA | Armor[1] |
| UG | Military engineering. Air forces | UGA | Air Forces |
| VK | Navigation. Merchant marine | VKA | Merchant Marine |
| VM | Naval architecture. Shipbuilding. Marine engineering | VMA | Shipbuilding[1] |
| | | VMB | Marine Engineering[1] |
| Z | Books (General). Writing. Paleography. Book industries and trade. Libraries. Bibliography | ZB[2] | Books |
| | | ZC[2] | Writing |
| | | ZD[2] | Paleography |
| | | ZE[2] | Book Industries and Trade |
| | | ZF[2] | Libraries |
| | | ZG[2] | Bibliography |
  1. These subclasses were later removed after further thought on the matter.
  2. See Class Z for further information.

Classes E & F: History of the Americas

The Library of Congress Classification system has two classes that cover the history of the Americas. Class E covers the United States while Class F covers the "local history" of the United States in addition to the history of British America, Canada, Dutch America, French America, Latin America and Spanish America.

At the time this classification system was implemented, Wikisource already had Portal:States of the United States with subportals for each state. Therefore, this was used as the equivalent of Class E with little change to the pre-existing portal. Class F was left to cover all other aspects of the history of the Americas, including any aspects of United States history that apply to more than one state.

New classes X & I

During implementation of the system, it was necessary to create two entirely new classes unique to Wikisource. Both use one of the letters omitted from the original classification system.

First, some pre-existing indices on Wikisource did not fit into the Library of Congress Classification system. To accommodate these, the new Class X ("Wikisource") was created (X being a traditional wildcard term). This class is generally for Wikisource-specific classification. Subclasses are added to Class X as and when one is needed, starting with subclasses for WikiProjects and specific eras (i.e. Ancient, Medieval, etc.).

Second, there was another pre-existing index, Texts by Country (and its subportals and indices), that did not easily fit any class in the system. These portals were national indices that covered each nation in general instead of the LCCS's more specialised areas (history of-, law of-, literature of- etc). Instead of dismantling or severely modifying a functioning index, this was declared to be a new Class I (I was the first unused letter in the alphabet). Each portal in this class serves as a hub for that nation, including works and/or linking to more specialised portals as necessary.
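The choice of letters can be checked with a short Python sketch. It assumes the 21 official class letters are A-H, J-N, P-V and Z; the variable names are illustrative only.

```python
from string import ascii_uppercase

# The 21 class letters used by the official system (assumed list).
LCC_CLASSES = "ABCDEFGHJKLMNPQRSTUVZ"

# Letters of the alphabet that the official system leaves unused.
unused = [letter for letter in ascii_uppercase if letter not in LCC_CLASSES]

print(unused)     # ['I', 'O', 'W', 'X', 'Y']
print(unused[0])  # I
```

Both I and X appear in the unused list, and I is the first of them alphabetically, which is consistent with the reasoning given for the two new classes.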

Class K: Law

Class K of the Library of Congress Classification system already makes extensive use of the third letter of the classification, which makes some adaptation (as described above) more difficult. Subclasses could not always be created by adding a letter; some were created by changing the existing third letter to the nearest unused letter. Others required more drastic alterations, changing the second letter of the classification for a batch of subjects and then selecting appropriate third letters from there.

The complete list of subclasses is extensive and can be found at: Portal:Law/Subclasses

New Law Subclasses
| Original code | Original title | New code | New title |
| K | Law (general) | KA | Law (general) |
| KD | Law of the United Kingdom and Ireland | KD | Law of the United Kingdom and Ireland |
| | | KDA | Law of the United Kingdom |
| | England | KDB | Law of England |
| | Wales | KDD | Law of Wales |
| KDC | Scotland | KDC | Law of Scotland |
| KDZ | America. North America | KDZ | Law of North America |
| | Organization of American States (OAS) | KDV | Organization of American States |
| | Bermuda | KDW | Law of Bermuda |
| | Greenland | KDX | Law of Greenland |
| | St. Pierre and Miquelon | KDZ | Law of St. Pierre and Miquelon |
Note: Added KDU History of Law in North America because other continents had similar classifications.
| KEN | Newfoundland | KEJ | Law of Newfoundland |
| | Northwest Territories | KEK | Law of the Northwest Territories |
| | Nova Scotia | KEL | Law of Nova Scotia |
| KF | Law of the United States | KF | Law of the United States |
| | Federal law. Common and collective state law | KFA | Federal Law of the United States |
| | Individual states | KFB | State Law of the United States |
Note: KFA - KFW cover individual states; not enough available classifications.
| KFZ | Northwest Territory | KFY | Law of the North West Territory of the United States |
| | Confederate States of America | KFZ | Law of the Confederate States of America |
| KGH | Panama | KGH | Law of Panama |
| | Panama Canal Zone | KGI | Law of the Panama Canal Zone |
| KJ | Europe | KJ | Law of Europe |
| | History of Law | KJB | History of Law in Europe |
| | Germanic law | KJD | Germanic Law |
| KJP | Czechoslovakia | KJP | Law of the Czech Republic |
| | | KJQ | Law of Slovakia |
| KJT | Finland | KJT | Law of Finland |
| | France | KJU | Law of France |
| KKK | Luxembourg | KKK | Law of Luxembourg |
| | Malta | KKO | Law of Malta |
| KLH | Georgia (Republic) | KLH | Law of Georgia (country) |
| | Lithuania, see KKJ | KLJ | Law of Lithuania |
Note: With KKJ and adjacent classifications in use, Lithuania remains here.
| KLP | Ukraine (1919-1991) | KLP | Law of Ukraine |
| | Zakavkazskaia Sotsialisticheskaia Federativnaia Sovetskaia | KLO | Law of the Transcaucasian Socialist Federal Soviet Republic |
| KLR | Kazakhstan | KLR | Law of Kazakhstan |
| | Khorezmskaia Sovetskaia Sotsialisticheskaia Respublika (to 1924) | KLU | Law of the Khorezm Socialist Soviet Republic |
| KM | Asia | KM | Law of Asia |
| | Middle East. Southwest Asia | KMA | Law of the Middle East |
| KMF | Armenia (Republic) | KMF | Law of Armenia |
| | Bahrain | KMB | Law of Bahrain |
| KMG | Gaza | KMG | Law of Palestine |
| KMM | West Bank | | |
| KMQ | Palestine | | |
| KNT-KNU | [India] States, cities, etc. | KNT | State Law of India |
| | | KNU | Municipal Law of India |
| KPH | States of East and West Malaysia | KPH | Law of the States of East and West Malaysia |
| | Maldives | KPI | Law of the Maldives |
| KQ | Africa | KQ | Law of Africa |
| | History of law | KQA | History of Law in Africa |
| KQP | British Indian Ocean Territory | KQP | Law of the British Indian Ocean Territory |
| | British Somaliland | KQQ | Law of British Somaliland |
| KSE | Equatorial Guinea | KSE | Law of Equatorial Guinea |
| | Ifni | KSF | Law of Ifni |
| KSG | Italian East Africa | KSG | Law of Italian East Africa |
| | Italian Somaliland | KSI | Law of Italian Somaliland |
| KSV | Mauritius | KSV | Law of Mauritius |
| | Mayotte | KSM | Law of Mayotte |
| KTN | Spanish West Africa | KTN | Law of Spanish West Africa |
| | Spanish Sahara | KTO | Law of Spanish Sahara |
| KU | Pacific Area | KU | Law of Oceania |
| | Australia | KUA | Law of Australia |
| KUA-KUH | States and territories | | |
| | | KUB | Law of New South Wales |
| | | KUC | Law of the Northern Territory |
| | | KUD | Law of Queensland |
| | | KUE | Law of South Australia |
| | | KUF | Law of Tasmania |
| | | KUG | Law of Victoria |
| | | KUH | Law of Western Australia |
| | | KUI | Law of the Ashmore and Cartier Islands |
| | | KUJ | Law of Christmas Island |
| | | KUK | Law of the Cocos (Keeling) Islands |
| | | KUL | Law of the Coral Sea Islands Territory |
Note: Added KVA History of Law in Oceania because other continents had similar classifications.
| KVH | American Samoa | KVH | Law of American Samoa |
| | British New Guinea (Territory of Papua) | KVJ | Law of British New Guinea |
| KVP | French Polynesia | KVP | Law of French Polynesia |
| | German New Guinea | KVO | Law of German New Guinea |
| KVS | Marshall Islands | KVS | Law of the Marshall Islands |
| | Micronesia (Federated States) | KST | Law of Micronesia |
| | Midway Islands | KSV | Law of the Midway Islands |
| KVU | Nauru | KVU | Law of Nauru |
| | Netherlands New Guinea | KVX | Law of Netherlands New Guinea |
| KWL | Pitcairn Island | KWL | Law of Pitcairn Island |
| | Solomon Islands | KWM | Law of the Solomon Islands |
| KWT | Wake Island | KWT | Law of Wake Island |
| | Wallis and Futuna Islands | KWU | Law of Wallis and Futuna |

Some sections from the Law of the Caribbean in subclass KG were moved to the vacant subclass KC due to space limitations.

Moved Law Subclasses
| Original code | Original title | New code | New title |
| KGJ | Anguilla | KCA | Law of Anguilla |
| KGK | Aruba | KCB | Law of Aruba |
| KGL | Barbados | KCC | Law of Barbados |
| | Bonaire | KCD | Law of Bonaire |
| | British Leeward Islands | KCE | Law of the British Leeward Islands |
| | British Virgin Islands | KCF | Law of the British Virgin Islands |
| | British West Indies | KCG | Law of the British West Indies |
| | British Windward Islands | KCH | Law of the British Windward Islands |
| KGP | Dominica | KCJ | Law of Dominica |
| KGR | Netherlands Antilles | KCK | Law of the Netherlands Antilles |
| | Dutch Windward Islands | KCL | Law of the Dutch Windward Islands |
| | French West Indies | KCM | Law of the French West Indies |
| | Grenada | KCN | Law of Grenada |
| | Guadeloupe | KCP | Law of Guadeloupe |
| KGT | Martinique | KCQ | Law of Martinique |
| | Montserrat | KCR | Law of Montserrat |
| KGW | Saint Christopher (Saint Kitts), Nevis, and Anguilla | KCS | Law of Saint Kitts and Nevis |
| | Saint Lucia | KCT | Law of Saint Lucia |
| | Saint Vincent and the Grenadines | KCU | Law of Saint Vincent and the Grenadines |
| | Sint Eustatius | KCV | Law of Sint Eustatius |
| | Sint Maarten | KCW | Law of Sint Maarten |

Class Z

Update: In the official LCCS, class Z is divided into just two subclasses, subclass Z and subclass ZA. Subclass Z covers several different areas: Books (General). Writing. Paleography. Book industries and trade. Libraries. Bibliography. This needs to be split to be used on Wikisource. The first version of this split attempted to preserve the order as seen in the LCCS. The official subclass ZA prevented the second letter of the call number from being used, so this was left blank and the third letter was used. For example, "Writing" was split to subclass Z*A, with the asterisk standing in for the blank second letter. This was unwieldy and awkward, so the second version drops the attempt to preserve the order and moves all of the new subclasses to follow subclass ZA. For example, "Writing" becomes subclass ZC. The following table shows both versions of this scheme:

Subclass Z
| Subject area | 1st version code | 2nd version code |
| Books | Z | ZB |
| Writing | Z*A | ZC |
| Paleography | Z*B | ZD |
| Book Industries and Trade | Z*C | ZE |
| Libraries | Z*D | ZF |
| Bibliography | Z*E | ZG |
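Both versions of the Class Z scheme follow a simple pattern, which the Python sketch below reproduces. The function names `first_version` and `second_version` are hypothetical, and the asterisk is assumed as the placeholder for the blank second letter.

```python
from string import ascii_uppercase

SUBJECTS = [
    "Books",
    "Writing",
    "Paleography",
    "Book Industries and Trade",
    "Libraries",
    "Bibliography",
]


def first_version(subjects, placeholder="*"):
    """First subject keeps bare Z; the rest get Z + placeholder + A, B, ..."""
    codes = {subjects[0]: "Z"}
    for letter, subject in zip(ascii_uppercase, subjects[1:]):
        codes[subject] = f"Z{placeholder}{letter}"
    return codes


def second_version(subjects):
    """ZA is taken by the official system, so the new codes start at ZB."""
    return {
        subject: "Z" + letter
        for letter, subject in zip(ascii_uppercase[1:], subjects)
    }


print(first_version(SUBJECTS)["Writing"])   # Z*A
print(second_version(SUBJECTS)["Writing"])  # ZC
```

The second version needs no placeholder symbol, which is what makes it less awkward in practice.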

Subsequent changes

This section is an appendix to the essay. It may be helpful in understanding the adaptation and the classification system if all alterations to it are clearly logged.

  • 18:16, 10 November 2010: Subclass GE changed from "Environmental Sciences" to "Environment" (to match existing portal)
  • 19:25, 11 November 2010: Subclass HTB changed from "Races" to "Race studies" (to match existing portal)
  • 23:27, 12 November 2010: Subclass BL changed from "Religions" to "Religion" (to match existing portal)
  • 22:51, 29 November 2010: Subclass BPA changed from "Bahaism" to "Bahá'í Faith" (to match existing portal)
  • 13:54, 18 January 2011: Subclass KNP changed from "Law of Taiwan" to "Law of the Republic of China" (following page move)
  • 12:47, 15 March 2011: Subclass B* changed from "Philosophy (general)" to "General Philosophy" (improving the readability and clarity of the portal title)
  • 12:51, 15 March 2011: Subclass D* changed from "History (general)" to "General History" (improving the readability and clarity of the portal title)
  • 12:53, 15 March 2011: Subclass G* changed from "Geography (general)" to "Geography" (no disambiguation necessary in this case)
  • 12:55, 15 March 2011: Subclass H* changed from "Social Sciences (general)" to "General Social Sciences" (improving the readability and clarity of the portal title)
  • 12:57, 15 March 2011: Subclass JA changed from "Political Science (general)" to "General Political Science" (improving the readability and clarity of the portal title)
  • 13:30, 15 March 2011: Subclass PN changed from "Literature (general)" to "General Literature" (improving the readability and clarity of the portal title)
  • 17:28, 22 March 2011: Subclass KA changed from "Law (general)" to "General Law" (improving the readability and clarity of the portal title)
  • 17:34, 22 March 2011: Subclass L* changed from "Education (general)" to "General Education" (improving the readability and clarity of the portal title)
  • 17:37, 22 March 2011: Subclass M* changed from "Music (general)" to "General Music" (improving the readability and clarity of the portal title)
  • 17:39, 22 March 2011: Subclass Q* changed from "Science (general)" to "General Science" (improving the readability and clarity of the portal title)
  • 17:41, 22 March 2011: Subclass R* changed from "Medicine (general)" to "General Medicine" (improving the readability and clarity of the portal title)
  • 17:42, 22 March 2011: Subclass S* changed from "Agriculture (general)" to "General Agriculture" (improving the readability and clarity of the portal title)
  • 17:44, 22 March 2011: Subclass T* changed from "Technology (general)" to "General Technology" (improving the readability and clarity of the portal title)
  • 17:46, 22 March 2011: Subclass U* changed from "Military Science (general)" to "General Military Science" (improving the readability and clarity of the portal title)
  • 17:48, 22 March 2011: Subclass V* changed from "Naval Science (general)" to "General Naval Science" (improving the readability and clarity of the portal title)
  • 18:29, 24 April 2011: Added subclasses to Class I (to add flexibility to the classification)
  • 21:40, 6 February 2013: Removed subclass TEA ("Roads and pavements"), merged content back into subclass TE ("Highway engineering") as per the original LCCS. The difference in content between the two potential portals was slim and confusing; having both separate portals was redundant.
  • 02:01, 9 February 2013: Removed subclass JVB ("International migration"), merged into subclass JVA ("Emigration and immigration"). Same reason as above.
  • 20:32, 16 February 2013: Removed subclass UEA ("Armor"). Error in original interpretation; this is not distinct enough from UE ("Cavalry").
  • 10:20, 18 February 2013: Merged GF (Human ecology) and GFA (Anthropogeography) into GF (Human geography)
  • 17:02, 18 February 2013: Collapsing all of the PTx subclasses into one master "Germanic literature" subclass. Too many minor subclasses overloading Class P; on a later look at the list of languages, they were all in the Germanic family making this an obvious choice for merging them all into one simple subclass.
  • 17:06, 18 February 2013: Collapsing all of the PQx subclasses into one master "Romanic literature" subclass. Following the example of the previous change.
  • 16:51, 20 February 2013: Recoded Class Z. See Class Z above.
  • 23:14, 7 August 2013: Collapsed subclasses VM (Naval architecture), VMA (Shipbuilding) and VMB (Marine engineering) back into one master subclass for VM: Shipbuilding and naval architecture
  • 22:24, 20 August 2013: Removed subclass DKA ("History of the Soviet Union"), merged into subclass DK ("History of Russia"). As above, the two portals largely cover the same subject.
  • 22:48, 20 August 2013: Removed subclass DLA ("History of Scandinavia"), merged into subclass DL ("History of Northern Europe") but kept the name "History of Scandinavia". The other countries of Northern Europe are covered by other subclasses, leaving only Scandinavia anyway; no need for two subclasses.