Language Codes

Code

HTML code

xml:lang="fr"
xml:lang="en-ca"
etc.

<i xml:lang="fr">Bonjour</i>
<span xml:lang="de">Guten Tag</span>

Script shortcut

The script replaces shortcuts written as classes, formatted like this, with the properly formatted code:
lang:fr
lang:en-ca

<i class="lang:fr">Bonjour</i>
<span class="lang:de">Guten Tag</span>
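
The script itself is not shown here, so the following is only a minimal sketch of how such a replacement could work, assuming the shortcut is always a class token of the form lang:xx inside a double-quoted class attribute. The function name expand_lang_shortcuts and both regular expressions are hypothetical, not taken from the actual script.

```python
import re

# Assumed shortcut form: a class token such as "lang:fr" or "lang:en-ca".
LANG_TOKEN = re.compile(r"^lang:([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*)$")
# A class attribute with a double-quoted value, including its leading space.
CLASS_ATTR = re.compile(r'\sclass="([^"]*)"')


def expand_lang_shortcuts(html):
    """Replace lang:xx class tokens with xml:lang="xx" attributes."""

    def rewrite(match):
        lang = None
        kept = []
        for token in match.group(1).split():
            found = LANG_TOKEN.match(token)
            if found and lang is None:
                lang = found.group(1)  # first shortcut wins
            else:
                kept.append(token)  # ordinary classes are preserved
        if lang is None:
            return match.group(0)  # no shortcut: leave the attribute as is
        result = ""
        if kept:
            joined = " ".join(kept)
            result += f' class="{joined}"'
        result += f' xml:lang="{lang}"'
        return result

    return CLASS_ATTR.sub(rewrite, html)


print(expand_lang_shortcuts('<i class="lang:fr">Bonjour</i>'))
```

A regular expression is enough for an illustration like this; a real implementation would more likely walk the parsed document so that unusual quoting or attribute order cannot trip it up.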

Codes and Subtags

The individual codes that make up a language tag are referred to as subtags.

Tag explanations

A great article: https://www.w3.org/International/articles/language-tags/

Language tags can have several components. Major languages are represented by a two- or three-letter code (the primary language subtag), e.g. English: en; French: fr; Cree: cr; Albanian: sq; Latin: la (yes, some Latin should be tagged).

A region subtag can also be attached, e.g. Canadian French: fr-CA vs Luxembourg French: fr-LU. Some region tags: https://www.andiamo.co.uk/resources/iso-language-codes/

Alternatively, a script subtag can be attached, e.g. Serbian in Cyrillic script: sr-Cyrl vs Serbian in Latin script: sr-Latn. (Note: you would not normally use the Latn subtag in a book that is mainly in Latin script.)

Examples of language tags including extlang subtags are:
zh-yue (Cantonese Chinese)
ar-afb (Gulf Arabic)
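
The components described above can be told apart by their shape: an extended language subtag is three letters, a script subtag is four letters, and a region subtag is two letters or three digits. The sketch below uses only those shapes to split a simple tag. The function name split_language_tag is hypothetical, and the code does not check that a subtag is actually registered, nor does it handle variants, extensions or private-use tags.

```python
def split_language_tag(tag):
    """Split a tag of the form language[-extlang][-script][-region]."""
    subtags = tag.split("-")
    # By convention the language subtag is written in lowercase.
    result = {"language": subtags[0].lower()}
    for subtag in subtags[1:]:
        is_alpha = subtag.isalpha()
        if (len(subtag) == 3 and is_alpha
                and "script" not in result and "region" not in result):
            result["extlang"] = subtag.lower()
        elif len(subtag) == 4 and is_alpha and "region" not in result:
            # Script subtags are conventionally title case, e.g. Cyrl.
            result["script"] = subtag.title()
        elif (len(subtag) == 2 and is_alpha) or (len(subtag) == 3 and subtag.isdigit()):
            # Region subtags are conventionally uppercase, e.g. CA.
            result["region"] = subtag.upper()
        else:
            raise ValueError(f"unsupported subtag: {subtag!r}")
    return result


for example in ("fr-CA", "sr-Cyrl", "zh-yue"):
    print(example, split_language_tag(example))
```

Tags are case-insensitive, so en-ca and en-CA mean the same thing; the sketch simply normalises each subtag to its conventional casing.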

Made up languages

Made-up languages that have their own code can be tagged:
Klingon: tlh
Esperanto: eo

If it is a one-off made-up language, the name is preceded by x- to form a private-use tag: x-arcturan
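
As a rough illustration of building such a private-use tag, the sketch below turns a language name into an x- tag. It relies on the rule that each subtag after the x may contain only ASCII letters and digits and be at most eight characters long; the function name private_use_tag and the choice to truncate longer names are assumptions made for this example.

```python
import re


def private_use_tag(name):
    """Build an x- private-use tag from a made-up language name."""
    # Keep ASCII letters and digits only, in lowercase.
    cleaned = re.sub(r"[^a-z0-9]", "", name.lower())
    if not cleaned:
        raise ValueError("name has no ASCII letters or digits")
    # Each private-use subtag may be at most 8 characters long.
    return "x-" + cleaned[:8]


print(private_use_tag("Arcturan"))
```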

Non-linguistic

Use the subtag zxx when the text is known not to be in any language.
xml:lang="zxx"

<p>Here is a list of part numbers: <span xml:lang="zxx">9RUI34 8XOS12 3TYY85</span>.</p>

Undetermined

xml:lang="und"

However, you should only tag text as undetermined if you can't simply leave it untagged. In practice, this means you should only use this markup when the undetermined text is embedded in content that has already been labeled for language in some way, e.g.

<i xml:lang="fr">Mon frère est tout <span xml:lang="und">kazplumed</span>.</i>

Common languages

  • French: fr
  • German: de
  • Italian: it
  • Spanish: es
  • Japanese: ja
  • Yiddish: yi
  • Russian: ru

Indigenous languages

A listing of some North American indigenous languages.

Be sure to use the tool below to check whether there are any sub-languages, e.g. sal (Salishan) has both slh (Southern Puget Sound Salish) and str (Straits Salish).

West Coast Languages

https://en.wikipedia.org/wiki/Salishan_languages

  • Chinook jargon (Chinuk Wawa): chn
  • Gwichya Gwich’in: gwi
  • Haida: hai
  • Hul’q’umi’num’: hur
  • Kutenai (Kootenai): kut
  • Kwakʼwala: kwk
  • Salishan languages (Salish): sal
    • Southern Puget Sound Salish: slh
    • Straits Salish: str
  • Squamish: squ
  • Tlingit: tli
  • Tsimshian: tsi
    • includes Sm’algya̱x
  • Wakashan languages: wak

Other North American languages

  • Athapascan languages: ath
  • Blackfoot/Siksika: bla
  • Chipewyan: chp
  • Cree: cr
    • Plains Cree: crk
  • Dakota (Sioux): dak
  • Delaware (Munsee): del
  • Dogrib (Tli Cho): dgr
  • Inuktitut: iu
  • Inupiaq: ik
  • Lakota: lkt
  • Michif: crg
  • Mi'kmaq (Micmac): mic
  • Mohawk: moh
  • Nēhiyawēwin: crk
  • Ojibwa (Anishinaabemowin): oj
  • Paiute: pao
  • Slave (Athapascan): den
  • Shoshone (Eastern): shh

— source: https://www.nationsonline.org/oneworld/language_code.htm

Spellings

  • You can represent Kwaka̱ka̱wakw as Kwaka&#817;ka&#817;wakw (&#817; is the combining macron below, U+0331), or possibly as Kwakwa̱ka̱ʼwakw.
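
If you need to produce such numeric character references for other words, the following sketch shows one way to do it using Python's built-in xmlcharrefreplace error handler, which replaces every non-ASCII character with its decimal reference. The function name to_numeric_references is only illustrative.

```python
def to_numeric_references(text):
    """Replace each non-ASCII character with a decimal reference like &#817;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# \u0331 is the combining macron below that follows each underlined letter.
print(to_numeric_references("Kwaka\u0331ka\u0331wakw"))
```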