Language codes, country codes



  • ISO 639 is the set of international standards that lists short codes for language names:
    • ISO 639-1 (~180 two-letter codes for most common languages)
    • ISO 639-2 (~464 three-letter codes, including language groups)(verify)
    • ISO 639-3 (~7600 three-letter codes, no groups)

IETF language tags (BCP 47) start with a primary language code of two or three letters, taken from one of:
  • ISO 639-1 (two-letter)
  • ISO 639-2 (three-letter)
  • ISO 639-3 (three-letter)
  • ISO 639-5 (three-letter(verify))
with optional subtags for variants for specific countries, regions, or writing systems,
but many contexts say to stay general unless you intend to be narrow.
  • ja is Japanese.
  • ja-JP refers to Japanese only as spoken in Japan, which doesn't really contrast with much, whereas a distinction between fr-BE, fr-CA, fr-CH, fr-FR, fr-LU, fr-MC might be useful, but still mostly to linguists
  • localization may well use these, because you tend to care both about language and about the formatting of numbers and times, which tend to be country-specific habits
The tag format takes from ISO 639, ISO 15924 (for scripts), ISO 3166 (for countries), and UN M.49, but with its own rules around changes (verify)
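
As a rough illustration of how a primary language code combines with optional script and region subtags, here is a minimal sketch in Python. The function name `split_language_tag` is made up for this example, and the pattern only covers the simple language[-script][-region] shape; real-world tags (as in BCP 47) also allow variants, extensions, and private-use subtags, which this deliberately ignores.

```python
import re

# Simplified shape only: language, optional 4-letter script, optional region
# (2 letters, or 3 digits for UN M.49 areas). This is an assumption made for
# the sketch, not the full tag grammar.
_TAG_RE = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def split_language_tag(tag):
    """Split a simple language tag into language, script, and region subtags."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a simple language[-script][-region] tag: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    return {
        # Conventional casing: language lower, script Title, region UPPER.
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
    }
```

For instance, `split_language_tag("ja")` yields only a language, while `split_language_tag("fr-be")` also yields the region `BE`. Staying general, as suggested above, simply means leaving the optional parts out.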

  • The Ethnologue codes correlate strongly with ISO 639-3. (Note that Ethnologue has a cross-reference of what languages are spoken in what countries, see [1])
  • Language codes used in MARC (a library metadata standard) strongly correlate with ISO 639-2. (see [2])


For example, in ISO 3166-1, Belgium is BE (alpha-2), BEL (alpha-3), and 056 (numeric).
  • UN M.49 is a set of area codes used by the UN to define geographical, political, and economic regions (more stable than countries)
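
The three forms of a country code in the Belgium example (two-letter, three-letter, numeric) can be kept together as one record per country. The sketch below is only an illustration: the `Country` class and `find_country` function are hypothetical names, and the table is a tiny hand-picked sample rather than the full ISO 3166-1 list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Country:
    name: str
    alpha2: str
    alpha3: str
    numeric: str  # kept as a string so the leading zero in "056" survives


# Hand-picked sample entries, not the complete standard.
COUNTRIES = [
    Country("Belgium", "BE", "BEL", "056"),
    Country("Japan", "JP", "JPN", "392"),
    Country("Finland", "FI", "FIN", "246"),
]

# One index keyed by every form of the code.
_INDEX = {}
for country in COUNTRIES:
    for code in (country.alpha2, country.alpha3, country.numeric):
        _INDEX[code] = country


def find_country(code):
    """Look up a country by any of its three codes (case-insensitive)."""
    return _INDEX.get(code.strip().upper())
```

Storing the numeric code as text is a deliberate choice here: as an integer, 056 would turn into 56 and no longer match the three-digit form.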

Writing systems


ISO 4217 is a list of three-letter currency codes, typically consisting of the ISO 3166-1 alpha-2 country code followed by the initial of the currency.

For example, USD (US + dollar) and JPY (JP + yen).

Currencies used in multiple countries generally start with an X (EUR is an exception).
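
The naming convention described above can be sketched as a small helper that splits a currency code into its likely country part and currency initial. This is an illustration only: `describe_currency_code` is a hypothetical name, the split is just the typical pattern (so the result is a guess, not a lookup in the standard), and the only non-X exception it knows about is EUR, as mentioned in the text.

```python
# Exception named in the text: a multi-country currency without the X prefix.
NON_X_EXCEPTIONS = {"EUR"}


def describe_currency_code(code):
    """Guess the structure of a three-letter currency code from its shape."""
    code = code.strip().upper()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected three letters, got {code!r}")
    if code.startswith("X") or code in NON_X_EXCEPTIONS:
        # Not tied to a single country, so there is nothing to split.
        return {"code": code, "kind": "supranational/special",
                "country": None, "initial": None}
    # Typical pattern: two-letter country code plus the currency's initial.
    return {"code": code, "kind": "national",
            "country": code[:2], "initial": code[2]}
```

With this, `describe_currency_code("usd")` reports country `US` and initial `D`, while a code starting with X is reported without a country part.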

A few currencies are not listed, usually because they are not independent currencies but local currencies pegged to another currency. In some cases, codes are used for them that are not listed in the ISO document.

ISO 4217 codes are the norm in banking, and common in some other contexts such as airline and other international tickets, exchange rates listed in newspapers, etc. (verify)


Note that there are some cases where the ISO 639-1 language code and the ISO 3166-1 alpha-2 country code are the same.

Sometimes this indicates the country that the language is mostly spoken in, sometimes they are completely unrelated.

This has led to some confusion, and to people using codes in the wrong contexts.
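
To make the overlap concrete, the sketch below compares a few language codes with a few country codes and reports the ones that collide. The tables are small hand-picked samples (not the full ISO 639-1 or ISO 3166-1 lists), and `shared_codes` is a hypothetical helper name.

```python
# Sample ISO 639-1 language codes.
LANGUAGE_CODES = {
    "fi": "Finnish",
    "fr": "French",
    "ja": "Japanese",
    "sv": "Swedish",
    "ca": "Catalan",
}

# Sample ISO 3166-1 alpha-2 country codes.
COUNTRY_CODES = {
    "FI": "Finland",
    "FR": "France",
    "JP": "Japan",
    "SE": "Sweden",
    "SV": "El Salvador",
    "CA": "Canada",
}


def shared_codes(languages, countries):
    """Return codes present in both tables, compared case-insensitively."""
    country_by_lower = {code.lower(): name for code, name in countries.items()}
    return {
        code.lower(): (name, country_by_lower[code.lower()])
        for code, name in languages.items()
        if code.lower() in country_by_lower
    }
```

In this sample, fi and fr pair a language with the country it is mostly spoken in, whereas sv (Swedish / El Salvador) and ca (Catalan / Canada) are unrelated, and ja has no matching country code at all because Japan is JP.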

Some of the confusion probably also comes from codes being based on either exonyms or endonyms(verify).

Endonyms and exonyms

For context: names for geographical places, groups of people, individual persons, languages, and dialects often differ between languages. Sometimes the difference is a minor sound change or transliteration; sometimes the names are unrelated, or even originate in mistakes. They often have a complex history (as names tend to): references shifting over the centuries, 'accurate enough from a distance' descriptions sticking around, and other interesting etymology.

In such a context, an endonym is the native/local name, and an exonym is any of the often-many foreign names.

For example, what English calls Japan (and most other languages have a variant on that), Japan itself officially calls Nippon / Nihon (日本).

The origins of 'Japan' aren't entirely certain, but very likely come from outside Japan, via another language or two.

Suomi can (now) refer both to Finland and the modern Finnish language.

'suomi' itself seems to originally be a loanword, with not entirely certain etymology.
The Finns and the wider Finnic group refer to various ethnic groups (among other related ones) who happened to live in what we now call Finland ...and in what for ease we'll call Sweden, Norway, and Russia.
So 'Finland' seems to have just come from 'the place where many of the Finns live'

...just like Germany is where many of the Germanic people lived (even though the name for the area was different), and even then the English name for the area wasn't that(verify).

See also