Transcription, transliteration

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

(Outside of linguistics, these are confused with some regularity)


Transcription is conversion from sound to script: writing down the spoken form, usually in a particular language's writing system.

In linguistics, it often refers to writing speech in a phonetic script such as IPA.

The term also covers things like romanizing Chinese. Chinese characters do not spell out their pronunciation, so the text is first read as speech, and that speech is then transcribed. This is a process distinct from transliteration.

Note that transliteration can lead to a lot of missed matches (false negatives) in document search that is not aware of the phonetic nature of the transliterations, because transliteration systems are usually not directly comparable to each other, or to the writing system they came from. Thorough search may need to reverse-guess such mappings, although this is often nontrivial.
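A minimal sketch of why plain string comparison misses such matches, and of one crude workaround: folding several romanizations of the same Russian name into a shared search key. The rule list and the `search_key` name are made up for this illustration; real systems need far larger, language-specific rule sets, and rules this blunt will also merge some names that should stay distinct.

```python
import re

# Hypothetical folding rules for romanized Russian names (illustration only).
FOLD_RULES = [
    ("tsch", "ch"),  # German-style spelling
    ("tch", "ch"),   # French/English-style spelling
    ("č", "ch"),     # scholarly transliteration
    ("w", "v"),
    ("j", "i"),
]


def search_key(name: str) -> str:
    """Fold a romanized name into a rough, comparable key."""
    key = name.lower()
    for old, new in FOLD_RULES:
        key = key.replace(old, new)
    # Collapse common spellings of the final adjective-style ending.
    return re.sub(r"(iy|ii|y)$", "i", key)


variants = ["Tchaikovsky", "Tschaikowski", "Chaikovskii", "Čajkovskij"]
keys = {search_key(v) for v in variants}
print(keys)
```

All four spellings differ as strings, yet they fold to a single key, which is the kind of reverse-guessing a search index would have to do.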


Transliteration is the process of converting text from one writing system to another, working from the written characters rather than from the sound.

Transcription tries to be phonetically accurate, but often strives to be simple to understand and apply too, so phonetic accuracy may suffer.

Transliteration is often done to make text in another writing system readable/pronounceable via one's own, or to assist learning when the script is deemed hard to understand or learn. It is usually somewhat inaccurate.

Spoken Japanese uses a fairly small set of syllables, and writing them in Latin letters is easier for a beginner than the 46 basic characters each of hiragana and katakana, and certainly much simpler than the thousands of kanji, all of which map onto the same sounds. With transliteration, the scripts can be learned separately, and over time.

Japanese kana are syllabaries and almost strictly phonetic, so they are easily Romanized. Alphabets like Cyrillic are quite phonetic, so they are easily Romanized too.
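Because the mapping is close to one letter, one sound, romanizing Cyrillic can be sketched as a plain lookup table. The table below is a simplified mapping chosen for this example, not any particular official standard, and `romanize` is a hypothetical helper name.

```python
# Simplified Russian Cyrillic-to-Latin table (an assumption for illustration).
ROMAN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def romanize(text: str) -> str:
    """Transliterate letter by letter; unknown characters pass through."""
    out = []
    for ch in text:
        latin = ROMAN.get(ch.lower(), ch)
        # Carry an initial capital over to the Latin side.
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


print(romanize("Москва"))
print(romanize("спутник"))
```

Note that the table already loses information (both the hard and soft sign map to nothing here), which is one reason such output is only roughly reversible.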

Converting Latin-alphabet words to systems like Japanese and Russian is much harder, mostly because not all sounds have obvious counterparts. For example, Japanese can only write what fits its syllables, so you have to insert vowels and/or substitute a similar consonant to get to a close syllable.
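The vowel insertion and consonant substitution can be sketched roughly as follows. This is a toy model under several assumptions: the input is a rough phonetic spelling rather than English orthography, `l` is replaced by `r`, `o` is inserted after `t`/`d` and `u` after other stranded consonants, and `n` may stand alone. Real loanword adaptation has more rules than this.

```python
VOWELS = frozenset("aiueo")


def adapt_to_syllables(sounds: str) -> str:
    """Pad a rough phonetic spelling so every consonant gets a vowel."""
    out = []
    for i, ch in enumerate(sounds):
        if ch == "l":
            ch = "r"  # nearest available consonant
        out.append(ch)
        if ch in VOWELS:
            continue
        nxt = sounds[i + 1] if i + 1 < len(sounds) else ""
        if nxt in VOWELS:
            continue  # consonant + vowel is already a valid syllable
        if ch == "n":
            continue  # "n" can stand on its own
        out.append("o" if ch in "td" else "u")
    return "".join(out)


for word in ["straik", "milk", "desk"]:
    print(word, "->", adapt_to_syllables(word))
```

With these rules "straik" comes out as "sutoraiku" and "milk" as "miruku", which match the usual Japanese loan forms of "strike" and "milk".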

Another problem is that basing a phonetic conversion on written characters may mishandle digraphs and similar cases, where several letters together stand for one sound.
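A small sketch of the digraph problem, going from Latin to Cyrillic. Converting one letter at a time splits "sh" into two sounds, while trying the longest known letter group first keeps it together. The table is a deliberately tiny, made-up subset, and both function names are hypothetical.

```python
# Tiny Latin-to-Cyrillic table (an illustrative subset, not a standard).
TO_CYR = {
    "sh": "ш", "zh": "ж", "ch": "ч", "kh": "х",
    "a": "а", "k": "к", "m": "м", "s": "с", "u": "у", "z": "з", "h": "х",
}


def per_letter(text: str) -> str:
    """Naive conversion: every letter is looked up on its own."""
    return "".join(TO_CYR.get(ch, ch) for ch in text)


def longest_match(text: str) -> str:
    """Try the longest letter group at each position before shorter ones."""
    longest = max(len(key) for key in TO_CYR)
    out, i = [], 0
    while i < len(text):
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in TO_CYR:
                out.append(TO_CYR[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unknown character: keep as-is
            i += 1
    return "".join(out)


print(per_letter("shum"), longest_match("shum"))
print(per_letter("zhuk"), longest_match("zhuk"))
```

Longest-match is only a heuristic: it cannot tell a real digraph from two letters that merely happen to be adjacent, so some cases still need a dictionary or human judgment.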

Transliteration as an input method

One use is input methods on computers. Languages like Japanese, Chinese, Russian and many others can be typed phonetically with Latin characters: the typed letters are treated as a pronunciation and converted to the corresponding characters as you type. This allows western keyboards to be used for these languages.
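As a rough sketch of the as-you-type conversion, the function below feeds keystrokes into a pending buffer and commits a kana as soon as the buffer spells a known syllable. The table is a small hand-picked subset and `type_romaji` is a hypothetical name; real input methods handle many more cases (long vowels, small kana, a trailing "n", conversion to kanji).

```python
# Small romaji-to-hiragana subset (illustration only).
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ko": "こ", "ni": "に", "no": "の",
    "ha": "は", "ho": "ほ", "ga": "が", "go": "ご",
}


def type_romaji(keys: str) -> list[str]:
    """Return what the text field shows after each keystroke."""
    committed, pending = "", ""
    states = []
    for key in keys:
        pending += key
        # "n" followed by a consonant other than "y" is the standalone kana.
        if len(pending) >= 2 and pending[0] == "n" and pending[1] not in "aiueoy":
            committed += "ん"
            pending = pending[1:]
        if pending in KANA:
            committed += KANA[pending]
            pending = ""
        states.append(committed + pending)
    return states


for shown in type_romaji("nihongo"):
    print(shown)
```

The intermediate states mix committed kana with still-pending Latin letters, which is what you typically see in the composition area of such an input method.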

It requires you to type only sequences that will convert, and may make you choose between alternatives. Neither is a problem in practice.


Transliteration and transcription are easily confused.

Romanization can indicate transcription as well as transliteration, because the word only indicates the target alphabet; which of the two applies depends on the source. When you want to write Russian and Chinese in Latin characters, you would transcribe Chinese and transliterate Russian.

You need to convert Chinese via its pronunciation; the logographs are themselves not phonetic.
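To illustrate why this goes via pronunciation, here is a minimal sketch that romanizes Chinese by looking up whole words in a pronunciation lexicon. The three-entry lexicon and the `transcribe` name are made up for the example; the point is that the same character (行) is read differently in different words, so no per-character table would do.

```python
# Hypothetical word-level pronunciation lexicon (pinyin with tone marks).
READINGS = {
    "中国": "Zhōngguó",  # China
    "银行": "yínháng",   # bank
    "行人": "xíngrén",   # pedestrian
}


def transcribe(text: str, lexicon: dict[str, str] = READINGS) -> str:
    """Romanize by longest word match; fail loudly on unknown text."""
    longest = max(len(word) for word in lexicon)
    out, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in lexicon:
                out.append(lexicon[word])
                i += size
                break
        else:
            # The character alone does not tell us how it is pronounced.
            raise KeyError(f"no reading known for {text[i]!r}")
    return " ".join(out)


print(transcribe("中国银行"))
print(transcribe("行人"))
```

A lone 行 raises an error here on purpose: without a word around it, the lexicon cannot say whether it should be read xíng or háng.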

Starting with Russian, which is written in a largely phonetic alphabet, you can transliterate it into English and into various other languages (slightly differently for each, since you keep the target's pronunciation and writing system in mind), while writing it as IPA would be transcription.