Transcription, transliteration: Difference between revisions

From Helpful
 
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{#addbodyclass:tag_ling}}
{{stub}}


Line 18: Line 19:


Note that transliteration can cause many false omissions in document search that is not aware of the phonetic nature of transliterations, because transliteration systems are usually not directly comparable to each other, or to the writing system they came from. Thorough search may need to reverse-guess such mappings, which is often nontrivial.
-->
===Forced alignment===
<!--
Forced alignment, or force-aligned transcription,
means automatically aligning audio files with their transcripts.
In other words, "assume this is what was spoken, and fit the audio to it".

It is used in, for example, [[Prosody#Intro|Prosody]] research.

Forced alignment can also speed up transcription, by combining:
* automatic speech recognition
* human verification of that text
* forced alignment to match it to the audio again
-->
-->


Line 25: Line 51:




Transcription tries to be phonetically accurate, but often strives to be simple to understand and apply too, so phonetic accuracy may suffer.
Transcription often strives to be simple to understand and apply in the target language, so phonetic accuracy, while a concern, may be secondary.
 


Transliteration is often done to make writing in another writing system more readable and/or more pronounceable via one's own, or to assist learning when the alphabet is deemed hard to understand or learn. It is usually somewhat inaccurate.
It often seems to be done to make learning easier.


Japanese has 23 phonetic syllables, which are easier to learn than the 46 symbols of hiragana and katakana, and certainly much simpler than the thousands of ideograms, that all map to the same sounds somehow.  
For example, Japanese has a fairly small set of syllable sounds, which are easier to learn than the 46 basic symbols each of hiragana and katakana, and certainly much simpler than the thousands of ideograms, all of which map onto those same sounds somehow.
With transliteration, this can be learned separately, and over time.
With transliteration, the sounds can be learned somewhat separately from the 46 symbols.


As a [[syllabery]], Japanese is strictly phonetic and the syllables are easily Romanized.
Alphabets like Cyrillic are quite phonetic so easily Romanized too.


Romanization is transliteration into the Latin alphabet.
Japanese kana are strictly phonetic, so they lend themselves well to it,
and alphabets like Cyrillic are also quite phonetic, so they Romanize well enough.


Converting Latin-alphabet words to systems like Japanese and Russian is much harder, mostly because not all sounds have obvious counterparts. For example, Japanese has the problem that the only thing that can be written must adhere to its syllables, so you have to insert sounds and/or use a similar consonant to get to a close syllable.
Converting Latin-alphabet words to systems like Japanese and Russian is harder, mostly because not all sounds have obvious counterparts.
For example, anything written in Japanese must fit its syllables, so you have to insert sounds and/or substitute a similar consonant to reach a close syllable.


Another problem is that basing phonetic conversion on written characters may mess up [[digraph]] cases and such.
Line 43: Line 70:


===Transliteration as an input method===
One use is input methods on computers.
Languages like Japanese, Chinese, Russian and many others can be typed phonetically with Latin characters, where the typed letters serve as the pronunciation from which the respective characters are produced as you type. This allows western keyboards to be used for these languages.

Latest revision as of 16:14, 29 April 2024

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

(Outside of linguistics, these are confused with some regularity)


Transcription

Transcribing usually means taking sound and writing its contents down.

Usually into that same language's writing system.

In linguistics, particularly in the context of phonetics or dialectology, it may instead mean transcribing into a phonetic script, such as IPA.



Forced alignment

Transliteration

Transliteration is the process of converting text from one writing system to another.


Transcription often strives to be simple to understand and apply in the target language, so phonetic accuracy, while a concern, may be secondary.

It often seems to be done to make learning easier.

For example, Japanese has a fairly small set of syllable sounds, which are easier to learn than the 46 basic symbols each of hiragana and katakana, and certainly much simpler than the thousands of ideograms, all of which map onto those same sounds somehow. With transliteration, the sounds can be learned somewhat separately from the symbols.


Romanization is transliteration into the Latin alphabet. Japanese kana are strictly phonetic, so they lend themselves well to it, and alphabets like Cyrillic are also quite phonetic, so they Romanize well enough.
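As a rough illustration of why a mostly phonetic alphabet Romanizes easily, here is a minimal sketch of a per-letter Cyrillic to Latin mapping. The table and the name romanize are made up for this note; the table is a simplified mapping that does not follow any particular standard, and it only covers lowercase Russian letters.

```python
# Simplified, illustrative mapping; NOT any official romanization standard.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(CYRILLIC_TO_LATIN)


def romanize(text: str) -> str:
    """Replace each lowercase Cyrillic letter by its Latin counterpart.

    Characters that are not in the table (including uppercase) pass through.
    """
    return text.translate(_TABLE)


if __name__ == "__main__":
    for word in ("москва", "борщ", "чай"):
        print(word, "->", romanize(word))
```

Note that several letters collapse onto the same Latin output here (for example й and ы both become y), so this kind of mapping cannot simply be run in reverse.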

Converting Latin-alphabet words to systems like Japanese and Russian is harder, mostly because not all sounds have obvious counterparts. For example, anything written in Japanese must fit its syllables, so you have to insert sounds and/or substitute a similar consonant to reach a close syllable.

Another problem is that basing phonetic conversion on written characters may mess up digraph cases and such.
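A toy sketch of that fitting process, working letter by letter on a Latin-alphabet word. The function name, the substitution table and the choice of inserted vowels are assumptions made for illustration, and real loanword adaptation follows pronunciation rather than spelling.

```python
VOWELS = set("aeiou")

# Hypothetical "similar consonant" substitutions for sounds without a counterpart.
SUBSTITUTE = {"l": "r", "v": "b"}


def fit_to_syllables(word: str) -> str:
    """Pad a word with vowels so every consonant starts a consonant+vowel syllable.

    Works on letters, not sounds, so it is only a rough approximation.
    """
    letters = [SUBSTITUTE.get(c, c) for c in word.lower()]
    out = []
    for i, c in enumerate(letters):
        out.append(c)
        if c in VOWELS:
            continue
        nxt = letters[i + 1] if i + 1 < len(letters) else ""
        if nxt in VOWELS:
            continue  # already a consonant+vowel pair
        if c == "n":
            continue  # a syllable-final n can stand on its own
        # Insert a filler vowel: "o" after t/d, "u" otherwise.
        out.append("o" if c in "td" else "u")
    return "".join(out)


if __name__ == "__main__":
    for word in ("milk", "desk", "test", "ship"):
        print(word, "->", fit_to_syllables(word))
```

With these rules, milk becomes miruku and test becomes tesuto, which is close to the usual Japanese renderings. The word ship comes out as suhipu, because the code treats the digraph sh as two separate letters, which is exactly the kind of mistake that character-based conversion makes.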


Transliteration as an input method

One use is input methods on computers. Languages like Japanese, Chinese, Russian and many others can be typed phonetically with Latin characters, where the typed letters serve as the pronunciation from which the respective characters are produced as you type. This allows western keyboards to be used for these languages.


It requires you to use only letter sequences that will convert, and may force you to choose between alternative characters. Neither is much of a problem in practice.
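A minimal sketch of how such an input method might work, assuming a greedy longest-match lookup. The tables are a tiny hand-picked subset, and the names to_hiragana and candidates are hypothetical, not taken from any real input method.

```python
# Tiny subset of a romaji-to-hiragana table, for illustration only.
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "sa": "さ", "shi": "し", "su": "す", "se": "せ", "so": "そ",
    "na": "な", "ni": "に", "n": "ん",
    "ha": "は", "ho": "ほ",
}

# Words that sound the same but are written differently (bridge, chopsticks, edge).
CANDIDATES = {"はし": ["橋", "箸", "端"]}


def to_hiragana(typed: str) -> str:
    """Convert typed Latin letters to hiragana, trying the longest chunk first."""
    out = []
    i = 0
    while i < len(typed):
        for size in (3, 2, 1):
            chunk = typed[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            # Only sequences present in the table can be converted.
            raise ValueError(f"cannot convert {typed[i:]!r}")
    return "".join(out)


def candidates(typed: str) -> list[str]:
    """Return the kana reading followed by any alternatives the user can pick."""
    kana = to_hiragana(typed)
    return [kana] + CANDIDATES.get(kana, [])


if __name__ == "__main__":
    print(to_hiragana("sushi"))
    print(candidates("hashi"))
```

The ValueError branch corresponds to typing something that does not convert, and the list returned by candidates corresponds to having to choose between alternatives.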

Overlap

Transliteration and transcription are easily confused.

Romanization can refer to transcription as well as transliteration, because the term only names the target alphabet; which of the two it is depends on the source. When you want to write Russian and Chinese in Latin characters, you would transcribe Chinese and transliterate Russian.

You need to convert Chinese via its pronunciation; the logographs are themselves not phonetic.

Russian, which is written in a phonetic alphabet, can be transliterated into English and into various other languages (slightly differently for each, since you keep pronunciation and writing system in mind), while writing it as IPA would be transcription.

-->