

Phones and phonemes

A phone is any perceptually identifiable speech sound, without regard to details like whether it is meaningful (some are just filler sounds), whether multiple sounds have the same meaning, and such.

The term gets used when talking about sound production in a 'this was exactly the sound produced' or 'these are the sounds we could make' sense. When talking about speech, we quickly move to describing things in terms of phonemes.

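A phoneme is the smallest unit of sound that can distinguish meaning.

...which varies with context, so phonemes are often defined in a contrastive way, e.g. 'if changing a sound changes the meaning of the thing it is part of, then it must belong to a different phoneme'. For example, 'pat' and 'bat' differ only in their first sound, so /p/ and /b/ are distinct phonemes in English.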
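The contrastive test can be sketched as a search for minimal pairs: words whose transcriptions differ in exactly one segment. This is a minimal sketch; the toy lexicon, its transcriptions, and the function name `minimal_pairs` are made up for illustration and not taken from any dictionary or library.

```python
from itertools import combinations

# Hypothetical toy lexicon: word -> phonemic transcription as a tuple of segments.
# The transcriptions are illustrative only.
LEXICON = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "pit": ("p", "ɪ", "t"),
    "spin": ("s", "p", "ɪ", "n"),
}

def minimal_pairs(lexicon):
    """Return (word_a, word_b, seg_a, seg_b) for words differing in exactly one segment."""
    pairs = []
    for (w1, s1), (w2, s2) in combinations(lexicon.items(), 2):
        if len(s1) != len(s2):
            continue  # only compare same-length transcriptions
        diffs = [(a, b) for a, b in zip(s1, s2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0][0], diffs[0][1]))
    return pairs

for w1, w2, a, b in minimal_pairs(LEXICON):
    print(f"{w1} / {w2}: /{a}/ contrasts with /{b}/")
```

Each pair found is evidence that the two differing segments belong to different phonemes in that language. Finding no pair proves nothing by itself, since the lexicon may just be too small.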

You can also think of phonemes as the mental representation - we are in fact pretty bad at describing the actual sound being made, because we automatically figure out the phoneme it belongs to.

When we talk about what accents do, how diphones blur, and such, we often prefer to talk in terms of phonemes.

Allophones are phones that are acoustic variants of each other without being distinct within a language - or pronunciation system, dialect, accent, or sometimes even context. Roughly: the varied sounds that belong to the same phoneme.

For example, the English p sound is aspirated (has a burst of air) at the start of a stressed syllable, as in 'pin', while it is typically not aspirated elsewhere, e.g. after s, as in 'spin'. Since we do not consider this acoustic difference to mean anything different, the phones [pʰ] and [p] are allophones of the English phoneme /p/.

The variation often comes from convenience of pronunciation or (other) historical changes. See also complementary distribution.

In transcription

A distinction can be made between phonetic (phones) and phonemic (phonemes) transcription.

It's more a purpose / context thing:

While the definitions aren't precise (they vary a little between linguists, and over time), the usual distinction is that:

  • Phonetic transcription is concerned with precise description of the sound, regardless of the context or use.
    In other words, it covers any sounds that might ever carry distinct meaning.
    This is important e.g. in dialect research, where the focus is on subtle differences in pronunciation of the same words.
    It is arguably more objective, though you can't fully separate "all the sounds humans can make". Things like IPA are pretty good, though.
  • Phonemic transcription cares about different sounds only when they have distinct meaning within the context of a specific language (...or dialects of the "replace sounds for the same set of meanings" type. It quickly gets more interesting beyond that, though).
    For example, when a language distinguishes a soft and a hard 'th', you transcribe them with distinct symbols; if not, with just one symbol
    ...regardless of how different people might pronounce it. The point isn't to account for all variants within a language but, usually, to have something more convenient to work with for most everyday purposes, like finding out in a dictionary roughly how a word is pronounced, and basic comparisons between languages
    (much more convenient than having to use a 'full' phone set like IPA).

Transcription at the phoneme level would probably omit e.g. the just-mentioned aspiration information (e.g. /pʌf/), while transcription at the phone level would include it (e.g. [pʰʌf]).
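Going from phone-level to phoneme-level transcription can be sketched as a lookup in a per-language allophone table. The table below is a hypothetical, very incomplete one for English, and `to_phonemic` is a name made up for this example. A real mapping would also have to deal with context, which this sketch ignores.

```python
# Hypothetical allophone table for English: phone -> phoneme.
# Only a few illustrative entries; a real table would be much larger.
ALLOPHONES_EN = {
    "pʰ": "p",
    "p": "p",
    "tʰ": "t",
    "t": "t",
}

def to_phonemic(phones, table=ALLOPHONES_EN):
    """Map each phone to its phoneme; phones missing from the table pass through unchanged."""
    return [table.get(phone, phone) for phone in phones]

narrow = ["pʰ", "ʌ", "f"]      # phone-level segments
broad = to_phonemic(narrow)    # phoneme-level segments
print("[" + "".join(narrow) + "]", "->", "/" + "".join(broad) + "/")
```

The transcription is kept as a list of segments rather than a plain string, so that a multi-character phone like pʰ stays one unit.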

In a lot of linguistics, we work at the phoneme level because it's more convenient, and precise enough for most needs. It may only be for things like dialectography (e.g. detailed dialectometric comparisons) that we care about detailed phone transcription.

Phonetic coding systems are regularly set up for both. For example, in IPA, the difference is largely in how much detail you choose to transcribe, mostly in how much you use IPA's diacritics.
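As a rough illustration of "less diacritics means broader transcription", the sketch below drops combining marks and a few modifier letters from an IPA string. Which marks are safe to drop is an assumption here and depends on the language (nasalization or length can be phonemic, for example), and `broaden` and `DROPPED_MODIFIERS` are made-up names, so treat this as an approximation rather than a real phonemicizer.

```python
import unicodedata

# Assumption: these modifier letters mark detail we want to drop
# (aspiration, labialization, palatalization). Adjust per language.
DROPPED_MODIFIERS = {"\u02b0", "\u02b7", "\u02b2"}  # ʰ ʷ ʲ

def broaden(transcription):
    """Remove combining marks and the selected modifier letters from an IPA string."""
    kept = []
    for ch in transcription:
        if unicodedata.category(ch) == "Mn":  # combining marks, e.g. the dental bridge or nasal tilde
            continue
        if ch in DROPPED_MODIFIERS:
            continue
        kept.append(ch)
    return "".join(kept)

print(broaden("pʰʌf"))  # prints pʌf
```

The string is deliberately not decomposed (no NFD normalization) first, because some precomposed characters such as ç are base IPA letters and decomposing them would strip part of the letter itself.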

Native speakers might be faster at phonemic transcription (meaningful sounds), but worse at hearing or transcribing phonetic detail (precise sounds), because resolving sounds to that language's set of phonemes is more automatic for them. Though you can train to hear this, of course.

