Speech processing
Latest revision as of 16:13, 29 April 2024
Plots and visualisations
Oscillogram
Waveform view.
Spectrogram
Intonogram
An intonograph sometimes seems to refer to a device used for speech analysis (a little more specific than e.g. abusing a visicorder); the plots it made are called intonograms.
...but most things called intonograms seem to be prints of computer analyses.
Most of them include an estimate of the fundamental frequency of the speech.
Other things shown on the same plot tend to include the waveform, and may include intensity and e.g. time markers for manual annotation.
The term now seems to cover any sort of plot that shows a combination of such information,
so e.g. Praat's Sound view (and perhaps Manipulation view) would probably qualify.
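To give an idea of what the fundamental-frequency curve on such a plot is made of, here is a minimal sketch of a pitch estimate for one short frame, using the peak of the autocorrelation. The function name, the 75..500 Hz search range and the test tone are assumptions made for this example; real tools add windowing, voicing decisions and smoothing on top, so this is not what any particular program does.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate the fundamental frequency (Hz) of one frame of samples.

    Picks the lag with the highest autocorrelation within the lag range
    that corresponds to f0_min..f0_max (an assumed search range).
    Returns None if the frame is too short or no positive peak is found.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    if len(frame) <= max_lag:
        return None

    # Remove any DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    return sample_rate / best_lag

# Usage: a synthetic 200 Hz tone, 50 ms at 8 kHz (test values, not real speech).
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
print(estimate_f0(tone, rate))
```

Running this over successive overlapping frames and plotting the results against time gives the kind of pitch track an intonogram shows.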
Simple modelling of speech
source-filter model
The source-filter model is the idea that we can get a good approximation of speech with
- a source: either a tone at the fundamental pitch (for vowels and other voiced sounds) or noise (for unvoiced consonants)
- a few filters to imitate the formants
https://en.wikipedia.org/wiki/Source%E2%80%93filter_model
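As a rough illustration of that idea, the sketch below builds a source (an impulse train for voiced sound, white noise for unvoiced sound) and passes it through a cascade of simple two-pole resonators standing in for the formants. The sample rate, the function names and the formant frequencies and bandwidths are assumptions chosen for the example, not measured values.

```python
import math
import random

SAMPLE_RATE = 16000  # Hz; an assumed rate for this sketch

def impulse_train(f0, n_samples, sample_rate=SAMPLE_RATE):
    """Voiced source: one impulse per (rounded) pitch period."""
    period = round(sample_rate / f0)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def white_noise(n_samples, seed=0):
    """Unvoiced source: uniform white noise in -1..1."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1
    theta = 2 * math.pi * centre_hz / sample_rate         # pole angle
    a1 = 2 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(source, formants):
    """Filter the source through each (centre_hz, bandwidth_hz) in turn,
    then normalise the result to a peak of 1."""
    out = source
    for centre_hz, bandwidth_hz in formants:
        out = resonator(out, centre_hz, bandwidth_hz)
    peak = max(abs(s) for s in out) or 1.0
    return [s / peak for s in out]

# Illustrative formant values for a roughly /a/-like vowel (assumed).
VOWEL_A = [(700, 80), (1200, 90), (2600, 120)]

# Half a second of each: a voiced vowel at 120 Hz, and a noisy hiss.
vowel = synthesize(impulse_train(120, SAMPLE_RATE // 2), VOWEL_A)
hiss = synthesize(white_noise(SAMPLE_RATE // 2), [(4500, 1000)])
```

Changing only the formant list changes the vowel quality while the pitch stays the same, and changing only `f0` does the reverse, which is the separation the model is about.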