Speech analysis and processing
Source-filter model
✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
The idea is that you can approximate speech with a relatively simple model:
- two sound sources:
  - a pitched waveform (often a sawtooth), to imitate voicing by the vocal folds
  - noise, to imitate fricatives and other unvoiced consonants
- (bandpass) filters, to imitate the vocal tract resonances that we call formants
Things like voders, vocoders, LPC, and PSOLA follow this idea,
even if the implementations vary.
https://en.wikipedia.org/wiki/Source%E2%80%93filter_model
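As a rough illustration of the model described above, here is a minimal sketch in plain Python: a sawtooth or noise source is fed through a few parallel two-pole resonators that stand in for formants. The sample rate, the function names, and the formant frequencies and bandwidths are all assumptions chosen for illustration, not measured values, and the output is only vaguely vowel-like.

```python
import math
import random

SAMPLE_RATE = 16000  # Hz, assumed for this sketch


def sawtooth(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Pitched source: naive (non-band-limited) sawtooth in [-1, 1)."""
    return [2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0
            for i in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Noise source, for unvoiced sounds."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator: a simple bandpass-like filter for one formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle
    a1 = -2.0 * r * math.cos(theta)
    a2 = r * r
    gain = 1.0 - r  # rough level normalization, not exact
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = gain * x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(source, formants):
    """Sum of parallel formant resonators applied to one source."""
    bands = [resonator(source, f, bw) for f, bw in formants]
    return [sum(samples) for samples in zip(*bands)]


# Illustrative (center Hz, bandwidth Hz) pairs for an "ah"-like vowel.
AH_FORMANTS = [(700.0, 110.0), (1200.0, 120.0), (2600.0, 160.0)]

n = SAMPLE_RATE // 2  # half a second
voiced = synthesize(sawtooth(110.0, n), AH_FORMANTS)
unvoiced = synthesize(white_noise(n), [(4000.0, 1500.0), (6000.0, 2000.0)])
```

Changing the sawtooth frequency changes the pitch while the formant filters stay put, which is the separation of source and filter that the model is named after.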
Vocoders
Vocoders (voice coders) analyse speech into certain parameters,
then synthesize speech based on those parameters.
They were originally made to parametrize speech for efficient transmission of voice calls.
This was an important development in telecom, and was also used in phonetics and in music,
where a vocoder could serve as an instrument of sorts.
A voder is the production part only,
and is sometimes even a playable machine[1].
These days vocoders are mostly used to distort vocals and instruments in music.
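The analyse-then-synthesize idea can be sketched as a classic channel vocoder, which is one common design rather than the only one. Analysis reduces a modulator signal to one slowly varying envelope per frequency band, and synthesis uses those envelopes to shape the same bands of a carrier. The band layout, the 50 Hz envelope cutoff, the stand-in "speech" tone, and all function names below are assumptions for illustration.

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed for this sketch


def bandpass(signal, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator used as a crude bandpass filter."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    theta = 2.0 * math.pi * center_hz / sample_rate
    a1, a2, gain = -2.0 * r * math.cos(theta), r * r, 1.0 - r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = gain * x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def envelope(signal, cutoff_hz=50.0, sample_rate=SAMPLE_RATE):
    """Envelope follower: rectify, then smooth with a one-pole lowpass."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, level = [], 0.0
    for x in signal:
        level += alpha * (abs(x) - level)
        out.append(level)
    return out


def analyse(modulator, bands):
    """Analysis half: one slowly varying envelope per band (the parameters)."""
    return [envelope(bandpass(modulator, f, bw)) for f, bw in bands]


def synthesize(carrier, envelopes, bands):
    """Synthesis half: shape the carrier's bands with the stored envelopes."""
    out = [0.0] * len(carrier)
    for (f, bw), env in zip(bands, envelopes):
        band = bandpass(carrier, f, bw)
        for i, (c, e) in enumerate(zip(band, env)):
            out[i] += c * e
    return out


# Illustrative (center Hz, bandwidth Hz) bands; real vocoders use more of them.
BANDS = [(300.0, 200.0), (700.0, 300.0), (1500.0, 500.0), (3000.0, 800.0)]
n = 4000
# Stand-in "speech": a 700 Hz tone that is on for the first half, then silent.
modulator = [math.sin(2 * math.pi * 700.0 * i / SAMPLE_RATE) if i < n // 2 else 0.0
             for i in range(n)]
# Carrier: naive sawtooth at 110 Hz, rich in harmonics.
carrier = [2.0 * ((i * 110.0 / SAMPLE_RATE) % 1.0) - 1.0 for i in range(n)]

envelopes = analyse(modulator, BANDS)
vocoded = synthesize(carrier, envelopes, BANDS)
```

The envelopes change far more slowly than the audio itself, so they could be stored or sent at a much lower rate than the waveform, which is roughly the efficiency argument behind the telecom use. Swapping the carrier for an instrument gives the familiar musical effect.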
See also:
Linear predictive coding (LPC) for speech; and PSOLA
STRAIGHT
Semi-sorted
Performance metrics
SNR
THD and THD+N
SINAD
IMD
Other metrics
Unsorted