Digital sample capture, storage, reproduction


Continuous reality and discrete digital form

Analog sound is, by its nature as variation in air pressure, continuous in both value and time: there is a value at any given time, and it varies only smoothly - if looked at in close enough detail.

Digital sampling means discreteness in both value and time - which means there can be steps, discontinuities, and such. These distinctions matter because of the sampling theorem, which says that (and how) we can go between digital and analog, and states under which conditions that process is and isn't lossless - that is, when the two forms are equivalent and when they are not.

Equidistant pressure levels and an equidistant sampling interval describe Pulse Code Modulation (PCM), which is used in places like CDs and in uncompressed audio such as the WAV format.

PCM is common, largely because it is mathematically convenient.
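To make that concrete, here is a minimal Python sketch (illustrative only - real ADCs do this in hardware, with filtering; the function names are mine) of what uniform PCM amounts to: sampling at equidistant times and rounding to equidistant integer levels:

```python
import math

def pcm_sample(signal, sample_rate, duration, bits=16):
    """Sample a continuous function at equidistant times and quantize
    each value to equidistant integer levels (uniform PCM)."""
    levels = 2 ** (bits - 1)          # signed range: -levels .. levels-1
    n = int(sample_rate * duration)
    samples = []
    for i in range(n):
        t = i / sample_rate           # equidistant sampling instants
        v = signal(t)                 # assumed to lie within -1.0 .. 1.0
        q = max(-levels, min(levels - 1, round(v * (levels - 1))))
        samples.append(q)
    return samples

# 440Hz sine, sampled at 44.1kHz for 10ms -> 441 sixteen-bit integers
data = pcm_sample(lambda t: math.sin(2 * math.pi * 440 * t), 44100, 0.01)
```

Everything a PCM file then stores is essentially that list of integers, plus the sample rate and bit depth.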

Digitization has some predictable imperfections -- which you can minimize. Usually noted:

  • the limitation of the dynamic range by quantizing (the pressure dimension, helped by the time one)
  • the possibility for frequencies to alias (the time dimension)

(see following sections)
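The aliasing part can be sketched numerically. Sampling cannot distinguish a tone at frequency f from tones at f shifted by any multiple of the sample rate, so anything above half the sample rate folds back into the 0..fs/2 band. (The folding arithmetic below is standard; the function name is mine.)

```python
def alias_frequency(f, fs):
    """Frequency that a pure tone at f appears as after sampling at fs.
    Folds f into the 0 .. fs/2 band (the Nyquist interval)."""
    f = f % fs                 # sampling cannot distinguish f from f mod fs
    return min(f, fs - f)      # ...nor a tone from its mirror image

# A 25kHz tone sampled at 44.1kHz is indistinguishable from 19.1kHz:
print(alias_frequency(25_000, 44_100))   # 19100
```

This is why input is low-pass filtered before the ADC: once a 25kHz tone has been sampled as 19.1kHz, no later processing can tell the two apart.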

Note that in digital form (in fact in almost all recorded forms) the amplitude loses real-world meaning: there is no specific pressure level that it corresponds to. We generally just tweak playback volume to levels acceptable to us.

When dealing with hardware, decibels refer only to ratios of power - more (if amplified) or less (if attenuated). This is why volume indicators (and volume sliders) have their 0dB point way at the top, and why you don't hear much once you go down to perhaps -80dB.
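As a sketch of the arithmetic: amplitude ratios are expressed as 20·log10(ratio) (power ratios would use 10·log10), so unity gain sits at the 0dB mark and attenuation goes negative:

```python
import math

def ratio_to_db(amplitude_ratio):
    """Express an amplitude ratio in decibels (20*log10 for amplitude
    ratios; power ratios would use 10*log10)."""
    return 20 * math.log10(amplitude_ratio)

print(ratio_to_db(1.0))      # 0.0  - unity gain, the top of the meter
print(ratio_to_db(0.0001))   # ~-80 - a ten-thousandth of full amplitude
```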

Things like ReplayGain seem to explain things in SPL terms, but really work in terms relative to the signal itself - which is the point anyway: making everything corrected the same way play just about as loudly as everything else.


Some sampling theory

Practical choices relating to our ears

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


for music reproduction, ~40kHz sample rate is plenty and 16 bits is enough

Sample rate

The physiology of the human cochlea means objectively-equal amplitudes of different frequencies are heard with varying loudness. Sensitivity starts falling gently in the 3kHz..15kHz range, then drops sharply. We hear very little in 16kHz..20kHz, and by 20kHz it has fallen off so much that it's hard to test.

Above 10kHz is already not considered musical (e.g. the highest voice stops around 8kHz, the highest violin around 10kHz), but that's about the base tones of these instruments - it's a good idea to also capture the overtones within the human-audible range, as we typically interpret those as clarity. This is why 10kHz..16kHz is useful even though it's very subtle to us (and very annoying if loud there).

Historically the aim was to be generous and play it safe. While ~40kHz sampling is basically enough to store 20kHz (see Nyquist), real-world designs of devices like ADCs, DACs, and filters (antialiasing or otherwise) require some leeway (and are easier to design with some more).

The number 44.1kHz was chosen to combine easily with TV standards (PAL, NTSC), since early digital audio was stored on video recorders.

48kHz was chosen later, apparently mostly to give designs a bit more leeway. (The exact reasons are unclear to me. It may also relate to TV.)

These days, it's also easier to convert to/from 96kHz should you use it.

Actually, that leeway is basically irrelevant to modern designs, because modern ADCs oversample at least 2x - not because they need the higher frequencies, but to effectively create this leeway for their own filter.

(Similarly, DACs use oversampling reconstruction[1] to make the analog filtering after them simpler)

Bit depth

Using bits for linearly-spaced amplitudes, combined with the range we typically hear within a cochlear critical band, means we'd want at least 10-12 bits.

That would e.g. be enough for FM-radio reproduction of the now-typical mastering of pop music.

Buuut that little is inflexible at best, and if anyone in the process isn't an expert, it'd probably cost some quality.

So 16 bits gives enough leeway for most any music reproduction. The number 16 comes from computers, but it also happens to give enough range for almost every use.
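The usual rule of thumb behind this: each bit of uniform quantization buys about 6dB of signal-to-noise ratio (20·log10(2) ≈ 6.02dB), plus a small constant for a full-scale sine. A quick sketch of that arithmetic:

```python
import math

def quantization_dynamic_range_db(bits):
    """Rule-of-thumb SNR of uniformly quantizing a full-scale sine:
    roughly 6.02*bits + 1.76 dB."""
    return 20 * math.log10(2) * bits + 1.76

print(round(quantization_dynamic_range_db(16), 1))  # 98.1
print(round(quantization_dynamic_range_db(24), 1))  # 146.3
```

So 16 bits gives roughly 98dB between full scale and the quantization noise floor - comfortably more than the useful range in most listening situations.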

Higher bit depths are useful only for some fairly specialized things.

For example, in recording, 24 bits rather than 16 doesn't give you a better-quality recording of the same thing - but it is practical for the person recording.

It gives you more breathing space in terms of levels - they don't have to be as perfectly tweaked for your medium. There's more headroom to store transients without distortion (if you know what you're doing), and the noise floor can be kept further away (if you know what you're doing).


Oversampling input

Oversampling, a.k.a. supersampling, refers to fetching more samples than you strictly need.

Lower sampled noise

One reason is to smooth readings because you expect a source of noise (the influence of which, when reasonably random, averages closer to zero with repeated reads), to get a slightly stabler output value. For example, you could sample a temperature sensor 100 times per second and take the average.

Note that this isn't smoothing for real-world reasons where you expect the quantity itself to vary, e.g. taking a temperature average over a minute (which can be sensible for other reasons), but for electronic reasons that are not about the real world.
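A quick simulated sketch of why this works (the sensor and its noise figure here are made up): for reasonably random noise, the spread of an n-sample average shrinks roughly like 1/sqrt(n), so averaging 100 reads cuts the noise about tenfold.

```python
import random, statistics

random.seed(0)

def noisy_read(true_value=20.0, noise_sd=0.5):
    """Hypothetical sensor: the true value plus zero-mean random noise."""
    return true_value + random.gauss(0, noise_sd)

def averaged_read(n):
    """Oversample: take n reads and average them. For roughly random
    noise, the spread of the result shrinks like 1/sqrt(n)."""
    return sum(noisy_read() for _ in range(n)) / n

single   = [noisy_read() for _ in range(1000)]
averaged = [averaged_read(100) for _ in range(1000)]
print(statistics.stdev(single))    # ~0.5
print(statistics.stdev(averaged))  # ~0.05, about 10x smaller
```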

Avoid aliasing, and/or easier input filtering

Imitate higher resolution, and lower noise

See also

Oversampling output


Sample storage

Mixing and volume


This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Downsampling and upsampling refer to converting a time series to another time series that represents the same frequencies, but as if it were sampled at a different rate.

This can be to integer multiples or integer fractions of the original sample rate, or to arbitrary other sample rates.

You typically want to express the same frequency content as accurately as possible, which makes this a nontrivial task.
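As a hint of why it's nontrivial: the obvious approach - interpolating linearly between neighboring samples - is easy to write but not band-limited, so it distorts high frequencies and, when downsampling, aliases. Real resamplers use (approximations of) sinc interpolation plus filtering. The naive version, shown only to illustrate the index mapping:

```python
def resample_linear(samples, src_rate, dst_rate):
    """Naive rate conversion by linear interpolation. Shows the index
    mapping only; a real resampler band-limits (filters) the signal,
    otherwise downsampling aliases and upsampling images."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for j in range(n_out):
        pos = j * src_rate / dst_rate      # position in source index space
        i = int(pos)
        frac = pos - i
        a = samples[i]
        b = samples[min(i + 1, len(samples) - 1)]
        out.append(a + (b - a) * frac)     # interpolate between neighbors
    return out

# 8 samples at 4kHz -> 16 samples at 8kHz
up = resample_linear([0, 1, 2, 3, 4, 5, 6, 7], 4000, 8000)
```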

'Point sampled'

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

See also