Electronics project notes/Audio notes - Digital sound communication

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

This is mostly about hardware interconnects. For software media routing, see Local and network media routing notes


Typically external

AES3 and S/PDIF

Serial, one-directional, digital audio data.


AES3 is a digital audio protocol from 1985(verify), seemingly aimed at carrying audio sampled at 44.1 kHz, 48 kHz, and also 32 kHz (e.g. from existing formats like CD audio, DAT, and some other things, and potentially from digital-only devices).

It was co-developed by AES and EBU, and at the time was often marked 'AES/EBU' on devices. You can treat AES/EBU as meaning AES3.


S/PDIF ("Sony/Philips Digital Interface") is based on AES3, and could perhaps be seen as a consumer variant of it: simpler to implement, and easier for consumers to use(verify).

Also, later expansions on the format primarily apply to S/PDIF.


From a practical standpoint, you now mostly care about S/PDIF, unless you work with some older devices.

But also, a lot of streams will be valid for both(verify).


IEC 60958 (a.k.a. IEC958 before IEC's renumbering in 1998) seems to have absorbed both, which makes historical distinctions and incompatibilities a little more interesting to figure out.

This also muddles the connector side. For example, IEC 60958 now defines:

Over XLR3 (IEC 60958 Type I), balanced
Over RCA (IEC 60958 Type II), unbalanced
Over TOSLINK (IEC 60958 Type III), optical
Over BNC, unbalanced, which was used in broadcasting (presumably in part because it could run over existing BNC/coax cabling)

Notes:

  • Broadly speaking
XLR3 and BNC are likely to be AES3 pedigree
TOSLINK and RCA are likely to be S/PDIF flavour.
(though there was a time at which it could also be AES3.(verify))
  • Even though XLR is balanced, it was still only meant for shorter distances; BNC was intended for somewhat longer runs (verify)
  • Expansions of the standard also added additional formats that earlier hardware would not know
Raw PCM is the basic form, but S/PDIF devices may also understand:
DTS for 5.1/7.1 -- more specifically the DTS Coherent Acoustics (DCA) codec
AC-3 (Dolby Digital)
Support for these will loosely correlate with the connector, in part just because newer devices use TOSLINK, not XLR/BNC


Compatibility notes:

  • Even if the plugs/converters allow it, it is not recommended to plug AES3 directly into S/PDIF without further thought.
Conversion at electrical level is not necessarily hard, and some devices are engineered to accept both, but don't count on it


ADAT

ADAT has referred to two distinct things:


Historically, and now rarely, to the Alesis Digital Audio Tape, a way of storing eight digital audio tracks on Super VHS tape.


These days, much more typically, it refers to the ADAT Optical Interface, more commonly known as ADAT Lightpipe or often just ADAT (or lightpipe), also from Alesis.

It uses the same connector and fiber as TOSLINK S/PDIF, but speaks a different protocol, at a somewhat higher bit rate.


It carries audio channels that are always 24 bit (devices that are internally 16-bit will effectively just use the 16 highest bits).
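
The "16 highest bits" remark can be illustrated with a small sketch. It assumes signed two's-complement samples, and the helper name pad_16_to_24 is made up for this illustration rather than taken from any ADAT library.

```python
def pad_16_to_24(sample_16: int) -> int:
    """Place a signed 16-bit sample in the top 16 bits of a 24-bit word.

    Hypothetical helper: the low 8 bits are simply left at zero.
    """
    if not -32768 <= sample_16 <= 32767:
        raise ValueError("not a signed 16-bit value")
    # Mask to the 16-bit two's-complement pattern, then shift up by 8 bits.
    return (sample_16 & 0xFFFF) << 8


if __name__ == "__main__":
    print(hex(pad_16_to_24(1)))   # 0x100
    print(hex(pad_16_to_24(-1)))  # 0xffff00
```

A receiver that only cares about 16 bits can shift the word right by 8 again and lose nothing.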

Its speed lets it carry

up to eight channels of those at 48 kHz


...or, with the common S/MUX extension,

up to four channels at 96 kHz
up to two channels at 192 kHz
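
The channel counts above follow from simple arithmetic, sketched below. This assumes that S/MUX spreads one high-rate channel over several 48 kHz slots, and it only handles whole multiples of 48 kHz; the function name adat_channels is hypothetical.

```python
BASE_CHANNELS = 8      # channels at the base rate, as listed above
BASE_RATE_HZ = 48_000  # base sample rate, as listed above


def adat_channels(sample_rate_hz: int) -> int:
    """Return how many channels fit at sample_rate_hz (illustrative only)."""
    if sample_rate_hz <= BASE_RATE_HZ:
        return BASE_CHANNELS
    # Assumption: each high-rate channel occupies this many base-rate slots.
    slots_per_channel, remainder = divmod(sample_rate_hz, BASE_RATE_HZ)
    if remainder or BASE_CHANNELS % slots_per_channel:
        raise ValueError("rate is not a supported multiple of the base rate")
    return BASE_CHANNELS // slots_per_channel


if __name__ == "__main__":
    for rate in (48_000, 96_000, 192_000):
        print(rate, adat_channels(rate))
```

The total sample throughput stays the same in each case; only the split between channel count and sample rate changes.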


Typically internal

I2S

(Note: no technical relation to I2C)

I²S (often written I2S, sometimes IIS), Inter-IC Sound, is meant as an easy and standard way to transfer PCM data between nearby chips, and has existed since the mid-eighties.

It separates clock and data: the bit clock line is there to hand the receiver a regular clock directly. In theory that clock could be recovered from the data instead, but not without some jitter, so I2S can have lower jitter (and indirectly lower latency) than audio buses that embed the clock in the data.



I2S doesn't specify a connector, or how to deal with longer distances (impedance and such). As such, it is mostly used within devices, with a few exceptions such as audiophile setups that want to choose their DACs separately. As I2S wasn't quite made for that, this comes with a few footnotes: impedance details can cause synchronization issues, particularly at higher bitrates. This amuses me, because these are the kind of thing audiophiles are trying to solve.


Lines and bits and interpretation

The lines are

  • data - a stream of bits
  • ground - we need a reference
  • bit clock (BCLK) (a.k.a. continuous serial clock (SCK))
BCLK pulses for each bit, so should be sample_rate * bit_depth * channel_amount, e.g. 44100 * 16 * 2 = 1411200 Hz for CD audio.
  • left-right clock (LRCLK), a.k.a. word clock, word select (WS), Frame sync (FS)
LRCLK selects left/right channel (essentially interleaved in time).
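
The clock arithmetic for the lines above can be sketched as follows. This is a minimal illustration that assumes each channel slot is exactly bit_depth bits wide (some hardware pads slots to 32 bits, in which case the slot width is what counts). The function names are hypothetical.

```python
def bclk_hz(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Bit clock rate: one pulse per bit, for every channel of every sample."""
    return sample_rate_hz * bit_depth * channels


def lrclk_hz(sample_rate_hz: int) -> int:
    """LRCLK completes one left/right cycle per sample frame."""
    return sample_rate_hz


if __name__ == "__main__":
    print(bclk_hz(44_100, 16))  # CD audio: 1411200
    print(bclk_hz(48_000, 24))  # 2304000
    print(lrclk_hz(44_100))     # 44100
```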


Some also add a master clock (MCLK).

This is not part of standard I2S(verify), and comes with some of its own notes.

Note that:

  • The protocol is fundamentally 2-channel (in part due to LRCLK's function)
If you functionally want to send mono, you could send zero on the other channel,
but if you have that sample anyway, then it makes just as much sense to output it twice, i.e. in both channels, so that
if a receiver decides to implement mono by picking one channel, it doesn't matter which one
stereo playback will be double mono rather than seeming to miss one channel
  • Sample rate is not configured, it is implicit from the sending speed(verify),
which is part of why software bit-banging I2S would probably never sound great
  • Bit depth is implied by when LRCLK switches (which it can do because the MSB goes first)
with some work left to the receiver
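
The notes above (mono sent in both channels, MSB first, bit depth implied by LRCLK) can be sketched in software. This is a simplified model, not a driver: it assumes LRCLK low selects the left channel, and it ignores the one-BCLK offset that the I2S spec puts between an LRCLK change and the first data bit. All function names are hypothetical.

```python
from typing import Iterator, List, Tuple


def mono_to_stereo(samples: List[int]) -> List[Tuple[int, int]]:
    """Duplicate each mono sample into a (left, right) frame."""
    return [(s, s) for s in samples]


def serialize(frames: List[Tuple[int, int]],
              bit_depth: int = 16) -> Iterator[Tuple[int, int]]:
    """Yield (lrclk, data_bit) pairs, one per BCLK pulse, MSB first."""
    mask = (1 << bit_depth) - 1
    for left, right in frames:
        # Assumption: LRCLK 0 marks the left slot, 1 the right slot.
        for lrclk, sample in ((0, left), (1, right)):
            word = sample & mask  # two's-complement bit pattern
            for bit in range(bit_depth - 1, -1, -1):
                yield lrclk, (word >> bit) & 1


def infer_slot_width(stream: List[Tuple[int, int]]) -> int:
    """Receiver side: count bits until LRCLK first changes."""
    first = stream[0][0]
    for index, (lrclk, _) in enumerate(stream):
        if lrclk != first:
            return index
    return len(stream)


if __name__ == "__main__":
    stream = list(serialize(mono_to_stereo([1, -1]), bit_depth=16))
    print(len(stream))               # 64 bit clocks for two stereo frames
    print(infer_slot_width(stream))  # 16
```

Because the MSB goes first, a receiver that wants fewer bits than were sent can stop reading early, and one that wants more can pad with zeros.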


Abusing I2S in DIY

Because I2S needs to go fast, hardware support usually means a dedicated peripheral, probably DMA-assisted. This means it has found other data-sending uses.

Because

channels=2
sample rate is controlled by the clock, and
bit depth is somewhat implied,

you can vary some aspects of what it sends without negotiating it.

For example, when feeding data into an I2S DAC, you do need to do the stereo interleaving as in the spec, and use the bit depth the DAC expects, but it doesn't need to know the sample rate: it will do what you ask of it, at the rate you ask it to.


For example, the ESP8266 and ESP32's I2S is actually run from a more generic piece of hardware, roughly a glorified shift register, used to implement I2S as well as LCD and camera peripherals.

It happens to go at ~1.4MHz for audio, but if you can control the output rate, then you can produce other sorts of signals, and DIYers have found it's fairly stable at 40MHz, which makes it possible to produce NTSC and VGA signals, and could even sample data at that rate.


Similarly, the RP2040 has Programmable I/O (PIO)[1] [2] [3].


You could probably send PDM over these, which would be an ironic use of something already intended for audio, but which might make sense if the receiving side isn't an I2S DAC(verify).
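
As a rough idea of what the data side of that could look like, here is a sketch that turns samples into a PDM bitstream and packs it into words for a shift-register style peripheral. It assumes a first-order sigma-delta modulator, float samples in the range -1 to 1, and 32-bit words shifted out MSB first. None of this is taken from a specific chip's API, and the function names are hypothetical.

```python
from typing import List


def pdm_encode(samples: List[float], oversample: int = 32) -> List[int]:
    """First-order sigma-delta: the density of 1 bits tracks the sample value."""
    bits = []
    integrator = 0.0
    feedback = 0.0
    for sample in samples:
        for _ in range(oversample):
            # Accumulate the error between the input and the last output level.
            integrator += sample - feedback
            bit = 1 if integrator >= 0 else 0
            feedback = 1.0 if bit else -1.0
            bits.append(bit)
    return bits


def pack_words(bits: List[int], word_bits: int = 32) -> List[int]:
    """Pack bits MSB first into words; an incomplete last word is dropped."""
    words = []
    for start in range(0, len(bits) - word_bits + 1, word_bits):
        word = 0
        for bit in bits[start:start + word_bits]:
            word = (word << 1) | bit
        words.append(word)
    return words


if __name__ == "__main__":
    print([hex(w) for w in pack_words(pdm_encode([0.0]))])  # ['0xaaaaaaaa']
    print([hex(w) for w in pack_words(pdm_encode([1.0]))])  # ['0xffffffff']
```

On the receiving end, a simple low-pass filter on the data line would recover an analog signal from such a stream, which is why this only makes sense without an I2S DAC in the way.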


MEMS

On DACs