Electronics project notes/Audio notes - Digital sound communication: Difference between revisions
Line 1: | Line 1: | ||
{{#addbodyclass:tag_tech}} | |||
{{#addbodyclass:tag_media}} | |||
{{avnotes}} | {{avnotes}} | ||
{{stub}} | {{stub}} | ||
Line 5: | Line 7: | ||
=Typically external= | =Typically external= | ||
==S/PDIF== | ==AES3 and S/PDIF== | ||
{{stub}} | |||
. | Serial, one-directional, digital audio data. | ||
'''AES3''' is a digital audio protocol from 1985{{verify}} seemingly aimed at communicating 44.1 kHz, 48 kHz, and also 32kHz (e.g. from existing formats like [[CD audio]], [[DAT]], and some other things, and potentially from digital-only devices). | |||
It was co-developed by AES and EBU, and at the time was often marked 'AES/EBU' on devices. | |||
You can treat AES/EBU as meaning AES3. | |||
'''S/PDIF''' ("Sony/Philips Digital Interface") | |||
is based on AES3, and could perhaps be seen as a consumer variant of AES3, simplifying the way it would be implemented and appear to consumers{{verify}}. | |||
Also, later expansions primarily apply to S/PDIF. | |||
From a practical standpoint, you now mostly care about S/PDIF, until you work with some older devices. | |||
But also, a lot of streams will be valid to both{{verify}} | |||
IEC 60958 {{comment|(a.k.a. IEC958 before IEC's renumbering in 1998)}} | |||
seems to have absorbed both, which makes historical distinctions and incompatibilities | |||
a little more interesting to figure out for newcomers. | |||
This also confuses the connector part. Say, IEC60958 now defines: | |||
: Over '''XLR3''' (IEC 60958 Type I), balanced | |||
: Over '''RCA''' (IEC 60958 Type II), unbalanced | |||
: Over '''TOSLINK''' (IEC 60958 Type III) | |||
: Over '''BNC''', unbalanced, was used in broadcasting (probably in part because it could be used over existing BNC/Coax) | |||
Notes: | |||
* Broadly speaking | |||
: XLR3 and BNC are likely to be AES3 pedigree | |||
: TOSLINK and BNC are likely to be S/PDIF flavour. | |||
:: (though there was a time at which it could also be AES3.{{verify}}) | |||
* Even though XLR is balanced, it was still only meant for shorter distances; BNC was intended for somewhat longer runs {{verify}} | |||
* Expansions of the standard also added additional formats, that earlier hardware would not know | |||
: Raw [[PCM]] is the basic form, but S/PDIF devices may also understand | |||
:: '''[[DTS]]''' for 5.1/7.1 -- more specifically the [[DTS Coherent Acoustics]] (DCA) codec | |||
::'''[[AC3]]''' (Dolby Surround) | |||
: will be loosely corrected with connector, in part just because newer devices use TOSLINK, not XLR/BNC | |||
Compatibility notes: | |||
* Even if the plugs/covnerters allow it, it is ''not'' recommended to plug AES3 directly into S/PDIF without further thought. | |||
: Conversion at electrical level is not necessarily hard, and some devices are engineered to accept both, but ''don't count on it'' | |||
<!-- | <!-- | ||
* both S/PDIF and AES3 transfer 24 bit words, ''but'' | |||
:: in AES3, the last 4 bits are reserved and not usable for audio | |||
:: in S/PDIF those bits are unspecified | |||
:: which is why the spec only guarantees 20 bits, but you can put audio in those four bits. | |||
:: devices may send in 16-bit (even if they work in 24-bit) | |||
Compatibility notes: | |||
* Alterations to the ''protocol'' are minor, and in particular sending just stereo PCM is largely compatible. | |||
* The subcode data is different between AES3 and S/PDIF ''but'' in practice not a lot of devices send those. | |||
: so you can sometimes get away with AES3 to S/PDIF | |||
* Some later devices made changes that earlier devices may not understand | |||
: e.g. 24-bit TOSLINK | |||
: this also matters to converters | |||
* these protocols can be run at rates other than 48000, 44100 and 32000, but don't count on devices supporting that{{verify}} | |||
* AES3 XLR3 is not electrically compatible with AES3 BNC | |||
* AES3 should not be ''assumed'' to be electrically compatible with S/PDIF copper -- | |||
:: but: can AES3 BNC be connected to S/PDIF copper?{{verify}}{{verify}} | |||
* CDROM drives might output S/PDIF data -- but often at 5V (which out of spec of direct connections, but fine internally) {{verify}} | |||
--> | --> | ||
<!-- | <!-- | ||
AES-EBU balanced is differential with an up-to-10V voltage swing -- in fact sort of similar to [[RS-422]] and you ''can'' use RS-422 interconnects. | |||
S/PDIF on copper is 0.5..1V | |||
http://lampizator.eu/lampizator/transport/spdif.html | http://lampizator.eu/lampizator/transport/spdif.html | ||
Line 57: | Line 102: | ||
<!-- | <!-- | ||
AES3 1985 (revised in 1992, 2003) | |||
IEC 60958 exists in five parts: | |||
* IEC 60958-1: General [https://webstore.iec.ch/publication/71031] | |||
:: linear PCM up to 24bit | |||
:: 1999, 2004, 2008, 2014, 2021 | |||
* IEC 60958-2: Software Information Delivery Mode | |||
:: | |||
* IEC 60958-3: Consumer applications | |||
:: 1999, 2003, 2006, 2021 {{verify}} | |||
* IEC 60958-4 (in three parts): Professional applications | |||
:: "wider range of physical media", more sampling frequencies, deprecation of "minimum implementation" of channel status data. | |||
:: 2016 | |||
* IEC 60958-5: Consumer application enhancement | |||
:: multichannel, multi-stream, high-resolution, multimedia extension | |||
:: 2021 | |||
https://webstore.iec.ch/publication/62827 | |||
: | |||
: | |||
Related: | |||
* EIAJ CP-340 1987-9 seems to be equivalent to IEC958-3:1989 ? | |||
* '''AES3id''' (a.k.a. AES-3id-1995, AES-75, AES-BNC) is specifically the unbalanced 75-ohm coax, | |||
which is compatible with S/PDIF copper, but itself ''typically'' implemented over [[BNC]] connectors. | |||
AES3id has a taste of being developed to carry audio over existing video coax on longer distances (order of 100m). | |||
Converters between the balanced and AES3id forms exist. | |||
IEC 61937 - sends things other than PCM[https://webstore.iec.ch/publication/6141], apparently AC-3, MPEG-1 (Layer 1 & 2), MPEG-3 (Layer 3), MPEG-2(multichannel), MPEG-2/4 AAC in ADTS, DTS, Dolby Digital Plus[https://learn.microsoft.com/en-us/windows/win32/coreaudio/representing-formats-for-iec-61937-transmissions] | |||
There is a spec of how to carry it over network with low latency | There is a spec of how to carry it over network with low latency | ||
--> | --> | ||
See also: | |||
* http://en.wikipedia.org/wiki/S/PDIF | |||
* http://www.lampizator.eu/LAMPIZATOR/TRANSPORT/spdif.html | |||
Not to be confused with | |||
* AES/EBU (next section) | |||
==ADAT== | ==ADAT== | ||
Line 87: | Line 171: | ||
It carries audio channels that are always 24 bit {{comment|(devices that are 16-bit will effectively just use the 16 highest bits)}}. | It carries audio channels that are always 24 bit {{comment|(devices that are internally 16-bit will effectively just use the 16 highest bits)}}. | ||
Its speed lets it carry | Its speed lets it carry | ||
Line 107: | Line 191: | ||
{{comment|(Note: no technical relation to [[I2C|I<sub>2</sub>C]])}} | {{comment|(Note: no technical relation to [[I2C|I<sub>2</sub>C]])}} | ||
I<sub>2</sub>S (sometmimes IIS), '''I'''nter-'''I'''C '''S'''ound, is meant as an easy and standard way to transfer [[PCM]] data between closeby chips. | (has existed since the mid-eighties) | ||
I<sub>2</sub>S (often I2S, sometmimes IIS), '''I'''nter-'''I'''C '''S'''ound, is meant as an easy and standard way to transfer [[PCM]] data between closeby chips. | |||
It separates clock and data, so it can have slightly lower jitter (and indirectly latency) than buses that don't. | It separates clock and data, so it can have slightly lower jitter (and indirectly latency) than buses that don't. | ||
Line 113: | Line 199: | ||
I<sub>2</sub>S doesn't spec a plug, or how to deal with longer distances (impedance and such) | |||
As such, it is mostly used within devices. | |||
Exceptions mainly being audiophile setups that want to choose their DACs separately. | Exceptions mainly being audiophile setups that want to choose their DACs separately. {{comment|as I2C wasn't quite made for that this comes with a few footnotes - impedance details can cause synchronization issues, particularly at higher bitrates}} | ||
Line 127: | Line 213: | ||
* ground | * ground | ||
BCLK pulses for each bit, so should be | BCLK pulses for each bit, so should be ''sample_rate * bit_depth * channel_amount'', e.g. 1411200 Hz for CD audio (44100*16*2). | ||
LRCLK selects left/right channel | LRCLK selects left/right channel | ||
Line 137: | Line 223: | ||
* The protocol is fundamentally 2-channel (in part due to LRCLK's function) | * The protocol is fundamentally 2-channel (in part due to LRCLK's function) | ||
: If you ''functionally'' want to send mono, you could send zero on the other. | : If you ''functionally'' want to send mono, you could send zero on the other. | ||
:: but if you have that sample anyway, then it makes just as much sense to output it twice, i.e. in both channels, so that if a receiver decides to implement mono by picking one channel, it doesn't matter which one | :: but if you have that sample anyway, then it makes just as much sense to output it twice, i.e. in both channels, so that | ||
::: if a receiver decides to implement mono by picking one channel, it doesn't matter which one | |||
::: stereo playback will be double mono rather than seeming to miss one channel. | |||
* Sample rate is not configured, it is implicit from the sending speed{{verify}}, | * Sample rate is not configured, it is implicit from the sending speed{{verify}}, | ||
Line 146: | Line 234: | ||
See also: | |||
* https://hackaday.com/2019/04/18/all-you-need-to-know-about-i2s/ | |||
* https://web.archive.org/web/20140223115501/http://www.eng.auburn.edu/~nelson/courses/elec5260_6260/Inter-IC%20Sound%20%28I2S%29%20Bus2.pdf | |||
* https://web.archive.org/web/20080706121949/http://www.nxp.com/acrobat_download/various/I2SBUS.pdf | |||
* https://en.wikipedia.org/wiki/I²S | |||
Because channels=2 | |||
===Abusing I2S in DIY=== | |||
Because I2S needs to go fast, support often means it is its own peripheral, and probably DMA-assisted. | |||
This means it has found other data-sending uses. | |||
Because | |||
: channels=2 | |||
: sample rate is controlled by the clock, and | |||
: bit depth is somewhat implied, | |||
you can vary some aspects of what it sends without negotiating it. | you can vary some aspects of what it sends without negotiating it. | ||
For example, when feeding in data into an I2S DAC, you ''do'' need to do the stereo interlacing as in the spec, and the bit depth as the DAC expects, but it ''doesn't'' need know the sample rate - it will do what you ask of it, at the rate you ask it to. | For example, when feeding in data into an I2S DAC, you ''do'' need to do the stereo interlacing as in the spec, and the bit depth as the DAC expects, but it ''doesn't'' need know the sample rate - it will do what you ask of it, at the rate you ask it to. | ||
Line 161: | Line 261: | ||
For example, the [[ESP8266]] and [[ESP32]]'s I2S is actually run from a more generic piece of hardware, roughly a glorified [[shift register]], used to implement I2S as well as LCD and camera peripherals. | For example, the [[ESP8266]] and [[ESP32]]'s I2S is actually run from a more generic piece of hardware, roughly a glorified [[shift register]], used to implement I2S as well as LCD and camera peripherals. | ||
It happens to go at ~1.4MHz for audio | It happens to go at ~1.4MHz for audio, but if you can control the output rate, then you can produce other sorts of signals, and DIYers have found it's fairly stable at 40MHz, which makes it possible to produce NTSC and VGA signals, and could even sample data at that rate. | ||
Line 168: | Line 268: | ||
You could probably send PDM over these - which would be an ironic use of something already intended for audio, but which might makes sense if the receiving side ''isn't'' an I2S DAC{{verify}}. | You could probably send PDM over these - which would be an ironic use of something already intended for audio, but which might makes sense if the receiving side ''isn't'' an I2S DAC{{verify}}. | ||
<!-- | <!-- | ||
Apparently there are some mild dialects of I2S | Apparently there are some mild dialects of I2S | ||
...which you can deal with via relatively simple code-and-config additions. | ...which you can deal with via relatively simple code-and-config additions. | ||
Line 179: | Line 276: | ||
https://github.com/esp8266/Arduino/issues/6940 | https://github.com/esp8266/Arduino/issues/6940 | ||
https://github.com/esp8266/Arduino/issues/4571 | https://github.com/esp8266/Arduino/issues/4571 | ||
Chips with native I2S support | Chips with native I2S support | ||
* RP2040 (DMA'd) {{verify}} | * RP2040 (DMA'd) {{verify}} | ||
* ESP8266 and ESP32 (DMA'd) {{verify}} | * ESP8266 and ESP32 (DMA'd) {{verify}} | ||
--> | |||
==MEMS== | |||
<!-- | |||
* | MEMS microphones seem to often either speak | ||
* [[I2S]] | |||
* a [[PDM signal]] which you can | |||
** read out digitally | |||
** ''or'' (with some lowpassing) treat as an analog signal | |||
--> | |||
==On DACs== | ==On DACs== | ||
<!-- | <!-- | ||
Depending e.g. on purpose, DACs may speak: | |||
* [[SPI]], | |||
* [[I2S]], | |||
* [[I2C]](/[[two-wire]]) | |||
* custom things like the floating point numbers most [[OPL]]s spoke | |||
* Parallel bits (e.g. DAC0800) | |||
--> | --> |
Latest revision as of 12:21, 8 May 2024
This is mostly about hardware interconnects. For software media routing, see Local and network media routing notes
Typically external
AES3 and S/PDIF
Serial, one-directional, digital audio data.
AES3 is a digital audio protocol from 1985(verify), seemingly aimed at carrying audio sampled at 44.1 kHz, 48 kHz, and also 32 kHz (e.g. from existing formats like CD audio, DAT, and some other things, and potentially from digital-only devices).
It was co-developed by AES and EBU, and at the time was often marked 'AES/EBU' on devices. You can treat AES/EBU as meaning AES3.
S/PDIF ("Sony/Philips Digital Interface")
is based on AES3, and could perhaps be seen as a consumer variant of AES3, simplifying the way it would be implemented and appear to consumers(verify).
Also, later expansions primarily apply to S/PDIF.
From a practical standpoint, you now mostly care about S/PDIF, unless you work with some older devices.
But also, a lot of streams will be valid for both(verify)
IEC 60958 (a.k.a. IEC958 before IEC's renumbering in 1998)
seems to have absorbed both, which makes historical distinctions and incompatibilities
a little more interesting to figure out for newcomers.
This also confuses the connector part. For example, IEC 60958 now defines:
- Over XLR3 (IEC 60958 Type I), balanced
- Over RCA (IEC 60958 Type II), unbalanced
- Over TOSLINK (IEC 60958 Type III)
- Over BNC, unbalanced, was used in broadcasting (probably in part because it could be used over existing BNC/Coax)
Notes:
- Broadly speaking
- XLR3 and BNC are likely to be AES3 pedigree
- TOSLINK and RCA are likely to be S/PDIF flavour.
- (though there was a time at which it could also be AES3.(verify))
- Even though XLR is balanced, it was still only meant for shorter distances; BNC was intended for somewhat longer runs (verify)
- Expansions of the standard also added additional formats, that earlier hardware would not know
- Raw PCM is the basic form, but S/PDIF devices may also understand
- DTS for 5.1/7.1 -- more specifically the DTS Coherent Acoustics (DCA) codec
- AC3 (Dolby Digital)
- Support for these will be loosely correlated with connector, in part just because newer devices use TOSLINK, not XLR/BNC
Compatibility notes:
- Even if the plugs/converters allow it, it is not recommended to plug AES3 directly into S/PDIF without further thought.
- Conversion at electrical level is not necessarily hard, and some devices are engineered to accept both, but don't count on it
See also:
- http://en.wikipedia.org/wiki/S/PDIF
- http://www.lampizator.eu/LAMPIZATOR/TRANSPORT/spdif.html
Not to be confused with
- AES/EBU (next section)
ADAT
ADAT has referred to two distinct things
Historically, and now rarely, to the Alesis Digital Audio Tape, a way of storing eight digital audio tracks onto Super VHS.
These days, and much more typically, it refers to the ADAT Optical Interface, more commonly known as ADAT Lightpipe or often just ADAT (or lightpipe), also from Alesis.
It looks the same as TOSLINK / S/PDIF, but speaks a different protocol, and somewhat faster.
It carries audio channels that are always 24 bit (devices that are internally 16-bit will effectively just use the 16 highest bits).
Its speed lets it carry
- up to eight channels of those at 48kHz.
...or, with the common S/MUX extension
- up to four channels at 96kHz
- up to two channels at 192kHz
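The channel counts above amount to a fixed sample budget being divided differently. A minimal sketch of that arithmetic, assuming the budget is exactly the eight 24-bit channels at 48kHz mentioned above (the function names are made up for illustration, and this only counts audio payload, not any framing overhead of the actual link):

```python
# Assumption: the link's audio budget is 8 channels x 48 kHz x 24 bit,
# and S/MUX simply spends several base-rate slots on one faster channel.
ADAT_BASE_CHANNELS = 8
ADAT_BASE_RATE_HZ = 48000
ADAT_BITS_PER_SAMPLE = 24


def adat_channels(sample_rate_hz):
    """Channels that fit at a given rate (a multiple of the 48 kHz base)."""
    if sample_rate_hz % ADAT_BASE_RATE_HZ != 0:
        raise ValueError("this sketch only handles multiples of 48 kHz")
    slots_per_channel = sample_rate_hz // ADAT_BASE_RATE_HZ
    return ADAT_BASE_CHANNELS // slots_per_channel


def adat_audio_payload_bps():
    """Audio payload only, in bits per second (excludes framing)."""
    return ADAT_BASE_CHANNELS * ADAT_BASE_RATE_HZ * ADAT_BITS_PER_SAMPLE


for rate in (48000, 96000, 192000):
    print(rate, adat_channels(rate))
print(adat_audio_payload_bps())
```

The payload rate is the same in all three cases, which is the whole point of the extension: more samples per second per channel, fewer channels.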
Typically internal
I2S
(Note: no technical relation to I2C)
(has existed since the mid-eighties)
I²S (often written I2S, sometimes IIS), Inter-IC Sound, is meant as an easy and standard way to transfer PCM data between nearby chips.
It separates clock and data, so it can have slightly lower jitter (and indirectly latency) than buses that don't.
I2S doesn't spec a plug, or how to deal with longer distances (impedance and such). As such, it is mostly used within devices.
Exceptions are mainly audiophile setups that want to choose their DACs separately. As I2S wasn't quite made for that, this comes with a few footnotes: impedance details can cause synchronization issues, particularly at higher bitrates.
Lines and bits and interpretation
The lines are
- bit clock (BCLK) (a.k.a. continuous serial clock (SCK))
- left-right clock (LRCLK) (a.k.a. word clock, word select (WS), Frame sync (FS))
- data
- ground
BCLK pulses for each bit, so should be sample_rate * bit_depth * channel_amount, e.g. 1411200 Hz for CD audio (44100*16*2).
LRCLK selects left/right channel
Some also add a master clock (MCLK). This is not part of standard I2S
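The BCLK relation above is easy to put into code, in both directions. This is only a sketch of the arithmetic from the text; the function names are made up for illustration, and it assumes each channel slot is exactly bit_depth clocks wide (hardware that pads each channel to a fixed slot width would need the slot width in place of the bit depth).

```python
def i2s_bclk_hz(sample_rate, bit_depth, channels=2):
    """Bit clock needed: one BCLK pulse per bit, per channel, per sample."""
    return sample_rate * bit_depth * channels


def implied_sample_rate(bclk_hz, bit_depth, channels=2):
    """The reverse: what sample rate a receiver effectively sees."""
    return bclk_hz / (bit_depth * channels)


print(i2s_bclk_hz(44100, 16))           # CD audio: 1411200
print(i2s_bclk_hz(48000, 24))           # 2304000
print(implied_sample_rate(1411200, 16)) # 44100.0
```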
Note that:
- The protocol is fundamentally 2-channel (in part due to LRCLK's function)
- If you functionally want to send mono, you could send zero on the other.
- but if you have that sample anyway, then it makes just as much sense to output it twice, i.e. in both channels, so that
- if a receiver decides to implement mono by picking one channel, it doesn't matter which one
- stereo playback will be double mono rather than seeming to miss one channel.
- Sample rate is not configured, it is implicit from the sending speed(verify),
- which is part of why software bit-banging I2S would probably never sound great
- Bit depth is implied by when LRCLK switches (which it can do because the MSB goes first)
- with some work left to the receiver
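The "output mono twice" suggestion above could look like the following when preparing a buffer for an I2S peripheral. This is a rough sketch, not any particular chip's API: the function name is hypothetical, and the layout (signed 16-bit, little-endian in memory, left slot before right slot) is an assumption that you would need to check against the peripheral or driver you actually use.

```python
import struct


def mono_to_i2s_stereo(samples):
    """Duplicate signed 16-bit mono samples into both channel slots.

    Assumed buffer layout: left then right, each little-endian int16.
    """
    frames = bytearray()
    for s in samples:
        # Same value in both slots, so a receiver that picks either
        # channel (or plays both) gets the full signal.
        frames += struct.pack("<hh", s, s)
    return bytes(frames)


buf = mono_to_i2s_stereo([1, -1])
print(len(buf))  # 4 bytes per stereo frame, so 8
```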
See also:
- https://web.archive.org/web/20140223115501/http://www.eng.auburn.edu/~nelson/courses/elec5260_6260/Inter-IC%20Sound%20%28I2S%29%20Bus2.pdf
- https://web.archive.org/web/20080706121949/http://www.nxp.com/acrobat_download/various/I2SBUS.pdf
Abusing I2S in DIY
Because I2S needs to go fast, support often means it is its own peripheral, and probably DMA-assisted. This means it has found other data-sending uses.
Because
- channels=2
- sample rate is controlled by the clock, and
- bit depth is somewhat implied,
you can vary some aspects of what it sends without negotiating it.
For example, when feeding data into an I2S DAC, you do need to do the stereo interlacing as in the spec, and the bit depth as the DAC expects, but it doesn't need to know the sample rate - it will do what you ask of it, at the rate you ask it to.
For example, the ESP8266 and ESP32's I2S is actually run from a more generic piece of hardware, roughly a glorified shift register, used to implement I2S as well as LCD and camera peripherals.
It happens to go at ~1.4MHz for audio, but if you can control the output rate, then you can produce other sorts of signals. DIYers have found it's fairly stable at 40MHz, which makes it possible to produce NTSC and VGA signals, and even to sample data at that rate.
Similarly, RP2040 has a Programmable I/O (PIO)[1] [2] [3]
You could probably send PDM over these - which would be an ironic use of something already intended for audio, but which might make sense if the receiving side isn't an I2S DAC(verify).
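To give an idea of what sending PDM through a shift-register-like peripheral would involve, here is a minimal sketch. It assumes a first-order delta-sigma modulator (the simplest way to make PDM, not necessarily what you would want for quality) and assumes the peripheral shifts out 32-bit words MSB first; both function names are made up for illustration, and no real driver API is used.

```python
def pdm_encode(samples):
    """First-order delta-sigma: floats in [-1.0, 1.0] -> list of 0/1 bits.

    One output bit per input sample, so the input should already be
    oversampled to the intended bit rate.
    """
    bits = []
    error = 0.0
    for x in samples:
        v = x + error                # input plus leftover quantization error
        bit = 1 if v >= 0 else 0
        y = 1.0 if bit else -1.0     # the level this bit represents
        error = v - y                # carry the difference to the next bit
        bits.append(bit)
    return bits


def pack_words(bits, width=32):
    """Pack bits MSB-first into integers, zero-padding the last word."""
    words = []
    for i in range(0, len(bits), width):
        chunk = bits[i:i + width]
        chunk = chunk + [0] * (width - len(chunk))
        word = 0
        for b in chunk:
            word = (word << 1) | b
        words.append(word)
    return words


silence = pdm_encode([0.0] * 32)     # alternating bits, 50% density
print(hex(pack_words(silence)[0]))   # 0xaaaaaaaa
```

The density of ones tracks the input level (a constant 0.5 input gives three ones out of every four bits), which is why lowpassing the bit stream recovers something like the original signal.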