Electronic music - notes on audio APIs
Why latency exists (the long version)
Latency and physics
Latency in the real world exists because of distance and the speed of sound.
The speed of sound is 343 meters per second.
Which is roughly 3 milliseconds per meter, so purely physically:
- a mic on a guitar cab adds maybe 1ms for the sound to physically get from speaker to mic.
- talking to someone close by is easily 5ms
- a small-to-moderate music practice space easily has maybe 15ms of delay from one wall to the other
- opposite ends of a 15m bus would be about 44ms
- halfway across a sports field is easily 100ms
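As a rough sanity check of the figures above, here is a minimal Python sketch. It assumes 343 m/s (dry air at roughly 20 °C), and the distances in the table are illustrative guesses rather than measurements; the function name is made up for this example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 degrees C


def acoustic_delay_ms(distance_m: float) -> float:
    """Time in milliseconds for sound to travel distance_m through air."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


# Illustrative distances (guesses, not measurements).
examples = {
    "mic on a guitar cab": 0.3,
    "talking to someone close by": 1.7,
    "practice space, wall to wall": 5.0,
    "opposite ends of a bus": 15.0,
    "halfway across a sports field": 35.0,
}

for label, distance_m in examples.items():
    print(f"{label} ({distance_m} m): {acoustic_delay_ms(distance_m):.1f} ms")
```

One meter comes out at about 2.9ms, which is where the "roughly 3 milliseconds per meter" figure comes from.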
So distance alone is
- why bands may well watch their drummers
- why in larger spaces you may want to use headphones instead (but not Bluetooth ones)
- one of a few reasons orchestras have conductors
Some other context:
- two frames in a 60fps game are 17ms apart
- two frames in a 24fps movie are 42ms apart
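To put those frame intervals next to the distance figures, a small sketch that converts a frame rate to a frame interval, and that interval to the distance sound would cover in the same time. The function names are hypothetical and 343 m/s is assumed for the speed of sound.

```python
def frame_interval_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def equivalent_distance_m(delay_ms: float, speed_m_per_s: float = 343.0) -> float:
    """Distance sound travels in delay_ms (assumes 343 m/s unless told otherwise)."""
    return delay_ms / 1000.0 * speed_m_per_s


for fps in (60, 24):
    interval = frame_interval_ms(fps)
    distance = equivalent_distance_m(interval)
    print(f"{fps} fps: {interval:.1f} ms per frame, about {distance:.1f} m of sound travel")
```

So one frame at 60fps is in the same ballpark as standing a few meters further away from the sound source.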
In musical context
Latency in hardware
The nature of digital audio
Why larger buffers are generally useful
When smaller buffers are required
Input, output, round-trip, software? On reported and actual latency
Measuring latency
On drivers and APIs
Windows APIs
APIs in terms of latency
APIs that exist
Some API history
On ASIO wrappers
ASIO is an API that is completely separate from the Windows sound APIs.
- 'Doing it entirely our own way because it's a known quantity that way' was a good deal of the point at the time
- Speaking ASIO to an ASIO driver that directly controls its hardware might be called native ASIO.
- Such a driver can choose to expose only the ASIO API
- ...or, sometimes, choose to expose both ASIO and a Windows API -- more flexible in theory ('you can just use it in Windows'), though it can also be somewhat more confusing in use, and ends up with a set of maybe-true, maybe-superstitions, such as
- "it's better only when windows or some other software isn't also trying to use it"
- or "don't set it as the windows out or it will refuse to open as ASIO"
- or "don't set it as the windows out or windows out will stop when you open it as ASIO"
- or something else.
Given that context, you would think that "I speak ASIO" means "I talk to specific hardware and speak ASIO to you", and that's mostly true.
ASIO wrappers speak ASIO, but have a different goal.
ASIO wrappers open a regular (non-ASIO) sound card via regular Windows sound API calls (in practice typically via WDM/KS or Core Audio's WASAPI), force some specific settings (e.g. smaller buffers, and exclusive mode where possible), then present that via the ASIO API.
Is this counter to ASIO's shortest-path-to-the-hardware principle? Yes.
Will it get you latencies better than what the underlying API could give you anyway? No.
Is there still good reason to do it? Actually yes.
"If you could get these latencies from the underlying API anyway, why add a layer?"
Convenience, mostly.
- it is easier for you to figure out the precise settings (small-buffer, possibly-exclusive) once,
- in one place - the wrapper's settings (doesn't change with how you use it), rather than for every DAW-soundcard combination you have
- that maybe you can save and restore later
- there is a clean separation between "speaks ASIO" and "does various latency-lowering tricks"
- in a way that anything that speaks ASIO can benefit from equally
- using that wrapper might make explanations a lot easier.
- Particularly towards people who care more about 'make it work decently' than about reading up on decades of idiosyncratic programming history. (Per specific DAW where applicable)
- In fact some DAWs speak mainly or only ASIO, because their approach is to tell you to either get ASIO hardware, or if not, figure out low latency in something external, and talk to that.
- case in point: FL Studio, Cubase, MAGIX supply ASIO wrappers
There are a few more useful reasons hiding in the details, like
- you can often force WASAPI cards down to maybe 5-10ms output latency without exclusive mode
- ...which means you don't have to dedicate a sound card to a DAW that only talks ASIO.
- Which is good enough for e.g. playing some piano on a laptop on the go, so pretty convenient
- some ASIO wrappers can talk to different sound cards for input and output, at the cost of slightly higher latency (will probably glitch at the lowest latencies), which DAWs talking native ASIO will typically refuse to do (for latency and glitch reasons).
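To relate figures like "5-10ms" to the buffer sizes you would set in a wrapper's panel, here is a minimal sketch of the arithmetic. It only counts the time it takes to play out the buffered samples, so treat the result as a lower bound: drivers, converters and any extra buffering add to it. The function name and the `num_buffers` parameter are made up for this example.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: int, num_buffers: int = 1) -> float:
    """Milliseconds of audio held in num_buffers buffers of `frames` sample frames each.

    This is only the buffering part of latency; real-world figures are higher.
    """
    return frames * num_buffers / sample_rate_hz * 1000.0


for sample_rate_hz in (44100, 48000):
    for frames in (64, 128, 256, 512, 1024):
        latency = buffer_latency_ms(frames, sample_rate_hz)
        print(f"{frames:5d} frames @ {sample_rate_hz} Hz: {latency:5.1f} ms")
```

At 48kHz, 256 frames is a little over 5ms and 512 frames a little under 11ms, so the 5-10ms range roughly corresponds to buffers of a few hundred frames.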
As far as I can tell
- ASIO4ALLv2 is a WDM/KS wrapper
- needs to force exclusive mode
- can talk to different sound cards for input and output
- FL Studio ASIO (a.k.a. FLASIO) is a WASAPI wrapper
- Comes with FL Studio (including its demo), and is perfectly usable in other DAWs
- can talk to different sound cards for input and output
- "Generic Low Latency ASIO Driver" is similar to ASIO4ALL but with different options
- Comes with Cubase
- MAGIX ASIO - (verify)
- ASIO2KS [1]
- ASIO2WASAPI [2]
- FlexASIO [3]
- ASIO Link
- more focused on allowing more complex routing
Some of them are multiclient - allow multiple ASIO clients to connect to the same ASIO device (at the cost of a little more latency than if you stay exclusive)
A few cases are only that, e.g. Steinberg's multiclient driver[4] seems to be nothing beyond ASIO in ASIO.
There is further software that happens to present its result as ASIO, sometimes more for options than for latency reasons
- VOICEMEETER also presents its result as ASIO
On custom ASIO drivers
In theory, you can write an ASIO driver for any hardware.
It won't do you too much good if the hardware isn't designed for low latency, it won't be one that gets installed automatically, and these days there may be less reason to do so (in that you can get similar latency with Core Audio/WASAPI).
...but e.g. the kX Project, which made a specific series of Sound Blaster cards more capable, also included ASIO support, and would let the cards do on the order of 5ms.
...and I have a handheld recorder that, while fairly cheap, happens to also act as a good low-latency ASIO sound card - when the drivers are installed, anyway.
Some audio interfaces can speak both a regular API and ASIO.
It's more accommodating, but also potentially more confusing, because you cannot talk ASIO while the regular API is in use.
And if Windows decides it's first in the list of cards, that means you need to poke settings before it does what you think it should.
"So which API is best?"
"Why are some things called loopback?"
Linux APIs
Kernel level
Higher level
OSX APIs
Because OSX was a relatively clean slate at the time, at lower levels there is primarily Core Audio (also present in iOS, slimmed down(verify)), which has been quite capable for a long while.
https://developer.apple.com/audio/
Lowering latency
In general
tl;dr
- zero latency does not exist, a few milliseconds of relative offsets happen all over the place
- amounts of added latency can matter, and can be made low
- latency matters when hearing yourself live, or syncing to something live (e.g. looper pedals)
- digital input, output, and/or processing have some latency
- In ways that are (looking at forums) usually partly misunderstood