CPU cache notes


This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

CPU caches place a small amount of faster-but-costlier SRAM (or similar) between the CPU (whose registers are faster still) and main RAM (slowish, often DRAM).

CPU caches will mirror fragments of main RAM. Whenever accesses towards main RAM can be served from cache, they are served faster.

Today that means on the order of 1 to 10ns instead of on the order of 100ns (see Computer / Speed notes), but the idea has been worth implementing in CPUs since they ran at a dozen MHz or so (verify).
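
To get a feel for what those numbers mean, here is a rough sketch of the usual weighted-average estimate. The function name and the 5ns / 100ns figures are just illustrative picks from the ranges above, and the model is simplified (it assumes a miss costs one plain RAM access and ignores multiple cache levels).

```python
def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Expected latency when a fraction `hit_rate` of accesses is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits pay the cache latency, misses pay the main RAM latency (simplified).
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


if __name__ == "__main__":
    # Assumed latencies: 5ns for cache, 100ns for main RAM.
    for rate in (0.0, 0.5, 0.9, 0.99):
        print(f"hit rate {rate:.2f}: {average_access_ns(rate, 5.0, 100.0):.1f} ns")
```

The takeaway is mostly that the average stays dominated by RAM latency until the hit rate gets quite high.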

These caches are entirely transparent, in that a user or even a programmer should not have to care about how they do their thing. You could completely ignore their presence, and arguably you shouldn't be able to control what they do at all.

As a programmer, you may still like a general idea of how they work, because designing with caches in general in mind tends to keep helping speed across CPU generations.

Optimizing for a specific CPU's cache layout, while possible, is often barely worth it, and may even prove counterproductive on other CPUs, or even on the same brand's CPUs a few years later. If you remember just one thing, make it 'small data is a little likelier to stay in cache', and even that is less true when a lot of programs are vying for CPU time.

Keeping data small can also give slightly better spatial locality for individual programs.
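
As a back-of-the-envelope illustration of the 'small data' point, this sketch counts how many values of a given width fit in a cache of some size. The 32 KiB figure is an assumption (a plausible L1 data cache size, but it varies per CPU), and `elements_that_fit` is a made-up helper name.

```python
from array import array

CACHE_BYTES = 32 * 1024  # assumed L1 data cache size; real sizes vary per CPU


def elements_that_fit(typecode, cache_bytes=CACHE_BYTES):
    """How many values of the given `array` typecode fit in `cache_bytes`."""
    # itemsize is the storage size in bytes of one element of that type.
    return cache_bytes // array(typecode).itemsize


if __name__ == "__main__":
    # 'd' is a C double, 'f' a C float, 'B' an unsigned byte.
    for code in ("d", "f", "B"):
        print(code, elements_that_fit(code))
```

Narrower elements mean more of the working set stays cached at once, though in practice the cache is shared with everything else the program (and other programs) touch.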

Other things, like branch locality, can help, but are largely up to the compiler.

A few things, such as arrays having sequential locality that e.g. trees lack, come down more to algorithm choice, but are usually out of your hands.
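
A toy model of why arrays do better here: count how many distinct cache lines a set of accesses covers. Everything in this sketch is an assumption for illustration (64-byte lines, 8-byte items, a 1 MiB 'heap', and the helper names), and it ignores prefetching and real allocator behavior.

```python
import random

LINE_BYTES = 64  # assumed cache line size; common, but not universal


def lines_touched(addresses, line_bytes=LINE_BYTES):
    """Count the distinct cache lines covered by a sequence of byte addresses."""
    return len({addr // line_bytes for addr in addresses})


def array_addresses(count, item_bytes=8, base=0):
    """Addresses of `count` items stored back to back, as in an array."""
    return [base + i * item_bytes for i in range(count)]


def scattered_addresses(count, item_bytes=8, heap_bytes=1 << 20, seed=0):
    """Addresses of `count` items at random aligned slots, roughly like
    separately allocated tree nodes (a crude stand-in, not a real allocator)."""
    rng = random.Random(seed)
    slots = rng.sample(range(heap_bytes // item_bytes), count)
    return [slot * item_bytes for slot in slots]


if __name__ == "__main__":
    print("array:    ", lines_touched(array_addresses(1000)))
    print("scattered:", lines_touched(scattered_addresses(1000)))
```

With these assumptions the array packs eight items per line, while the scattered items mostly land on a line each, so visiting them all pulls in several times as much memory.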

And in high-level, reflective, OO-style languages, you may have little control anyway.

Avoiding caches getting flushed more than necessary helps, as can avoiding cache contention, so it helps to know what that is, and why and when it happens.
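
One common form of contention is two cores writing to different variables that happen to share a cache line (often called false sharing). Assuming that is the kind meant here, the sketch below is a very simplified model of it: it only tracks which core last wrote each line and counts how often a line has to change hands. The 64-byte line size and all names are assumptions, and real coherence protocols are considerably more involved.

```python
LINE_BYTES = 64  # assumed cache line size


def ownership_transfers(writes, line_bytes=LINE_BYTES):
    """Count how often a cache line has to move between cores.

    `writes` is a sequence of (core_id, byte_address) pairs in program order.
    """
    owner = {}  # cache line number -> core that last wrote it
    transfers = 0
    for core, addr in writes:
        line = addr // line_bytes
        previous = owner.get(line)
        # A write by a different core than the last writer forces a hand-over.
        if previous is not None and previous != core:
            transfers += 1
        owner[line] = core
    return transfers


def alternating_writes(addr_a, addr_b, rounds):
    """Core 0 writes addr_a and core 1 writes addr_b, taking turns."""
    writes = []
    for _ in range(rounds):
        writes.append((0, addr_a))
        writes.append((1, addr_b))
    return writes


if __name__ == "__main__":
    # Two counters 8 bytes apart share a line; 64 bytes apart they do not.
    print("same line:", ownership_transfers(alternating_writes(0, 8, 100)))
    print("padded:   ", ownership_transfers(alternating_writes(0, 64, 100)))
```

In this model, putting the two variables on separate lines removes the hand-overs entirely, which is why padding or regrouping per-thread data is the usual remedy.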