Locality of reference

Revision as of 14:43, 8 June 2020 by Helpful (Talk | contribs)

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

In a lot of real-world programs, you can make useful assumptions about access patterns.

For example, main-memory accesses and disk accesses are each predictable in their own ways, which enables some cheap optimizations.

Locality of reference is this idea in its generic sense.

In practice we care about the more specific variants we can think up, described below.

The simplest to understand is temporal locality: the idea that if a memory location is accessed, chances are decent that the same location will be accessed again soon.

For example,

  • hardware caches like L1, L2, etc.
  • the entries kept in the MMU's TLB
  • software caches like the page cache caching disk
  • LRU caches kept in programs
  • services like memcached

Another common one is spatial locality, the idea that if a location is accessed, it is likely that the one next to it will be accessed next.

In particular, disks are often read in reasonably sized sequential chunks, so the disk and/or OS may do readahead, caching the next few megabytes. (In pathological cases this is structurally worse, so you would then disable it, but in most cases it's neutral to positive.)

RAM accesses during execution follow some more intricate patterns, so memory locality is treated as a special case. The design/layout of hardware caches like L1 and L2 is a study in itself. You can read up on why cache lines are a concept, how big they are, and how they relate to DDR bursts in current practice.
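A classic illustration (a sketch only; in Python the effect is muted because list elements are boxed objects, but in languages with flat row-major arrays like C the difference is very measurable): traversing a 2D array in the order it is laid out in memory uses every element of each fetched cache line, while traversing it transposed touches addresses a whole row apart, wasting most of each line.

```python
def sum_row_major(matrix):
    """Visit elements in storage order: consecutive accesses hit
    neighboring addresses, so each cache line is fully used
    (assuming a flat row-major layout, as in C)."""
    total = 0
    for row in matrix:
        for x in row:
            total += x
    return total

def sum_column_major(matrix):
    """Visit one element per row before moving to the next column:
    successive accesses land a full row apart, so each fetched
    cache line contributes only one element before eviction."""
    total = 0
    n_cols = len(matrix[0])
    for j in range(n_cols):
        for row in matrix:
            total += row[j]
    return total
```

Both compute the same sum; only the access order, and therefore the cache behavior, differs.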

Branch locality happens when code jumps go to one of very few targets. Those targets aren't necessarily nearby (spatial), depending on the size of the code and the number of possible outcomes.

It's more related to temporal locality: in the shorter term, demand-driven caching means the target code is likely to still be cached.

It argues for compilers trying to keep hot code close together.

(Also related to branch prediction, though that is more about keeping CPU pipelines busy)
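A hypothetical sketch of the "very few targets" idea (names and opcodes invented for illustration): an interpreter-style dispatch loop where a handful of operations dominate, so the dispatch jump overwhelmingly lands on the same few handlers - the situation branch-target caches and instruction caches reward.

```python
def run(program):
    """Tiny dispatch loop. In real bytecode interpreters a few
    opcodes typically dominate, so the dispatch jump here has
    only a few hot targets even though the handlers may not be
    near each other in memory."""
    handlers = {
        'inc': lambda a: a + 1,
        'dec': lambda a: a - 1,
        'dbl': lambda a: a * 2,
    }
    acc = 0
    for op in program:
        acc = handlers[op](acc)  # jump target: usually one of few
    return acc
```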

Sequential locality is when you can assume or detect a pattern like sequential data reading.

Spatial locality would cover part of that, but not e.g. accesses at the edges of cache lines or at larger strides.(verify)
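Detecting such a pattern can be as simple as the following sketch (class and parameter names hypothetical): if the last few requested blocks were consecutive, predict the next block as a readahead candidate, much as an OS or drive heuristic might.

```python
class SequentialDetector:
    """Tracks recent block accesses; if the last `window` accesses
    were consecutive, predicts the next block (a readahead hint)."""
    def __init__(self, window=3):
        self.window = window
        self.recent = []

    def access(self, block):
        self.recent.append(block)
        self.recent = self.recent[-self.window:]  # keep only recent history
        if len(self.recent) == self.window and all(
            b2 - b1 == 1 for b1, b2 in zip(self.recent, self.recent[1:])
        ):
            return block + 1  # sequential pattern: suggest prefetching next block
        return None  # no sequential pattern detected
```

Real implementations are more involved (they scale the readahead window up and down, and handle interleaved streams), but the core detection is this kind of "last N accesses were consecutive" check.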

See also: