Multi-dimensional array ordering
When you have to serialize matrix data onto linear storage, you can choose between two orderings:
- Row-major: elements logically adjacent within a row are adjacent in memory.
- Column-major: elements logically adjacent within a column are adjacent in memory.
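As a rough illustration of the two definitions above, here is a minimal sketch in plain Python that flattens a small 2x3 matrix both ways. The function names (row_major_offset, column_major_offset, flatten) are made up for this example, and a nested list stands in for the matrix.

```python
def row_major_offset(row, col, n_rows, n_cols):
    """Linear offset of (row, col) when whole rows are stored one after another."""
    return row * n_cols + col


def column_major_offset(row, col, n_rows, n_cols):
    """Linear offset of (row, col) when whole columns are stored one after another."""
    return col * n_rows + row


def flatten(matrix, offset_func):
    """Place each element of a nested-list matrix at the offset given by offset_func."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    flat = [None] * (n_rows * n_cols)
    for r in range(n_rows):
        for c in range(n_cols):
            flat[offset_func(r, c, n_rows, n_cols)] = matrix[r][c]
    return flat


matrix = [[1, 2, 3],
          [4, 5, 6]]

row_major = flatten(matrix, row_major_offset)        # [1, 2, 3, 4, 5, 6]
column_major = flatten(matrix, column_major_offset)  # [1, 4, 2, 5, 3, 6]
print(row_major)
print(column_major)
```

The logical matrix is the same in both cases; only the order in which its elements land in linear storage differs.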
You should only need to think about this when:
- saved data is effectively a memory dump, and may come from a source whose ordering differs from that of your code or your current platform
- you work with both orderings from the same code, e.g. when mixing C and Fortran
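To make the memory-dump case concrete, here is a small sketch of what goes wrong when a reader assumes the wrong ordering. The dump format (six little-endian float64 values, no header) and the read_matrix helper are assumptions made up for this example, not any particular file format.

```python
import struct


def read_matrix(raw, n_rows, n_cols, order):
    """Decode little-endian float64 values into a list of rows.

    order is 'row' or 'column' and names the layout the writer used.
    """
    values = struct.unpack(f"<{n_rows * n_cols}d", raw)
    if order == "row":
        return [list(values[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
    if order == "column":
        return [[values[c * n_rows + r] for c in range(n_cols)]
                for r in range(n_rows)]
    raise ValueError("order must be 'row' or 'column'")


# Pretend a column-major writer dumped the 2x3 matrix [[1, 2, 3], [4, 5, 6]].
dump = struct.pack("<6d", 1, 4, 2, 5, 3, 6)

correct = read_matrix(dump, 2, 3, "column")  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
wrong = read_matrix(dump, 2, 3, "row")       # [[1.0, 4.0, 2.0], [5.0, 3.0, 6.0]]
print(correct)
print(wrong)
```

Note that the wrong interpretation raises no error: the shape is fine and only the element positions are scrambled, which is why the ordering has to be known from documentation or metadata rather than detected.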
Most of the time you don't really have to think about this.
- Within a single language, its indexing convention takes care of that for you
- One program reading its own data from disk usually implies much the same
The potential issue comes when you exchange data with libraries or languages that use a different convention than your own language's.
Historically this was C (row-major) versus Fortran (column-major), which is why this is sometimes called C ordering and Fortran ordering.
Opinions differ even now for efficiency reasons, depending on which operations on (multi-dimensional) arrays are considered most likely. The idea is that iterating over contiguous memory will be faster, because caches fetch memory in whole lines, so spatial locality gives you a sort of implied readahead (read up on how hardware caches work).
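The locality argument can be sketched without timing anything, by listing which linear offsets two loop orders touch in a row-major array. This is only an illustration of the access pattern; the helper names (offsets_visited, jumps) are made up, and the actual speed difference depends on hardware and array size.

```python
def offsets_visited(n_rows, n_cols, outer):
    """Offsets touched in a row-major array, with the named axis as the outer loop."""
    if outer == "row":
        return [r * n_cols + c for r in range(n_rows) for c in range(n_cols)]
    return [r * n_cols + c for c in range(n_cols) for r in range(n_rows)]


def jumps(offsets):
    """Distance between each offset and the next one visited."""
    return [b - a for a, b in zip(offsets, offsets[1:])]


row_wise = offsets_visited(3, 4, "row")     # walks memory front to back
col_wise = offsets_visited(3, 4, "column")  # hops a full row length each step

print(row_wise, jumps(row_wise))
print(col_wise, jumps(col_wise))
```

With rows as the outer loop every step moves to the neighbouring element, so each fetched cache line gets used in full. With columns as the outer loop each step skips a whole row, which for large arrays can mean a different cache line on every access.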
Most modern general-purpose languages use row-major, while things like MATLAB, Octave, and statistical packages might use column-major.
Some things support or at least consider both, e.g. numpy.
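For instance, numpy lets you ask for either layout and inspect which one an array has. A short sketch, using an explicit int64 dtype so that the strides (in bytes) are the same on every platform:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.int64)   # numpy's default layout is C order
f = np.asfortranarray(a)                    # same values, column-major layout

# Logically these are the same matrix...
same_values = np.array_equal(a, f)

# ...but the bytes are arranged differently.
# strides = bytes to step to reach the next index along each axis.
print(a.flags['C_CONTIGUOUS'], a.strides)   # True (24, 8)
print(f.flags['F_CONTIGUOUS'], f.strides)   # True (8, 16)

# order='K' reads elements in the order they sit in memory.
memory_a = a.ravel(order='K').tolist()      # [1, 2, 3, 4, 5, 6]
memory_f = f.ravel(order='K').tolist()      # [1, 4, 2, 5, 3, 6]
print(memory_a, memory_f)
```

Indexing such as a[1, 2] and f[1, 2] gives the same element either way; the layout mostly matters for performance and when handing the raw buffer to other code.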
Note that the way you index such arrays, e.g. m[column][row], does not always reflect the memory layout (though frequently it does),
which makes a detailed explanation a lot more confusing than the concept really is.
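One way to see that index notation and memory layout are separate things is a numpy transpose, which is a view on the same buffer rather than a copy. A minimal sketch:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C order: each row is contiguous in memory
t = a.T                          # transposed view, nothing is moved or copied

shares = np.shares_memory(a, t)        # both names refer to the same buffer
same_element = t[2, 1] == a[1, 2]      # indices swap, the element does not move

print(shares, same_element)
print(a.flags['C_CONTIGUOUS'], t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])
```

Both a and t are indexed with the same x[i, j] syntax, yet in a the second index is the contiguous one and in t the first index is. The syntax alone does not tell you which you have.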