=CPU cache notes=
{{stub}}
 
CPU caches put a small amount of faster-but-costlier [[SRAM]] (or similar) between the CPU (whose registers are faster still) and main RAM (slowish, often DRAM).
 
 
CPU caches mirror fragments of main RAM, and remember where each fragment came from.
Whenever accesses to main RAM can be served from cache, they are served faster.
 
Today that means [[Computer / Speed notes|on the order of 1 to 10ns instead of on the order of 100ns]],
but the idea has been worth implementing in CPUs since they ran at a dozen MHz or so{{verify}}.
 
 
<!--
The above only covers the reading direction - data sometimes just happens to already be closer.
 
Because caches necessarily have to integrate into general memory management/access,
a very similar mechanism can do write caching.
: That is, instead of updating RAM and cache at the same time (a write-through cache), we can use the cache as a write-back cache: the CPU writes to the faster cache, which then updates main RAM as soon as it can (and the cache mechanism ensures you won't read stale data in the meantime). A toy sketch in C follows below.
: Write-through has less management but higher average latency.
: Write-back has more management, and can be faster with few writes - and slower with more writes.
-->
 
These caches are meant to be entirely transparent, in that a user or even a programmer should not ''have'' to care about how they do their thing -
you could completely ignore their presence, and arguably you shouldn't be able to control them directly at all.
 
 
As a programmer, you may still like a general idea of how they work, because designing with caches in mind (in general terms) tends to keep code fast for longer.
 
Optimizing for a ''specific'' CPU's cache construction, while possible, is often barely worth it, and may even prove counterproductive on other CPUs, or on the same brand's CPUs a few years later.
If you remember just one thing, make it 'small data is a little likelier to stay in cache' - and even that is less true when there are a lot of programs vying for CPU time.
 
 
Keeping data small can also give slightly better [[spatial locality]] within individual programs.
 
Other things, like branch locality, can help, but are largely up to the compiler.
 
A few things, like the fact that arrays have sequential locality that e.g. trees do not, are more down to data structure choice - though that choice is often made for you.
 
And, in high level reflective OO style languages, you may have little control anyway.
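
To make the sequential-locality point concrete, here is a small C sketch; the array size is arbitrary, and the actual speed difference depends on the specific CPU and its cache sizes:

 /* Illustration of spatial locality: both loops do the same arithmetic,
    but the first walks memory sequentially (cache-line friendly),
    while the second strides across rows (a new cache line almost every access). */
 #include <stdio.h>
 
 #define N 2048
 static double a[N][N];
 
 int main(void) {
     double sum = 0;
 
     /* row-major order: matches how C lays the array out in memory */
     for (int i = 0; i < N; i++)
         for (int j = 0; j < N; j++)
             sum += a[i][j];
 
     /* column-major order: same work, much worse locality */
     for (int j = 0; j < N; j++)
         for (int i = 0; i < N; i++)
             sum += a[i][j];
 
     printf("%f\n", sum);
     return 0;
 }
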
 
 
Avoiding caches getting flushed more than necessary helps, as can avoiding cache contention - so it helps to know what those are, and why and when they happen.
 
 
<!--
Keep in mind that hardware caches are multi-level. The point of this is latency.
That is, there is value in placing a very-low-latency cache very close (though it's pricy, so we add little of it),
and there is then value in adding another layer that is larger and somewhat higher-latency, but much more affordable per byte.
This stays sensible as long as the effective latency is lower than that of DRAM.
You may find:
: L1 per core (/HT pair)
:: often split between data and instruction caches (particularly with HT?)
: L2 shared between a few cores, or still per-core
: L3 (optional) may then be crossbarred
: and details vary with whether main RAM is accessed via QPI / HT (crossbar-ish NUMA style)
: L4 does happen, though
 
Notes:
* a CPU that uses a register file with more internal slots than exposed register names can be considered L0, though that term is a bit fuzzy
* L2 may be allowed to duplicate data that is also in L1 (which is much easier to manage), or never duplicate data from L1 - both setups exist.
 
* Multicore makes CPU caches more interesting
: because there are more caches, that all should be correct (cache coherence).
: This is managed entirely for you (these days many things use [https://en.wikipedia.org/wiki/MESI_protocol MESI]).
:: somewhat harder to design for because it's harder to understand (without diving in pretty deep)
 
 
 
 
For example, consider trying to keep things in cache for faster reading.

Reasons a thing may ''not'' be present when you want to read it:
- it's not in ''that'' cache, e.g. it may be in L2 but not in a specific core's L1, because this is the first reference ''from that core''
- enough other accesses happened that it got flushed (capacity miss, or conflict miss)
- another thread invalidated the cache line because it wrote to it (sharing miss) - see also cache models
: note that when it actually wrote to something ''else'' in the same line, this is false sharing, and purely about the granularity of this management
All moderately obvious, and you may often decide that beyond making state smaller,
and perhaps running fewer programs, there is little you can do to affect this; it's just a fact of life.
 
 
'''Cache contention''' refers to the cache system doing a lot of work that turns out not to be useful, which may slow accesses down.
 
One big cause is sharing misses becoming an issue: many writes (e.g. by many threads from the same program) going to nearby addresses (e.g. the same struct),
so that the cache system spends most of its time re-fetching the correct data,
which makes it slower than if the access patterns somehow did not cause that many fetches
(though often still faster than RAM, until you're talking dozens of threads contended on the same thing).
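
A rough C sketch of the false-sharing situation described above (the 64-byte line size and iteration count are assumptions for illustration; compile with -pthread and compare timings against the padded variant):

 /* Two threads each update their own counter, but both counters sit in the
    same cache line, so the line ping-pongs between cores. Padding to an
    (assumed) 64-byte line usually makes this markedly faster. */
 #include <pthread.h>
 #include <stdio.h>
 
 #define ITERS 100000000L
 
 struct { volatile long a; volatile long b; } shared;               /* same line      */
 struct { volatile long a; char pad[64]; volatile long b; } padded; /* separate lines */
 
 static void *bump(void *p) {
     volatile long *c = p;
     for (long i = 0; i < ITERS; i++)
         (*c)++;
     return NULL;
 }
 
 int main(void) {
     pthread_t t1, t2;
     /* contended version; swap in &padded.a / &padded.b to compare timings */
     pthread_create(&t1, NULL, bump, (void *)&shared.a);
     pthread_create(&t2, NULL, bump, (void *)&shared.b);
     pthread_join(t1, NULL);
     pthread_join(t2, NULL);
     printf("%ld %ld\n", shared.a, shared.b);
     return 0;
 }
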
 
Notes:
* it's interesting to realize that reads do not conflict with reads. Reads with writes and writes with writes do.
 
* Avoiding conflicts makes the same number of threads run more efficiently, and makes things scale better.
 
* At a higher level, cache contention can also refer to multiple levels of caches interacting badly.
 
* look at false sharing[https://en.wikipedia.org/wiki/False_sharing]
 
* coherent caches (consider per-core caches, shared L2, multiple levels of cache) make this more interesting.
: you can analyse this down to which cores share L2 caches and so invalidate less often - but that's the level of detail that may work differently a few years on
 
* GPU cache contention has its own patterns
 
 
https://www.youtube.com/watch?v=JE-jSZ8zToM
 
 
 
====Cache schemes====
 
: '''Why cache lines and slots?'''
 
A '''cache line''' is a chunk of data mirrored from memory (very typically of fixed size).
 
It makes more sense to mirror chunks of memory (e.g. dozens of bytes before the metadata you need to manage it becomes a smallish part of the cache), rather than individual locations, as the overhead is basically the same but you usually get something useful out of the assumption of [[spatial locality]] (nearby bytes tend to be accessed soon, e.g. when doing loops or copying ranges).
 
 
'''Slots''' are an implementation detail on top of lines.
 
It's a bunch of extra bits of metadata, mostly for management - which things do we still need to write, what address does this slot currently represent, which slot do we want to evict?
 
 
 
: '''How do such lines map to memory?'''
 
A software cache is often some variant of a key-based hashmap, because this means we can get to the value quickly without scanning everything we have.
 
Hardware caches are analogous, with the address being the key. However, there is a lot of overhead in a full-blown hashmap, and the entire point was to avoid as much latency as we can.
 
So we need something simpler.
 
The simplest is a '''direct-mapped cache''', in which part of the address directly decides which cache slot is used.
 
* '''fully associative''' is perhaps the most flexible - it's basically a list of (address, line) entries
: which means a lookup has to go through ''everything''
: ...though it gives higher hit rates
: You want that comparison done in parallel to keep latency down, which is complex and pricy and doesn't scale well - which is why these are basically not used
 
* '''direct mapping''' is perhaps the simplest scheme
: each location in RAM can go to just one slot in the cache
: basically, take part of the address, meaning an address will always map to the same slot
: the upside is that lookup is very fast: you have to check just one slot
: the downside is that the cache tends to be used unevenly (depending on how things sit in memory)
:: typically meaning lots of eviction in a few slots, and a lot of cache misses
:: in practice, such a cache needs to be a factor larger to give comparable performance
 
* '''n-way set associative'''
: in part a compromise between the above
: it maps any given address to just a few slots - a 2-way to 2 slots, a 4-way to 4 slots, an 8-way to 8 slots, etc.
:: you can think of it as direct mapping to a set, with full associativity within that set
: a compromise in that you do relatively few checks, yet avoid a lot of cache misses
: so it gets better hit rates out of the same amount of silicon
 
 
So which is used?
n-way, mostly. Depends on the purpose.
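
A small sketch of the address-splitting arithmetic this implies; the concrete numbers (64-byte lines, 512 sets, 8 ways) are made up for illustration and differ per CPU:

 /* How an address is carved up in a set-associative cache (illustrative only). */
 #include <stdint.h>
 #include <stdio.h>
 
 #define LINE_SIZE 64u     /* bytes per line: low bits are the offset      */
 #define NUM_SETS  512u    /* next bits pick the set                        */
 #define WAYS      8u      /* each set holds 8 lines to compare against     */
 
 int main(void) {
     uint64_t addr   = 0x7f12deadbee0ULL;
     uint64_t offset = addr % LINE_SIZE;
     uint64_t set    = (addr / LINE_SIZE) % NUM_SETS;
     uint64_t tag    = (addr / LINE_SIZE) / NUM_SETS;
 
     /* a lookup only compares 'tag' against the WAYS entries in 'set';
        direct-mapped is the WAYS == 1 special case,
        fully associative is the NUM_SETS == 1 special case */
     printf("offset %llu, set %llu, tag %#llx (%u candidates to check)\n",
            (unsigned long long)offset, (unsigned long long)set,
            (unsigned long long)tag, WAYS);
     return 0;
 }
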
 
 
 
-->
 
=On virtual memory=
{{stub}}
 
 
Virtual memory ended up doing a number of different things,
which for the most part can be explained separately.
 
 
===Intro===
<!--
A '''virtual memory system''' is one in which running code never deals ''directly'' with physical addresses.

Instead,
each task gets its own address space,
and there is some sort of translation, via a lookup table, between the addresses that the OS/programs see and the physical addresses and memory they actually go to.
 
 
No matter the addresses used within each task, they can't clash in physical memory (or rather, ''won't'' overlap until the OS specifically allows it - see shared memory).
 
There are a handful of reasons this can be useful. {{comment|(Note: this is a broad-strokes introduction that simplifies and ignores a lot of historical evolution of how we got where we are and ''why'' - a bunch of which I know I don't know)}}.
 
 
The largest of these ideas is '''protected memory''': that lookup can easily say "that is not allocated to you, ''denied''", meaning a task can never accidentally access memory it doesn't own. (Once upon a time any program could access any memory, but this has practical issues.)
 
This is useful for stability, in that a user task can't bring down a system task accidentally. Misbehaving tasks will fail in isolation.
 
It's also great for security, in that tasks can't do it intentionally - you can't read what anyone else is doing.
 
{{comment|(Note that you can have protection ''without'' virtual addresses, if you keep track of what belongs to a task. A few embedded systems opt for this because it can be a little simpler (and a little faster) without that extra step of indirection. Yet in general you get and want both.)}}
 
 
Another reason is that processes (and most programmers) don't have to think about other tasks, the OS, or their management. Say, in the DOS days, everything used the same memory space, so memory management was a more cooperative thing -- which is a pain, and one of various reasons you would run one thing at a time (with few exceptions).
 
 
There are other details, like
* an OS can effectively unify underlying changes over time, varying hardware lookup/protection implementations, with extensions and variations even in the same CPU architecture/family.
 
* it can make fragments of RAM look contiguous to a process, which makes life much easier for programmers, and has negligible effect on speed (because of the RA in RAM).
: generally the VMM does try to minimise fragmentation where possible, because too much can thrash the fixed-size TLB
 
* on many systems, the first page in a virtual address space is marked unreadable, which is how null pointer dereferences can be caught more easily/efficiently than on systems without MMU/MPUs.
 
* In practice it matters that physical/virtual mapping is something a cache system can understand. There are other solutions that are messier.
 
 
 
 
'''Lower levels'''
 
Which bits of memory belong to which task is ''managed'' by the Virtual Memory Manager (VMM) - which is software, and part of the OS kernel.
 
 
Since memory access is so central to the functioning, as well as the speed, of everything, the actual translation between virtual and physical addresses is typically offloaded to dedicated hardware, often called the MMU (Memory Management Unit), and its Translation Lookaside Buffer (TLB) is a large part of that. {{comment|(Since the TLB has a limited size, it is essentially a cache of a fuller list managed by the VMM, which is slightly slower to access. This is also why soft page faults are not errors and are actually quite normal, though still something you want to minimise.)}}
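
To illustrate the translation step itself, a minimal sketch assuming 4 KiB pages and a made-up single page-table entry; real page tables are multi-level and managed by the VMM:

 /* The core of the translation step, in miniature: split a virtual address
    into a page number (looked up in a table, cached by the TLB) and an
    offset that is carried over unchanged. 4 KiB pages assumed. */
 #include <stdint.h>
 #include <stdio.h>
 
 #define PAGE_SIZE 4096u
 
 int main(void) {
     uintptr_t vaddr  = 0x00401a2c;
     uintptr_t vpn    = vaddr / PAGE_SIZE;   /* index into the page table   */
     uintptr_t offset = vaddr % PAGE_SIZE;   /* position within the page    */
 
     /* hypothetical, tiny "page table": virtual page 0x401 -> physical frame 0x9a2 */
     uintptr_t frame = 0x9a2;
     uintptr_t paddr = frame * PAGE_SIZE + offset;
 
     printf("vpn %#lx offset %#lx -> physical %#lx\n",
            (unsigned long)vpn, (unsigned long)offset, (unsigned long)paddr);
     return 0;
 }
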
 
 
'''These days'''
 
The computer you're working on right now most likely has an MMU.
 
Some systems don't virtualise but still protect, in which case it's probably called a Memory Protection Unit (MPU). This is done on some embedded platforms (e.g. some ARMs), e.g. for real-time needs, as an MMU can in some (relatively rare) cases hold you up a few thousand cycles.
 
And some have neither - in particular simpler microcontrollers, which run just one program, and any sort of multitasking is cooperative.
 
 
 
'''Mid to high levels'''
 
Once the VMM was a thing, it allowed ideas more complex than just dividing memory.
 
This includes (and is not limited to):
* overcommitting RAM and virtual memory
* swapping / paging
* memory mapped IO
* sharing libraries between processes (read-only)
* sharing memory between processes (read/write)
* sharing memory between kernel and processes
* lazy backing of allocated memory (e.g. allocate on use, copy on write)
* system cache (particularly the disk cache) that can always yield to process allocation
 
Most such ideas ended up entangled with each other, which is what makes it hard to have a complete view on what exactly modern memory management in a modern OS is doing.  Which is fine in that it's not very important unless you're a kernel programmer, or maybe doing some ''very'' specific bug analysis.
 
Still, it helps to have a basic grasp on what's going on - even just to read out memory statistics.
 
 
 
Knowing some terms also helps make sensible distinctions, like:
: '''mapped''' memory often means 'anything a task can see, according to the VMM'
:: which includes things beyond what is uniquely allocated to just it
: '''committed''' memory often means it has guaranteed backing, be it in RAM or on disk
 
: Most systems have a distinct name between 'free as in unused' and 'free as in used for caches but can be yielded almost immediately when you ask for it'
:: how this is reported varies between OSes, and versions. For example, it seems only recent windows makes an explicit distinction between 'Free' and 'Standby'.
 
 
 
Things like 'total used memory' are not as simple as you'd think.
 
Consider:
* shared libraries can be many processes mapping just one bit of memory (read-only{{verify}})
 
* shared memory is multiple processes mapping the same memory (read-write)
 
* [[memory mapped IO]] and [[memory mapped files]]
: which are not backed by RAM so much as the VMM pretending they are there (a minimal sketch follows below)
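
A minimal sketch of a memory-mapped file on a POSIX system; "somefile" is a placeholder, and error handling is kept to a minimum:

 /* After mmap(), the file's contents appear as ordinary memory, and pages
    are only read from disk when first touched (via page faults). */
 #include <fcntl.h>
 #include <stdio.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
 #include <unistd.h>
 
 int main(void) {
     int fd = open("somefile", O_RDONLY);
     if (fd < 0) { perror("open"); return 1; }
 
     struct stat st;
     fstat(fd, &st);
 
     char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
     if (p == MAP_FAILED) { perror("mmap"); return 1; }
 
     putchar(p[0]);                /* this access may cause a (major) page fault */
 
     munmap(p, st.st_size);
     close(fd);
     return 0;
 }
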
 
 
There is a useful distinction between '''private memory''' (only available to one task) and '''shareable memory'''.
 
Shared-anything should probably be counted just once in summaries,
and e.g. memory mapped files not even once because they're just an abstraction.
 
And then there's swapping, a topic in itself.
 
And then there's overcommit, where we allow programs to ask for a little more memory
than we have storage to actively back it. Another separate topic.
 
-->
 
 
===Overcommitting RAM with disk: Swapping / paging; thrashing===
 
<!--
{{comment|(Note that what windows usually calls paging, unixen usually call swapping. In broad descriptions you can treat them the same. Once you get into the details the workings and terms do vary, and precise use becomes more important.)}}
 
 
Swapping/paging is, roughly, the idea that the VMM can have a pool of virtual memory that comes from RAM ''and'' disk.
 
This means you can allocate more total memory than would fit in RAM at the same time. {{comment|(It can be considered overcommit of RAM, though note this is ''not'' the usual/only meaning of the term overcommit, see below)}}.
 
The VMM decides which parts go from RAM to disk, when, and how much of such disk-based memory there is.
 
 
Using disk for memory seems like a bad idea, as disks are significantly slower than RAM in both bandwidth and latency.
 
Which is why the VMM will always prefer to use RAM.
 
 
There are a few reasons it can make sense:
 
* there is often some percentage of each program's memory that is inactive
: think "was copied in when starting, and then never accessed in days, and possibly never will be"
: if the VMM can adaptively move that to disk (where it will still be available if requested, just more slowly), that frees up more RAM for ''active'' programs (or caches) to use.
: not doing this means a percentage of RAM would always be entirely inactive
: doing this means slow access whenever you ''did'' need that memory after all
: this means a small amount of swap space is almost always beneficial
: it also doesn't make that much of a difference, because most things that are allocated have a purpose. '''However'''...
 
 
* there are programs that blanket-allocate RAM, and will never access part/most of it even once.
: as such, the VMM can choose to not back allocations with anything until their first use.
:: this mostly just saves some up-front work
: separately, there is a choice of how to count this not-yet used memory
:: you could choose to not count that memory at all - but that's risky, and vague
:: usually it counts towards  ''swap/page'' area  (often without ''any'' IO there)
: this means a ''bunch'' of swap space can be beneficial, even if just for bookkeeping, without ever writing to it
:: so as not to have to refuse memory that is never actually used
:: while still actually having backing if it ever is (a small demonstration follows below)
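
A small demonstration of that allocate-but-never-touch behaviour, assuming a Linux system with overcommit enabled; the 8 GiB figure is arbitrary:

 /* Ask for a lot of memory and touch almost none of it. Untouched pages
    never get physical backing; only the touched page becomes resident. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 
 int main(void) {
     size_t size = (size_t)8 * 1024 * 1024 * 1024;   /* ask for 8 GiB        */
     char *p = malloc(size);
     if (!p) { fprintf(stderr, "malloc refused\n"); return 1; }
 
     memset(p, 1, 4096);      /* touch a single page: only now is it backed  */
 
     /* resident memory stays tiny; compare VmSize and VmRSS in /proc/self/status */
     getchar();
     free(p);
     return 0;
 }
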
 
 
And yes, in theory neither would be necessary if programs behaved with perfect knowledge of other programs, of the system, and of how their data gets used.
 
In practice this usually isn't feasible, so it makes sense to do this at OS-level basically with a best-guess implementation.
 
In most cases it has a mild net positive effect, largely because both above reasons mean there's a little more RAM for active use.
 
 
Yes, it is ''partly'' circular reasoning, in that programmers now get lazy doing such bulk allocations knowing this implementation, thereby ''requiring'' such an implementation.
Doing it this way has become the most feasible because we've gotten used to thinking this way about memory.
 
 
 
Note that neither reason should impact the memory that programs actively use.
 
Moving inactive memory to disk will also rarely slow ''them'' down.
Things that periodically-but-very-infrequently do a thing may need up to a few extra seconds.
 
 
There is some tweaking to such systems
* you can usually ask for RAM that is never swapped/paged. This is important if you need to guarantee it is always accessible within a very short timespan (this can e.g. matter for real-time music production based on samples; see the sketch after this list)
 
* you can often tweak how pre-emptive swapping is
: To avoid having to swap things to disk during the next memory allocation request, it's useful to do so pre-emptively, when the system isn't too busy.
: this is usually somewhat tweakable
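
A sketch of the 'never swapped' request using POSIX mlock(); the buffer size is arbitrary, and on most systems this needs privileges or a sufficient RLIMIT_MEMLOCK:

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/mman.h>
 
 int main(void) {
     size_t size = 16 * 1024 * 1024;        /* e.g. an audio sample buffer   */
     char *buf = malloc(size);
     if (!buf) return 1;
 
     if (mlock(buf, size) != 0) {           /* pin: these pages stay in RAM  */
         perror("mlock");                   /* often EPERM/ENOMEM without privileges */
     } else {
         memset(buf, 0, size);              /* use it; accesses never hit swap */
         munlock(buf, size);
     }
     free(buf);
     return 0;
 }
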
 
 
 
 
'''Using more RAM than you have'''
 
The above raises the question of what happens when you attempt to actively use more RAM than you have.
 
This is a problem with ''and'' without swapping, with and without overcommit.
 
 
Being out of memory is a pretty large issue. Even the simplest "use what you have, deny once that's gone" system would have to just deny allocations to programs.
 
Many programs don't check every allocation, and may crash if not actually given what they ask for.  But even if they handled denied allocations perfectly elegantly, in many cases the perfect behaviour still amounts to stopping the program.
 
 
Either way, the computer is no longer able to do what you asked of it.
 
And there is an argument that it is preferable to have it continue, however slowly,
in the hope this was some intermittent bad behaviour that will be solved soon.
 
 
When you overcommit RAM with disk, this happens somewhat automatically.
And it's slow as molasses, because some of the actively used memory is now going not via microsecond-at-worst RAM but millisecond-at-best disk.
 
 
There are cases that are less bad, but the defining problem is that this happens ''continuously'' instead of sporadically.
 
This is called '''thrashing'''. If your computer suddenly started to continuously rattle its disk while being verrry slow, this is what happened.
 
 
{{comment|(This is also the number one reason why adding RAM may help a lot for a given use -- or not at all, if this was not your problem.)}}

-->
===Overcommitting (or undercommitting) virtual memory, and other tricks===
<!--
 
Consider we have a VMM system with swapping, i.e.
* all of the actively used virtual memory pages are in RAM
* infrequently used virtual memory pages are on swap
* never-used pages are counted towards swap {{comment|(does ''not'' affect the amount of allocation you can do in total)}}
 
Overcommit is a system where the last point can instead be:
* never-used pages are nowhere.
 
 
'''More technically'''
 
More technically, overcommit allows allocation of address space, without allocating memory to back it.
 
Windows makes you do both of those explicitly,
implying fairly straightforward bookkeeping,
and that you cannot do this type of overcommit.
{{comment|(note: now less true due to compressed memory{{verify}})}}
 
 
Linux implicitly  allows that separation,
basically because the kernel backs allocations only on first use {{comment|(which is also why some programs will ensure they are backed by something by storing something to all memory they allocate)}}.
 
Which is separate from overcommit; if overcommit is disabled, this merely saves some initialisation work.
But with overcommit (and similar tricks, like OSX's and Win10's compressed memory, or linux's [[zswap]]) your bookkeeping becomes more flexible.
 
Which includes the option to give out more than you have.
 
 
'''Why it can be useful'''
 
Basically, when there may be a good reason pages will ''never'' be used.
 
The difference is that without overcommit this still needs to all count towards something (swap, in practice), and that with overcommit the bookkeeping assumes you will always have a little such used-in-theory-but-never-in-practice use.
 
How wise that is depends on use.
 
 
There are two typical examples.
 
 
One is that a large process may fork().
In a simple implementation you would need twice the memory,
but in practice the two forks' pages are copy-on-write, meaning they will be shared until written to.
You still need to do bookkeeping in case that happens, but even if the child is another worker, actual use probably won't be double.
 
In the specific case where it wasn't for another copy of that program, but to instead immediately exec() a small helper program, that means the pages will ''never'' be written.
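
A minimal sketch of that fork()-then-exec() pattern; /bin/true merely stands in for a small helper program:

 /* The child duplicates the parent's (possibly huge) address space only on
    paper - pages are shared copy-on-write - and exec() replaces them before
    anything is ever written. */
 #include <stdio.h>
 #include <sys/wait.h>
 #include <unistd.h>
 
 int main(void) {
     pid_t pid = fork();            /* no physical copy of the parent's pages */
     if (pid == 0) {
         execl("/bin/true", "true", (char *)NULL);  /* child never writes them */
         _exit(127);                /* only reached if exec failed             */
     }
     if (pid > 0)
         waitpid(pid, NULL, 0);
     else
         perror("fork");
     return 0;
 }
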
 
 
The other I've seen is mentioned in the kernel docs: scientific computing that has very large, very sparse arrays.
This essentially lets such code avoid writing its own clever allocator, by relying on the linux VMM instead.
 
 
 
Most other examples arguably fall under "users/admins not thinking enough".
Consider the JVM, which has its own allocator which you give an initial and max memory figure at startup.
Since it allocates memory on demand (also initialises it{{verify}}), the user may ''effectively'' overcommit by having the collective -Xmx be more than RAM.
That's not really on the system to solve, that's just bad setup.
 
 
 
 
'''Critical view'''
 
Arguably, having enough swap makes this type of overcommit largely unnecessary, and mainly just risky.
 
The risk isn't too large, because it's paired with heuristics that disallow silly allocations,
and the oom_killer that resolves most runaway processes fast enough.
 
 
 
It's like [https://en.wikipedia.org/wiki/Overselling#Airlines overselling aircraft seats], or [https://en.wikipedia.org/wiki/Fractional-reserve_banking fractional reserve banking].
 
It's a promise that is ''less'' of a promise. That's fine (roughly for the same reasons that systems that allow swapping are not continuously thrashing), but once your end users count on it, the concept goes funny, and when everyone comes to claim what's theirs you are still screwed.
 
 
Note that windows avoids the fork() case by not having fork() at all (there's no such cheap process duplication, and in the end almost nobody cares).
 
 
Counterarguments to overcommit include that system stability should not be based on bets,
that it is (ab)use of an optimization that you should not be counting on,
that programs should not be so lazy,
and that we are actively enabling them to be lazy and behave less predictably,
and now sysadmins have to frequently figure out why that [[#oom_kill|oom_kill]] happened.
 
 
Yet it is harder to argue that overcommit makes things less stable.
 
Consider that without overcommit, memory denials are more common (and that typically means apps crashing).
 
With or without overcommit, we are currently already asking what the system's emergency response should be (and there is no obvious answer to "what do we sacrifice first") because improper app behaviour is ''already a given''.
 
 
Arguably oom_kill ''can'' be smarter, usually killing only an actually misbehaving program,
rather than a denial taking down whichever program happens to allocate next (which is more random).
 
But you don't gain much reliability either way.
 
{{comment|(In practice oom_kill can take some tweaking, because it's still possible that e.g. a mass of smaller
programs lead to the "fix" of your big database getting killed)}}
 
 
 
'''So is it better to disable it?'''
 
No, it has its beneficial cases, even if they are not central.
 
Disabling also won't prevent swapping or trashing,
as the commit limit is typically still > RAM {{comment|(by design, and you want that. Different discussion though)}}.
 
But apps shouldn't count on overcommit as a feature, unless you ''really'' know what you're doing.
 
Note that if you want to keep things in RAM, you probably want to lower [[#swappiness|swappiness]] instead.
 
 
 
 
'''Should I tweak it?'''
 
Possibly.
 
Linux has three modes:
* overcommit_memory=2: No overcommit
: userspace commit limit is swap + fraction of ram
: if that's &lt;RAM, the rest is only usable by the kernel, usually mainly for caches (which can be a useful mechanism to dedicate some RAM to the [[page cache]])
 
* overcommit_memory=1: Overcommit without checks/limits.
: Appropriate for relatively few cases, e.g. the very-sparse-array example.
: in general just more likely to swap and OOM.
 
* overcommit_memory=0: Overcommit with heuristic checks (default)
: refuses large overcommits, allows the sort that would probably reduce swap usage
 
 
 
These mainly control the maximum allocation limit for userspace programs.
This is still a fixed number, and still ''related'' to the amount of RAM, but the relation can be more interesting.
 
On windows it's plainly what you have:
swap space + RAM
 
and on linux it's:
swap space + (RAM * (overcommit_ratio/100) )
or, if you instead use overcommit_kb,
swap space + overcommit_kb {{verify}}
 
 
Also note that 'commit_ratio' might have been a better name,
because it's entirely possible to have that come out as ''less'' than RAM - undercommit, if you will.
 
This undercommit is also a sort of feature, because while that keeps applications from using it,
this ''effectively'' means it's dedicated to (mainly) kernel cache and buffers.
 
 
 
Note that the commit limit is ''how much'' it can allocate, not where it allocates from (some people assume this based on how it's calculated).
 
Yet if the commit limit is less than total RAM, applications will never be able to use all RAM.
This may happen when you have a lot of RAM and/or very little swap.
 
 
Because when you use overcommit_ratio (default is 50), the value (and sensibility) of the commit limit essentially depends on the ''ratio'' between swap space and RAM. (The same arithmetic is sketched in code below the examples.)
 
Say,
: 2GB swap, 4GB RAM, overcommit_ratio=50: commit limit at (2+0.5*4) = 4GB.
: 2GB swap, 16GB RAM overcommit_ratio=50: (2+0.5*16) = 10GB.
: 2GB swap, 256GB RAM overcommit_ratio=50: (2+0.5*256) = 130GB.
 
: 30GB swap, 4GB RAM overcommit_ratio=50: (30+0.5*4) = 32GB.
: 30GB swap, 16GB RAM overcommit_ratio=50: (30+0.5*16) = 38GB.
: 30GB swap, 256GB RAM overcommit_ratio=50: (30+0.5*256) = 158GB.
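
The same arithmetic as a tiny C helper, reproducing the example figures above; this is just the formula, not a query of the kernel's actual value (which shows up as CommitLimit in /proc/meminfo):

 #include <stdio.h>
 
 /* commit limit = swap + RAM * (overcommit_ratio / 100) */
 static double commit_limit_gb(double swap_gb, double ram_gb, int ratio) {
     return swap_gb + ram_gb * (ratio / 100.0);
 }
 
 int main(void) {
     printf("%.0f GB\n", commit_limit_gb(2, 16, 50));    /* 10 GB  */
     printf("%.0f GB\n", commit_limit_gb(2, 256, 50));   /* 130 GB */
     printf("%.0f GB\n", commit_limit_gb(4, 48, 91));    /* ~48 GB: ratio chosen to roughly cover RAM */
     return 0;
 }
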
 
 
So:
* you may consider setting overcommit_ratio higher than the default (more so if you have a lot of RAM)
: possibly close to 100% {{comment|(or use overcommit_kb instead because that's how you may be calculating it anyway)}}
: and/or more swap space.
 
* if you want to leave some dedicated to caches (which is a good idea) you have to do some arithmetic.
: For example, with 4GB swap and 48GB RAM,
:: you need ((48-4)/48=) ~91% to cover RAM,
:: and ((48-4-2)/48=) ~87% to leave ~2GB for caches.
 
* this default is why people suggest your swap area should be roughly as large as your RAM (same order of magnitude, anyway)
 
 
 
 
 
 
 
'''Should I add more RAM instead?'''
 
Possibly. It depends on your typical and peak load.
 
More RAM improves performance noticeably only when it avoids swapping under typical load.

Beyond that it helps little. It does help in that files you read get cached (see [[page cache]]),
but beyond that it has ''no'' effect.
 
 
 
 
 
 
 
 
 
Other notes:
* windows is actually more aggressive about swapping things out - it seems to do this in favour of IO caches
* linux is more tweakable (see [[#swappiness|swappiness]]) and by default is less aggressive.
 
 
* overcommit makes sense if you have significant memory you reserve but ''never'' use
: which is, in some views, entirely unnecessary
: it should probably be seen as a minor optimization, and not a feature you should (ab)use
 
 
 
Unsorted notes
* windows puts more importance on the swap file
 
* you don't really want to go without swap file/space on either windows or linux
: (more so if you turn overcommit off on linux)
 
* look again at that linux equation. That's ''not'' "swap plus more-than-100%-of-RAM"
: and note that if you have very little swap and/or tons of RAM (think >100GB), it can mean your commit limit is lower than RAM
 
* swap will not avoid oom_kill altogether - oom_kill is triggered on low speed of freeing pages {{verify}}
 
 
 
 
 
-->
 
<!--
See also:
* https://serverfault.com/questions/362589/effects-of-configuring-vm-overcommit-memory
 
* https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
 
* https://www.win.tue.nl/~aeb/linux/lk/lk-9.html
 
* http://engineering.pivotal.io/post/virtual_memory_settings_in_linux_-_the_problem_with_overcommit/
 
 
-->
 
===Page faults===
 
<!--
Consider that in a system with a VMM, applications only ever deal in virtual addresses; it is the VMM that implies translation to real backing storage.
 
 
And when doing a memory access, one of the possibilities is that the access makes sense (the page is known, and considered accessible) but cannot be served in the lightest sort of pass-through-to-RAM way.
 
A page fault, widely speaking, means "instead of direct access, the kernel needs to decide what to do now".
 
That signalling is called a page fault {{comment|(Microsoft also uses the term 'hard fault')}}.
 
Note that it's a ''signal'', caught by the OS kernel. It's called 'fault' only for historical low-level design reasons.
 
 
This can mean one of multiple things. Most of that ground is covered by the following cases:
 
 
'''Minor page fault''', a.k.a. '''soft page fault'''
: Page is actually in RAM, but not currently marked in the MMU page table (often due to its limited size{{verify}})
: resolved by the kernel updating the MMU, ''then'' just allowing the access.
:: No memory needs to be moved around.
:: Very little extra latency
: (can happen around shared memory, or around memory that has been unmapped from processes but there had been no cause to delete it just yet - which is one way to implement a page cache)
 
 
'''Major page fault''', a.k.a. '''hard page fault'''
: memory is mapped, but not currently in RAM
: i.e. mapping on request, or loading on demand -- which is how you can do overcommit and pretend there is more memory (which is quite sensible where that demand rarely happens)
: resolved by the kernel finding some free RAM (which may require swapping out another page) and loading the content into it.
:: Adds noticeable latency, namely that of your backing storage
:: the latter is sort of a "fill one hole by digging another" approach, yet this is only a real problem (thrashing) when active demand is higher than physical RAM
 
 
'''Invalid page fault'''
* memory isn't mapped, and there cannot be memory backing it
: resolved by the kernel raising a [[segmentation fault]] or [[bus error]] signal, which terminates the process
 
 
 
DEDUPE WITH ABOVE
 
 
or not currently in main memory (often meaning swapped to disk),
or it does not currently have the backing memory mapped{{verify}}.
 
Depending on case, these are typically resolved either by
* mapping the region and loading the content.
: which makes the specific memory access significantly slower than usual, but otherwise fine
 
* terminating the process
: when it failed to be able to actually fetch it
 
 
 
Reasons and responses include:
* '''minor page fault''' seems to include:{{verify}}
** MMU was not aware that the page was accessible - the kernel informs it that it is, then allows the access
** writing to a copy-on-write memory zone - the kernel copies the page, then allows the access
** writing to a page that was promised by the allocator but not yet backed - the kernel allocates backing, then allows the access
 
* mapped file - kernel reads in the requested data, then allows access
 
* '''major page fault''' refers to:
** swapped out - kernel swaps it back in, then allows access
 
* '''invalid page fault''' is basically
** a [[segmentation fault]] - send SIGSEGV (the default SIGSEGV handler kills the process)
 
 
Note that most are not errors.
 
In the case of [[memory mapped IO]], this is the designed behaviour.
 
 
Minor will often happen regularly, because it includes mechanisms that are cheap, save memory, and thereby postpone major page faults.
 
Major faults ideally happen as little as possible, because the memory access is delayed by disk IO.
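
If you want to see these counters for your own process, getrusage() reports minor and major fault counts; a minimal sketch (the 64 MiB figure is arbitrary):

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/resource.h>
 
 static void report(const char *when) {
     struct rusage ru;
     getrusage(RUSAGE_SELF, &ru);
     printf("%s: minor faults %ld, major faults %ld\n",
            when, ru.ru_minflt, ru.ru_majflt);
 }
 
 int main(void) {
     report("before");
     char *p = malloc(64 * 1024 * 1024);
     if (!p) return 1;
     memset(p, 0, 64 * 1024 * 1024);   /* first touch backs each page -> minor faults */
     report("after touching 64 MiB");
     free(p);
     return 0;
 }
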
 
-->
 
See also
* http://en.wikipedia.org/wiki/Paging
 
* http://en.wikipedia.org/wiki/Page_fault
* http://en.wikipedia.org/wiki/Demand_paging
 
===Swappiness===
{{stub}}
 
<!--
The aggressiveness with which an OS swaps out allocated-but-inactive pages to disk is often controllable.

Linux dubs this ''swappiness''. Higher swappiness means a higher tendency to swap out. {{comment|(other information is used too, including the currently mapped ratio, and a measure of how much trouble the kernel has recently had freeing up memory)}}{{verify}}
 
 
 
Swapping out is always done with cost/benefit considerations.
 
The cost is mainly the time spent,
the benefit is giving RAM to caches and to programs (and also doing some swapping now rather than later).
 
(note that linux swaps less aggressively than windows to start with, at least with default settings)
 
 
There are always pages that are inactive simply because programs very rarely use them (80/20-like access patterns).

But with plenty of free RAM it might not even swap ''those'' out, because the benefit is so low.
I had 48GB and 256GB workstations at work and people rarely got them to swap ''anything''.
 
 
 
It's a sliding scale. To illustrate this point, consider the difference between:
 
* using more RAM than we have - we will probably swap in response to every allocation
: or worse, in the case of thrashing: we are swapping purely to avoid crashing the system
: Under high memory strain, cost of ''everything'' is high, because we're not swapping to free RAM for easier future use, we're swapping to not crash the system.
 
 
* Swapping at any other time is mainly about pro-actively freeing up RAM for near-future use.
: this is IO we would otherwise have to concentrate into the next large allocation request
:: arguing for ''higher'' swappiness, because it effectively spreads that work over time
 
 
These are entirely different cases.
* The former clobbers caches, the latter builds it up
 
* the former ''is'' memory strain, the latter ''may'' lessen it in the future
: (if the peak use is still sensible, and won't trash itself along with everything else)
 
 
 
Arguments for '''lower''' swappiness:
* Delays putting things on slower disk until RAM is necessary for something else
** ...avoiding IO (this also lets drives spin down, which can matter to laptop users)
** (on the flipside, when you want to allocate memory ''and'' the system needs to swap out things first to provide that memory, it means more work, IO, and sluggishness concentrated at that time)
 
* apps are more likely to stay in memory (particularly larger ones). Over-aggressive swapout (e.g. inactivity because you went for coffee) is less likely, meaning it is slightly less likely that you have to wait for a few seconds of churning swap-in when you continue working
: not swapping out GUI programs makes them ''feel'' faster even if they don't actually run any faster
 
* When your computer has more memory than you actively use, there will be less IO caused by swapping inactive pages out and in again (but there are other factors that ''also'' make swapping less likely in such cases)
 
 
Arguments for '''higher''' swappiness seem to include{{verify}}:
* When there is low memory pressure, caches are what make (repetitive) disk access faster.
 
* keeps memory free
** spreads swap activity over time, useful when it is predictably useful later
** free memory is usable by the OS page cache
 
* swapping out rarely used pages means new applications and new allocations are served faster by RAM
: because it's less likely we have to swap other things out at allocation time
 
* allocation-greedy apps will not cause swapping so quickly, and are served more quickly themselves
 
 
 
 
'''On caches'''
 
Swappiness applies mostly to processes' memory, and not to kernel constructs like the OS page cache, dentry cache, and inode cache.
 
 
That means that swapping things out increases the amount of OS page cache we have.
 
 
From a data-caching perspective, you can see swappiness as one knob that (somewhat indirectly) controls how likely data is to sit in a process, in the OS cache, or swapped out.
 
 
 
Consider for example the case of large databases (often following some 80/20-ish locality patterns).
 
If you can make the database cache data in its own process memory, you may want lower swappiness, since that makes it more likely that needed data is still in memory.
 
 
If you ''disable'' that in-process caching of tables, you might get almost the same effect, because the space freed is instead left to the OS page cache, which may then store all the file data you read most - which can be entirely the same thing (if you have no other major programs on the host).
 
{{comment|(In some cases (often  mainly 'when nothing else clobbers it'), the OS page cache is a simple and great solution. Consider how a file server will automatically focus on the most common files, transparently hand it to multiple processes, etc.
 
Sure, for some cases you design something smarter, e.g. a LRU memcache.
 
And of course this cache is bad to count on when other things on the server start vying for the same cache (and clobbering it as far as you're concerned).
 
This also starts to matter when you fit a lot of different programs onto the same server so they start vying for limited memory.)}}
 
 
 
'''Server versus workstation'''
 
 
 
There is some difference between server and workstation.
 
Or rather, between a system that is more likely or less likely to touch the same data repeatedly,
and hence values caches. A file server typically will; other servers frequently will.
 
 
Desktop tends to see relatively random disk access so cache doesn't matter much.
 
Instead, you may care to avoid GUI programs being swapped out much,
by having ''nothing'' swap out even when approaching memory pressure.
 
This seems like micromanaging for a very specific case (you're just as badly off under actual memory pressure, and no better off when you have a lot of free RAM), but it might sometimes apply.
 
 
 
'''Actual tweaking'''
 
 
There is also:
* vm.swappiness - how strongly to prefer swapping out process pages over shrinking the page cache
* vm.vfs_cache_pressure - how aggressively to reclaim dentry and inode cache memory (default 100; higher means more aggressive)
 
 
 
 
 
 
In linux you can use proc or sysctl to check and set swappiness
cat /proc/sys/vm/swappiness
sysctl vm.swappiness
...shows you the current swappiness (a number between 0 and 100), and you can set it with something like:
echo 60 >  /proc/sys/vm/swappiness
sysctl -w vm.swappiness=60
 
 
 
 
 
 
 
 
This is '''not''' a percentage, as some people think. It's a fudgy value, and hasn't meant the same thing for all iterations of the code behind it.
 
Some kernels do little swapping for values in the range 0-60 (or 0-80, but 60 seems the more common tipping point).
 
It seems gentler tweaking is in the 20-60 range.
 
 
A value of 100 or something near it tends to make for very aggressive swapping.
 
* 0 doesn't disable swapping, but makes it pretty rare until memory pressure (which probably makes oom_kill likelier to trigger)
 
* Close to 100 is very aggressive.
 
 
1 means enabled but very light; values up to roughly 10 are still quite conservative.
 
 
Note that the meaning of the value was never very settled, and has changed between kernel versions {{comment|(for example, (particularly later) 2.6 kernels swap out more easily under the same values than 2.4)}}.
 
 
 
* If you swap to SSD, you might lower swappiness to make it live longer
: but memory use peaks will affect it more than swappiness
 
 
 
 
 
People report that
* interaction with a garbage collector (e.g. the JVM's) might lead to regular swapping
: which argues for lower swappiness

* servers:
: 10 ''may'' make sense e.g. on database servers, to favour caches
: on a dedicated machine, if what you would keep in apps can instead sit in the OS cache, it may matter little
 
* desktops:
: around &le;10 starts introducing choppiness and pauses (probably because it concentrates swapping IO to during allocation requests)
 
 
* VMs make things more interesting
 
* containers too make things more interesting
 
 
 
 
See also:
* http://lwn.net/Articles/83588/
 
* https://lwn.net/Articles/690079/
 
* https://askubuntu.com/questions/184217/why-most-people-recommend-to-reduce-swappiness-to-10-20
 
-->
 
===Practical notes===
 
====Linux====
<!--
 
It seems that *nix swapping logic is smart enough to do basic RAID-like spreading among its swap devices, meaning that a swap partition on every disk that isn't actively used (e.g. by something important like a database) is probably useful.
 
 
Swap used for hibernation can only come from a swap partition, not a swap file {{comment|(largely because that depends too much on whether the underlying filesystem is mounted)}}.
 
 
Linux allows overcommit, but real world cases vary.
It depends on three things:
* swap space
* RAM size
* overcommit_ratio (defaults to 50%)
 
When swap space is much smaller than RAM -
as it easily is on servers/workstations with dozens of GBs of RAM or more -
overcommit_ratio will easily need to be 80-90 for userspace to be able to use most/all RAM.
 
If the commit limit is lower than RAM, the rest goes (mostly) to caches and buffers.
Which, note, is often useful - sometimes it can even be preferable to effectively have a little dedicated cache.
 
-->
 
===="How large should my page/swap space be?"====
 
<!--
Depends on use.
 
Generally, the better answer is to consider:
* your active workload, what that may require of RAM in the worst case
: some things are fixed and low (lots of programs)
: some can scale up depending on use (think of editing a high res photo)
: some can make use of anything they get (caches, databases)
: some languages have their own allocators, which may pre-allocate a few GB but may never use it
: some programs are eager to allocate and less eager to clean up / garbage collect
 
* how much is inactive, so can be swapped out
: less than a GB in most cases, and a few GB in a few
 
* too much swap doesn't really hurt
 
 
This argues that throwing a few GB at it is usually more than enough,
maybe a dozen or two GB when you have hungry/careless programs.
 
 
Servers are sometimes a little different.
 
The numbers tend to be bigger - but ideally also more predictable.
 
And tweaking can make more sense because of it.
For example, when two major services each try to use 70% of RAM for caches,
they'll end up pushing each other to swap (and both incur disk latency),
and you're better off halving the size of each,
if that means never involving the disk.
 
 
Additionally:
* on linux, hibernation reserves space in swap
: so if you use this, you need to add RAM-size to the above
: doesn't apply to windows, it puts hibernation in a separate preallocated file
 
* on windows, a crash dump (when set to dump all RAM) needs the page file to be at least RAM-sized [https://support.microsoft.com/en-us/help/2860880/how-to-determine-the-appropriate-page-file-size-for-64-bit-versions-of]
: so if you use this, you need to add RAM-size to the above
 
 
 
"Your swap file needs to be 1.5x RAM size"
 
This is arbitrary.
 
As shown by that number changing over time.
I've even seen tables that also vary that factor significantly with RAM.
 
 
On basic PC use, at least 1GB is a good idea.
 
If you have a lot of RAM, you probably did that because you have memory hungry programs, and a few GB more is probably useful.
 
When RAM sizes like 2 or 4GB were typical (you know, ~2000 or so), this amounted to the same thing as 1.5x.
 
But these days, 1.5x means larger and larger amounts that will probably never be used.  Which is not in the way, and often not ''that'' much of a bite out of storage. It's not harmful, it's just pointless.
 
 
 
"Too large a page file slows down your system"
 
I don't see how it could.
 
 
The only indirect way I can think of is paging behaviour that becomes more aggressive based not on actual use but on how much of each you have. But as far as I know that's not how it works.
 
Or perhaps if your page file fills your disk space to nearly full, and contributes to fragmentation of regular files.
And even that is less relevant now that SSDs are getting typical.
 
 
 
"Page file of zero is faster because it keeps everything in memory"
 
True-ish, but not enough to matter in most cases.
 
That is, most accesses, of almost all active programs,
will not be any faster or slower, because most active use comes from RAM with or without swap enabled.
 
 
The difference is that
* the rarely-accessed stuff stays in RAM and will not be slower.
 
* you get less usable RAM, because it now holds everything that is never accessed.
: ...this reduction is sometimes significant, depending on the specific programs you run, and how much they count on inactive-swap behaviour.
 
* if swap means it's going to push things into it instead of deny allocation, then your system is more likely to recover eventually (see oom_kill), and not stop outright.
: this argues for maybe an extra RAM-size, because some programs are just that greedy
 
 
-->
 
===On memory scarcity===
 
<!--
On a RAM-only system you will find that at some point you cannot find free pages.
 
 
When you've added swap and similar features,
you may find your bookkeeping says it can be done,
but in practice it will happen very slowly.
 
 
Also, since programs are disconnected from the backing store,
only the kernel can even guess at how bad the situation is.
 
 
The most obvious case is more pages being actively used than there is physical RAM (can happen without overcommit, more easily with), but there are others. Apparently things like hot database backups may create so many [[dirty pages]] so quickly that the kernel decides it can't free anywhere near fast enough.
 
 
 
In a few cases it's due to a sudden (reasonable) influx of dirty pages, but otherwise transient.
 
But in most cases scarcity is more permanent, and means we've started swapping and probably [[thrashing]], making everything slow.
 
Such scarcity ''usually'' comes from a single careless / runaway program,
sometimes one that is just badly configured (e.g. you told more than one program it could take 80% of RAM), sometimes from a slew of (probably related) programs.
 
-->
 
 
 
=====oom_kill=====
 
<tt>oom_kill</tt> is linux kernel code that starts killing processes when there is enough memory scarcity that memory allocations cannot happen within reasonable time - as this is a good indication that we have gotten to the point of thrashing.
 
 
Killing processes sounds like a poor solution.
 
But consider that an OS can deal with completely running out of memory in roughly three ways:
* '''deny ''all'' memory allocations until the scarcity stops.'''
: This isn't very useful because
:: it will affect ''every'' program until scarcity stops
:: if the cause is one flaky program - and it usually is just one - then the scarcity may not stop
:: programs that do not actually check every memory allocation will probably crash.
:: programs that ''do'' such checks well may have no option but to stop completely (maybe pause)
: So in the best case, random applications will stop doing useful things - probably crash, and in the worst case your system will crash.
 
* '''delay memory allocations''' until they can be satisfied
: This isn't very useful because
:: this pauses all programs that need memory (they cannot be scheduled until we can give them the memory they ask for) until scarcity stops
:: again, there is often no reason for this scarcity to stop
: so typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't actually ''do'' anything")
 
* '''killing the misbehaving application''' to end the memory scarcity.
: This makes a bunch of assumptions that have to be true -- but it lets the system recover
:: assumes there ''is'' a single misbehaving process {{comment|(not always true, e.g. two programs allocating most of RAM would be fine individually, and needs an admin to configure them better)}}
::: ...usually the process with the most allocated memory, though <tt>oom_kill</tt> logic tries to be smarter than that.
:: assumes that the system has had enough memory for normal operation up to now, and that there is probably ''one'' haywire process (misbehaving or misconfigured, e.g. (pre-)allocates more memory than you have)
:: this ''could'' misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)
 
 
 
Keep in mind that
 
* oom_kill is sort of a worst-case fallback
: generally
:: if you feel the need to rely on the OOM, '''don't.'''
:: if you feel the wish to overcommit, don't
: oom_kill is meant to deal with pathological cases of misbehaviour
:: but even then might pick some random daemon rather than the real offender, because in some cases the real offender is hard to define
Tweak likely offenders, tweak your system (one per-process knob, oom_score_adj, is sketched below).
: note that you can isolate likely offenders via [[cgroups]] now.
:: and apparently oom_kill is now cgroups-aware
 
* oom_kill does not always save you.
: It seems that if your system is [[trashing]] heavily already, it may not be ''able'' to act fast enough.
: (and possibly go overboard once things do catch up)
 
* You may wish to disable oom_kill when you are developing
: ...or at least equate an oom_kill in your logs as a fatal bug in the software that caused it.
 
* If you don't have oom_kill, you may still be able to get a reboot instead, by setting the following sysctls:
vm.panic_on_oom=1
and a nonzero kernel.panic (seconds to show the message before rebooting)
kernel.panic=10
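
One related per-process knob is oom_score_adj, which ranges from -1000 (never pick this process) to 1000 (prefer it). A minimal sketch of a process lowering its own score, which needs sufficient privilege; the same can be done from a shell via /proc/&lt;pid&gt;/oom_score_adj:

 #include <stdio.h>
 
 int main(void) {
     FILE *f = fopen("/proc/self/oom_score_adj", "w");
     if (!f) { perror("oom_score_adj"); return 1; }
     fprintf(f, "-500\n");     /* make oom_kill much less likely to pick us */
     fclose(f);
     return 0;
 }
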
 
 
 
See also
* http://mirsi.home.cern.ch/mirsi/oom_kill/index.html
 
 
 
<!--
=====SLUB: Unable to allocate memory on node=====
 
SLUB is [[slab allocation]], i.e. about dynamic allocation of kernel memory
 
 
This particular warning seems most related to a bug in memory accounting.
 
 
 
It seems more likely to happen around containers with cgroup kmem accounting
(not yet stable in 3.x, and apparently there are still footnotes in 4.x),
but happens outside of those as well?
 
 
There was a kernel memory leak
 
 
-->
 
===Glossary===
<!--
 
 
Most modern memory management systems are virtual memory systems that have swapping,
combined with a processor that has memory management unit (MMU) that does much of low-level work of mapping between virtual and physical memory.
 
 
Storage hierarchy (though it's usually more like a fallback list):
* '''Main memory''' - memory wired almost directly to the CPU (contrasted with the backing store), typically referring to [http://en.wikipedia.org/wiki/DRAM DRAM]
 
* '''Backing store'''
** often means 'storage on disk' (and that you are dealing with a swapping system that calls this store '''swap space''')
** ...though in general in a VMM system, it can also refer to RAM ''or'' disk that is committed to backing virtual memory
 
 
* '''Address space''' can refer to either:
** Virtual address space - the addresses as a process sees them (usually distinct for each process)
** Physical address space - the space of valid physical addresses (sometimes: absolute addresses)
 
* '''Virtual address space''' may include
** uniquely mapped memory
** memory mapped IO
** shared memory (common for shared libraries, also used for IPC and such)
 
* '''Virtual memory''' - the meaning varies somewhat. Can refer to various related concepts:
** the concept of using swap to create more committable memory than physical memory (this not being real memory)
** a process's virtual address space
** See also [[#On virtual memory|below]].
 
 
* '''shared memory''' - the case where multiple processes intentionally map the same area of memory.
** Sometimes used to have fewer read-only copies of something in memory
** sometimes used for [[IPC|inter-process communication]].
 
 
* committed/mapped memory - virtual memory that is associated with physical memory. The distinction exists because (particularly) virtual memory systems may reserve virtual memory but not commit it until the range it comes from is accessed (because it may never be).
 
(Note: 'Mapped' is a little confusing, because 'mapping' is also a fairly obvious word to choose for the translation and lookup of addresses)
 
* '''commit limit''' - a property of the system as a whole: the amount of memory we are prepared to give to allocations. Typically ''swap + overcommit_factor * physical_ram''. Some systems do not allow overcommitting (overcommit_factor=1). If overcommit_factor &gt;1.0, we cannot back all promised memory with physical storage - but the point is that very usually, some portion of allocated memory will never be used, and for that portion it is pointless and wasteful to keep physical storage aside.
 
* '''overcommitting''' means that when you ask the kernel for a lot of memory:
** In a non-overcommitting system, the kernel answers with "No. I don't have all of that."
** In an overcommit system, the kernel answers with "Eh, we'll see how many of those pages you'll actually end up using."
** An alternative to overcommitting is adding a ''lot'' of swap space so the system can guarantee backing, with a lot of disk space that will probably never be used {{comment|(well, and RAM, but most VMMs are smart enough to consider unused pages as swapped out)}}.
** ...so depending on how applications work and allocate (...and count on overcommit logic), you may see serious amounts of overcommitment without seeing much RAM use ''or'' swapping.
** See also [[#Overcommit_or_not|the notes below]]
 
 
 
* '''paging, swapping''' - the act of moving data in and out of main memory. See also [[On swapping|below]]
* paged in / swapped in  - data that available in main memory
* paged out / swapped out - data not available in main memory (in most modern systems, this can only mean it is present in swap space)
 
* '''Thrashing''' - tends to refer to situations where there is more actively used memory than there is RAM, meaning that relatively actively used content is continuously swapped in and out, making many memory accesses disk-speed rather than RAM-speed - which makes the overall computer response very slow. The cause often means it will not stop within reasonable time.
 
* '''Resident set''' - the part of a process's memory that is currently in physical RAM.

If the total of the resident sets is more than main memory, you'll probably get thrashing.
-->
 
 
<!--
Also related:
* Page cache - recently accessed disk areas kept temporarily in memory for faster repeated access (sometimes with basic prediction, readahead, and such)
* CPU cache - a very specific (and relatively small) hardware cache between the CPU and main RAM
 
See also [[Cache and proxy notes#Real-world_caches]]
-->
 
 
<!--
===Measures in linux===
 
* VmRSS: Resident set size (RES in top?)
* VmHWM: Peak resident set size ('high water mark')
 
* VmData, VmStk, VmExe: Size of data, stack, and text segments.
 
* VmSize: Virtual memory size (VIRT in top)
* VmPeak: Peak virtual memory size.
 
* VmLck: Locked memory size.
 
* VmLib: Shared library code size.
-->
 
=On memory fragmentation=
 
==Fragmentation in general==
<!--
 
Since you don't want to manage all memory by yourself, you usually want something to provide an allocator for you, so you can ask "I would like X amount of contiguous memory now, please" and it'll figure out where it needs to come from.
 
Since an allocator will serve many such requests over time, and programs ask for fairly arbitrary-sized chunks,
over time these allocations end up somewhat arbitrarily positioned, with arbitrarily sized holes in-between.
 
These holes will serve smaller but not larger requests, and mean programs get chunks that are fairly randomly positioned in physical memory.
 
 
 
Fragmentation in general can be bad for a few reasons:
* allocation becoming slightly slower because its bookkeeping gets more complex over time
 
* slightly lower access speed due to going to more places - e.g. when you think things are in sequence but they actually come from different places in RAM
: since most memory is random-access (but there are details like hardware caches), the overhead is small and the difference is small
 
* holes that develop over time mean the speed at which memory fragments increases over time
 
 
 
 
This is basically irrelevant for physical memory, mostly because the translation between physical and virtual memory is done for us.

Physical memory can fragment all it wants; the VMM can make it look entirely linear in terms of the addressing we see, so there is almost no side effect on the RAM side.
 
It's still kept simple because the mechanism that ''does'' those RAM accesses involves a lookup table,
and it's better if most of that lookup table can stay in hardware (if interested, look for how the TLB works).
 
But all this matters more to people writing operating systems.
 
 
 
So instead, memory fragmentation typically refers to virtual address fragmentation.
 
Which refers to an app using ''its'' address space badly.
 
For example, its heap may, over time with many alloc()s and free()s, grow holes that fewer allocations can be served by, meaning some percentage of its committed memory won't often be used.
 
 
Note that even this has limited effect - due to swapping, rarely-touched pages in those holes can simply be paged out, so they mostly cost address space rather than RAM.
 
-->
 
==Slab allocation==
{{stub}}
 
 
The slab allocator manages caches of fixed-size objects.
 
Slab allocation is often used in kernels and kernel modules/drivers, which are perfectly happy allocating only uniform-sized and potentially short-lived structures - think task structures, filesystem internals, network buffers.
 
Fixed sizes, often with a separate cache for each specific type, make it easier to write an allocator that guarantees allocation within a very small timeframe (it avoids "hey, let me look at RAM and all the allocations currently in there" - you can track which slots are taken with a simple bitmask, and the cache ''cannot'' fragment internally).
 
There may also be arbitrary allocation not for specific data structures but for fixed sizes like 4K, 8K, 32K, 64K, 128K, etc, used for things that have known bounds but not precise sizes, for similar lower-time-overhead allocation at the cost of some wasted RAM.
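
To make the bitmask idea above concrete, here is a minimal sketch in C of that kind of fixed-size-object pool - not the kernel's SLAB/SLUB, just the general idea, and the names (pool_init, pool_alloc, pool_free) are made up for illustration:

<pre>
#include <stdint.h>
#include <stddef.h>

/* Minimal fixed-size-object pool: 64 slots of OBJ_SIZE bytes each,
   free/used tracked in one 64-bit bitmask (bit set = slot free).
   Just the general idea, not the kernel's SLAB/SLUB. */
#define OBJ_SIZE 64
#define SLOTS    64

typedef struct {
    uint64_t      free_mask;                 /* bit i set means slot i is free */
    unsigned char storage[SLOTS * OBJ_SIZE];
} pool_t;

static void pool_init(pool_t *p) {
    p->free_mask = ~(uint64_t)0;             /* all slots free */
}

static void *pool_alloc(pool_t *p) {
    if (p->free_mask == 0)
        return NULL;                         /* pool exhausted */
    int i = __builtin_ctzll(p->free_mask);   /* lowest free slot (GCC/Clang builtin) */
    p->free_mask &= ~((uint64_t)1 << i);
    return p->storage + (size_t)i * OBJ_SIZE;
}

static void pool_free(pool_t *p, void *ptr) {
    size_t i = ((unsigned char *)ptr - p->storage) / OBJ_SIZE;
    p->free_mask |= (uint64_t)1 << i;
}
</pre>

Allocation and free are each a few bit operations, and because every hole is the same size, the pool itself cannot fragment - which is essentially the property slab caches are after.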
 
 
Upsides:
: each such cache is easy to handle
: avoids the fragmentation that the otherwise-typical [https://en.wikipedia.org/wiki/Buddy_memory_allocation buddy system] still has, because all holes are the same size
: this makes slab allocation/free simpler, and thereby a little faster
: it is also easier to fit the objects to hardware caches
 
Limits:
: It still deals with the page allocator under the covers, so deallocation patterns can still mean that pages for the same cache become sparsely filled - which wastes space.
 
 
SLAB, SLOB, SLUB:
* SLOB: K&R allocator (1991-1999), aims to allocate as compactly as possible. But fragments faster than various others.
* SLAB: Solaris type allocator (1999-2008), as cache-friendly as possible.
* SLUB: Unqueued allocator (2008-today): Execution-time friendly, not always as cache friendly, does defragmentation (mostly just of pages with few objects)
 
 
For some indication of what's happening, look at {{inlinecode|slabtop}} and {{inlinecode|slabinfo}}
 
See also:
* http://www.secretmango.com/jimb/Whitepapers/slabs/slab.html
* https://linux-mm.org/PageAllocation
 
 
 
There are similar higher-level "I will handle things of the same type" allocators,
from custom pool allocators in C,
to object allocators in certain languages,
to arguably even just the implementation of certain data structures.
 
=Memory mapped IO and files=
{{stub}}
 
Note that
: memory mapped IO is a hardware-level construction, while
: memory mapped files are a software construction (...because files are).
 
 
 
===Memory mapped files===
 
Memory mapping of files is a technique (OS feature, system call) that pretends a file is accessible at some address in memory.
 
When the process accesses those memory locations, the OS will scramble for the actual contents from disk.
 
Whether this will then be cached depends a little on the OS and details{{verify}}.
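
For a concrete idea, here is a minimal POSIX-flavoured sketch in C (the filename is just an example; Windows would use CreateFileMapping/MapViewOfFile instead) that maps a file and then reads it through plain pointer accesses:

<pre>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    const char *path = "example.dat";            /* example filename */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Ask the OS to pretend the file contents sit at some address. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* Plain memory accesses; the OS pages the file in on demand. */
    size_t newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (data[i] == '\n')
            newlines++;
    printf("%zu newlines\n", newlines);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}
</pre>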
 
 
====For caching====
 
In e.g. linux you get that interaction with the page cache, and the data is and stays cached as long as there is RAM for it.
 
 
This can also save memory, compared to the easy choice (''without'' memory mapping)
of manually caching the entire thing in your process.
 
With mmap you may cache only the parts you use, and if multiple processes want this file, you may avoid a little duplication.
 
 
The fact that the OS can flush most or all of this data can be seen as a limitation or a feature - it's not always predictable, but it does mean you can deal with large data sets without having to think about very large allocations, and how those aren't nice to other apps.
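
If you do want to nudge the OS about what to keep or drop, POSIX systems also let you hint at the access pattern of a mapping. A small sketch of such hints in C (advisory only - the kernel is free to ignore them, and the function names here are just wrappers made up for illustration):

<pre>
#include <stddef.h>
#include <sys/mman.h>

/* Hints for a region previously obtained from mmap().
   Purely advisory: the kernel may act on them or ignore them. */

static void hint_sequential(void *addr, size_t length) {
    /* "we'll read this roughly front to back" - encourages readahead */
    madvise(addr, length, MADV_SEQUENTIAL);
}

static void hint_done_with(void *addr, size_t length) {
    /* "we no longer need these pages resident" - the kernel may drop them;
       for a file-backed mapping the data is still on disk */
    madvise(addr, length, MADV_DONTNEED);
}
</pre>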
 
 
 
====shared memory via memory mapped files====
 
Most kernel implementations allow multiple processes to mmap the same file -- which effectively shares memory, and is probably one of the simplest ways to share it in a [http://en.wikipedia.org/wiki/Protected_mode protected mode] system.
{{comment|(Some methods of [[Inter-Process communication]] work via mmapping)}}
 
 
Not clobbering each other's memory is still something you need to do yourself.
 
The implementation, limitations, and method of use varies per OS / kernel.
 
Often relies on [http://en.wikipedia.org/wiki/Demand_paging demand paging] to work.
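
As a rough POSIX-flavoured sketch in C of the idea (the filename is made up, fork() is only used to keep it in one program - unrelated processes opening and mapping the same file get the same effect - and real code would want synchronization on top):

<pre>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* A regular file used purely as shared backing store (example name). */
    int fd = open("shared.dat", O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    /* MAP_SHARED: writes through this mapping are visible to
       every other process that maps the same file. */
    char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    if (fork() == 0) {                    /* child: write into the mapping */
        strcpy(shared, "hello from the child");
        return 0;
    }
    wait(NULL);                           /* parent: wait, then read it back */
    printf("parent sees: %s\n", shared);

    munmap(shared, 4096);
    close(fd);
    return 0;
}
</pre>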
 
===Memory mapped IO===
 
Memory mapped IO maps devices into the memory space (statically or dynamically), meaning that memory accesses to those areas are actually backed by IO accesses (...that you can typically also do directly).
 
This mapping is made and resolved at hardware level, and only works for [[DMA]]-capable devices (which many are).
 
It seems to often be done to have a simple generic interface {{verify}} - it means drivers and software can avoid many hardware-specific details.
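
Normally only drivers touch these mappings, but as an illustration of "memory accesses backed by a device": on Linux, a privileged process can map a physical region through /dev/mem and read it like memory. The address below is a made-up placeholder, and many kernels restrict /dev/mem, so treat this purely as a sketch:

<pre>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical device register block - replace with a real, documented,
       page-aligned physical address before this means anything. */
    const off_t phys_addr = 0x40000000;

    int fd = open("/dev/mem", O_RDONLY | O_SYNC);   /* needs root */
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    volatile uint32_t *regs = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, phys_addr);
    if (regs == MAP_FAILED) { perror("mmap"); return 1; }

    /* Looks like a memory read, but is actually backed by the device. */
    printf("register 0 reads as 0x%08x\n", (unsigned)regs[0]);

    munmap((void *)regs, 4096);
    close(fd);
    return 0;
}
</pre>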
 
 
See also:
* http://en.wikipedia.org/wiki/Memory-mapped_I/O
 
 
 
[[Category:Programming]]
 
===DMA===
{{stub}}
 
Direct Memory Access comes down to additional hardware that can be programmed to copy bytes from one memory address to another,
meaning the CPU doesn't have to do this.
 
DMA is independent enough at hardware level that its transfers can work at high clocks and throughputs (and without interrupting other work), comparable to CPU copies (the CPU may be faster if it was otherwise idle; when it is not, the extra context switching may slow things down while the DMA transfer is comparatively free. Details vary with specific designs, though).
 
 
Transfers tend to happen in smaller chunks, each triggered by a DRQ {{comment|(similar in concept to IRQs, but triggering only a smallish copy, rather than arbitrary code)}}, so that the whole transfer can be coordinated piece by piece.
 
The details look intimidating at first, but mostly because they are low-level.
The idea is actually ''relatively'' simple.
 
 
 
Aside from memory-to-memory use, it also allows memory-to-peripheral copies (if a specific supporting device is [[memory mapped]]{{verify}}).
 
 
 
<!--
Presumably DMA hardware grew more generic over time, to be a more controllable subsystem
 
-->
 
 
<!--
 
Consider e.g. I2S output, which
: you just couldn't (easily) get to be regular without it being independent.
:
 
 
There's an interesting discussion of how this works in an STM32 microcontroller at http://cliffle.com/blog/pushing-pixels/
 
 
http://en.wikipedia.org/wiki/Direct_memory_access
-->
 
=Memory limits on 32-bit and 64-bit machines=
{{stub}}
 
 
tl;dr:
* If you want to use significantly more than 4GB of RAM, you want a 64-bit OS.
* ...and since that is now typical, most of the details below are irrelevant
 
 
 
TODO: the distinction between (effects from) physical and virtual memory addressing should be made clearer.
 
<!--
Physical memory addressing
* not so v
* is complicated by the device hole
 
Virtual memory:
* means per-process page tables (virtual-physical mapping, managed by the OS, consulted by the processor)
* means even the kernel and its helpers has to be mapped this way
* {{comment|(for stability/security reasons we want to protect the kernel from accesses, so)}} there is a kernel/user split - that is, the OS reserves a virtual address range (often at 3GB or 2GB)
 
Things neither of those directly draws in but do affect (and often vary by OS):
* memory mapping
* shared libraries
 
-->
 
===Overall factoids===
 
'''OS-level and hardware-level details:'''
 
From the '''I want my processes to map as much as possible''' angle:
* the amount of memory ''a single process'' could hope to map is typically limited by its pointer size, so ~4GB on 32-bit OS, 64-bit (lots) on a 64-bit OS.
:: Technically this could be entirely about the OS, but in reality it is tied intimately to what the hardware natively does, because anything else would be ''slooow''.
 
* Most OS kernels have a split (for their own ease) that means that of the area a program can map, less is allocatable - perhaps 3GB, 2GB, sometimes even 1GB
: this is partly a pragmatic implementation detail from back when 32 ''mega''bytes was a ''lot'' of memory, and a leftover ever since
 
 
* Since the OS is in charge of virtual memory, it ''can'' map each process to memory separately, so in theory you can host multiple 32-bit processes that ''together'' use more than 4GB
: ...even on 32-bit OSes: you can for example compile the 32-bit linux kernel to use up to 64GB this way
:: a 32-bit OS can only do this through '''PAE''', which has to be supported and enabled in the motherboard, and supported and enabled in the OS.
:: Note: both 32-bit and 64-bit PAE-supporting motherboards ''may'' have somewhat strange limitations, e.g. the amount of memory they will actually allow/support {{comment|(mostly a problem in early PAE motherboards)}}
:: and PAE was problematic anyway in practice - e.g. drivers ''had'' to support it. It was eventually disabled in consumer windows (XP) for this reason. In the end it was mostly seen in servers, where the details were easier to oversee.
 
 
* device memory maps would take mappable memory away from within each process, which for 32-bit OSes would often mean that you couldn't use all of that installed 4GB
 
 
 
 
'''On 32-bit systems:'''
 
Process-level details:
* No ''single'' 32-bit process can ever map more than 4GB as addresses are 32-bit byte-addressing things.
 
* A process's address space has reserved parts, to map things like shared libraries, which means a single app can actually ''allocate'' less (often by at most a few hundred MBs) than what it can map{{verify}}. Usually no more than ~3GB can be allocated, sometimes less.
 
 
 
'''On 64-bit systems:'''
* none of the potentially annoying limitations that 32-bit systems have apply
: (assuming you are using a 64-bit OS, and not a 32-bit OS on a 64-bit system).
 
* The architecture lets you map 64-bit addresses
: ...in theory, anyway. The instruction set is set up for 64-bit everything, but current x86-64 implementations' address lines are 48-bit (for 256TiB), mainly because that can be increased later without breaking compatibility, and right now it saves copper and silicon that 99% of computers won't use
: ...because in practice it's still more than you can currently physically put in most systems. {{comment|(there are a few supercomputers for which this matters, but arguably even there it's not so important because horizontal scaling is ''generally'' more useful than vertical scaling. But there are also a few architectures designed with a larger-than-64-bit addressing space)}}
 
 
On both 32-bit (PAE) and 64-bit systems:
* Your motherboard may have assumptions/limitations that impose some lower limits than the theoretical one.
 
* Some OSes may artificially impose limits (particularly the more basic versions of Vista seem to do this{{verify}})
 
 
 
Windows-specific limitations:
* 32-bit Windows XP (since SP2) gives you '''no PAE memory benefits'''. You may still be using the PAE version of the kernel if you have DEP enabled (no-execute page protection) since that requires PAE to work{{verify}}, but PAE's memory upsides are '''disabled''' {{comment|(to avoid problems with certain buggy PAE-unaware drivers, possibly for other reasons)}}
 
* 64-bit Windows XP: ?
 
* the /3GB switch moves the user/kernel split, but for a single process to map more than 2GB it must be 3GB-aware
 
* Vista: different versions have memory limits that seem to be purely artificial (8GB, 16GB, 32GB, etc.) {{comment|(almost certainly out of market segmentation)}}
 
===Longer story / more background information===
A 32-bit machine implies memory addresses are 32-bit, as is the memory address bus to go along with them. It's more complex than that, but the net effect is still that you can ask for 2^32 bytes of memory at byte resolution, which technically allows you to access up to 4GB.
 
 
The 'but' you hear coming is that 4GB of address space doesn't mean 4GB of memory use.
 
 
 
====The device hole (32-bit setup)====
One of the reasons the limit actually lies lower is devices. The top of the 4GB memory space (usually directly under the 4GB position) is used to map devices.
 
If you have close to 4GB of memory, this means part of your memory is still not addressable by the CPU, and effectively missing.
The size of this hole depends on the actual devices, chipset, BIOS configuration, and more{{verify}}.
 
 
The BIOS settles the memory address map{{verify}}, and you can inspect the effective map {{comment|(Device Manager in windows, /proc/iomem in linux)}} in case you want to know whether it's hardware actively using the space {{comment|(The hungriest devices tend to be video cards - at the time having two 768MB nVidia 8800s in SLI was one of the worst cases)}} or whether your motherboard just doesn't support more than, say, 3GB at all.
Both these things can be the reason some people report seeing as little as 2.5GB out of the 4GB they plugged in.
 
 
This problem goes away once you run a 64-bit OS on a 64-bit processor -- though there were some earlier motherboards that still had old-style addressing leftovers and hence some issues.
 
 
Note that the subset of these issues caused purely by limited address space on 32-bit systems could also be alleviated, using PAE:
 
====PAE====
It is very typical to use virtual memory systems.
While the prime upside is probably the isolation of memory, the fact that a memory map is kept for each process also means that on 32-bit, each application has its ''own'' 4GB memory map without interfering with anything else (virtual mapping practice allowing).
 
Which means that while each process could use 4GB at the very best, if the OS could see more memory, it might map distinct 4GBs to each process so that ''collectively'' you can use more than 4GB (or just your full 4GB even with device holes).
 
 
Physical Address Extension is a memory mapping extension (not a hack, as some people think) that does roughly that.
PAE needs specific OS support, but ''doesn't'' need to break the 32-bit model as applications see it.
 
It allowed mapping 32-bit virtual memory into the 36-bit hardware address space, which allows for 64GB {{comment|(though most motherboards had a lower limit)}}
 
 
PAE implies some extra work on each memory operation, but because there's hardware support it only kicked a few percent off memory access speed.
 
 
All newish linux and windows versions support PAE, at least technically.
However:
* The CPU isn't the only thing that accesses memory. Although many descriptions I've read seem kludgey, I easily believe that any device driver that does DMA and is not aware of PAE may break things -- such drivers are broken in that they are not PAE-aware: they do not know that the 64-bit pointers used internally should be limited to 36-bit use.
* PAE was '''disabled''' in WinXP's SP2 to increase stability related to such issues, while server windowses are less likely to have problems since they tend to use more standard hardware and thereby drivers.
 
====Kernel/user split====
{{stub}}
The kernel/user split, mostly relevant to 32-bit OSes, refers to an OS-enforced formalism splitting the mappable process space between the kernel and each process.
 
 
It looks like windows by default gives 2GB to both, while (modern) linuces apparently split into 1GB kernel, 3GB application by default {{comment|(which is apparently rather tight on AGP and a few other things)}}.
 
(Note: '3GB for apps' means that any ''single'' process is limited to map 3GB. Multiple processes may sum up to whatever space you have free.)
 
 
In practice you may want to shift the split, particularly in Windows since almost everything that would want >2GB memory runs in user space - mostly databases.
{{comment|The exception is Terminal Services (Remote Desktop), that seems to be kernel space.}}
 
It seems that:
* linuxes tend to allow 1/3, 2/2 and 3/1,
* BSDs allow the split to be set to whatever you want{{verify}}.
* It seems{{verify}} windows can only shift its default 2/2 split to 1GB kernel / 3GB application, using the /3GB boot option {{comment|(the feature is somewhat confusingly called 4GT)}}, but it seems that windows applications are normally compiled with the 2/2 assumption and will not be helped unless coded for it. Exceptions seem to primarily include database servers.
* You may be able to work around it with a 4G/4G split patch, combined with PAE - with some overhead.
 
===See also===
* http://www.dansdata.com/askdan00015.htm
 
* http://linux-mm.org/HighMemory
* [http://www-128.ibm.com/developerworks/linux/library/l-memmod/ Explore the Linux memory mode]
 
* http://www.spack.org/wiki/LinuxRamLimits
 
* http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory
 
* http://kerneltrap.org/node/2450
 
* http://en.wikipedia.org/wiki/3_GB_barrier
 
<!--
==Motherboards==
 
Integrated graphics used to mean a chip on the motherboard (in or near the northbridge{{verify}}).
This often meant the cheapest option a motherboard manufacturer could find, which is nice and minimal if you have no needs beyond office stuff and a bit of web browsing, but not enough for any shiny graphics worth staring at.
It also ate some of your CPU, main memory.
 
More recent integrated graphics is actually inside the CPU, and seem to be more like entry-level graphics cards and can play more games than the motherboard integrated graphics could.
Also, the implications that there are very few options also means they are much clearer options.
 
 
Gamers will always want a more serious video card - typically even something costing a few dozen bucks will be nicer.
 
 
 
===PCI, PCI-Express===
 
 
PCI Express (PCIe) is an already-common standard designed to replace the older PCI, PCI-X, and AGP standards {{comment|(PCI-X is PCI-eXtended, which was a variant on PCI, largely seen on servers, before there was PCIe. AGP was mostly used for video cards)}}
 
PCI was enough for most low-bandwidth cards, but started creaking for some applications a while ago (video capture, gbit ethernet, and such).
 
 
PCIe means more bandwidth, and is less of a single shared bus and more of a point to point thing (and theoretically more of a switched network thing{{verify}}), and is also symmetric and full duplex (can do the speed in both directions, and at the same time).
 
 
: '''On PCIe speeds speeds and slots'''
 
The slot basically consists of a small chunk of power (and simple management bus stuff), a bit of plastic, and the rest being the data lanes. You can eyeball what sort of slot you have by the size of the lane part.
 
The common slots:
* x1 (250MB/sec on PCIe 1.x) already much faster than PCI, and fast enough for many things.
* x4 (500MB/sec on PCIe 1.x) used by some higher-speed devices (e.g. multi-port GBit controllers), some RAID controllers, and such
* x16 (4GB/sec on PCIe 1.x) is used by video cards, some RAID controllers, and such
* (x2, x8 and x32 exist, but are not seen very often)
 
You can always plug PCIe cards into a larger slots.
 
* Speeds can refer both to a slot (its size is largely dictated by its lanes{{verify}}), and the speed that it can do.
** Which isn't always the same. There are e.g. motherboards with x16 slots that only do x8 speeds. ...for example because x16 was faster than most CPU and memory bus speeds at the time of introduction, which would make true x16 a waste of your money.
 
 
PCIe specs actually mention gigatransfers/sec. Given byte lanes, and assuming [http://en.wikipedia.org/wiki/8b/10b_encoding 8b/10b] coding, this means dividing the GT/s figure by 10 to get MByte/s.
The speeds mentioned above are for PCIe 1, which can do 2.5 GT/s per lane.
For comparison:
* v1.x: 250 MByte/s/lane (2.5 GT/s/lane)
* v2.x: 500 MByte/s/lane (5 GT/s/lane)
* v3.0: 1 GByte/s/lane (8 GT/s/lane)
* v4.0: 2 GByte/s/lane (16 GT/s/lane)
 
Note that both device and motherboard need to support the higher PCIe variant to actually use these speeds.
-->
 
 
 
=Some understanding of memory hardware=
 
[https://people.freebsd.org/~lstewart/articles/cpumemory.pdf "What Every Programmer Should Know About Memory"] is a good overview of memory architectures, RAM types, and the reasons bandwidth and access speeds vary.
 
 
 
==RAM types==
 
'''DRAM''' - Dynamic RAM
: lower component count per cell than most (transistor+capacitor mainly), so high-density and cheaper
: yet capacitor leakage means this has to be refreshed regularly, meaning a DRAM controller, more complexity and higher latency than some
: (...which can be alleviated and is less of an issue when you have multiple chips)
: this or a variant is typical as main RAM, due to low cost per bit
 
 
'''SDRAM''' - Synchronous DRAM - is mostly a practical design consideration
: ...that of coordinating the DRAM via an external clock signal (previous DRAM was asynchronous, manipulating state as soon as lines changed)
: This allows the interface to that RAM to be a predictable state machine, which allows easier buffering, and easier interleaving of internal banks
: ...and thereby higher data rates (though not necessarily lower latency)
: SDR/DDR:
:: DDR doubled the busrate by widening the (minimum) units they read/write (double that of SDR), which they can do from a single DRAM bank{{verify}}
:: similarly, DDR2 is 4x larger units than SDR and DDR3 is 8x larger units than SDR
:: DDR4 uses the same width as DDR3, instead doubling the busrate by interleaving from banks
:: unrelated to latency, it's just that the bus frequency also increased over time.
 
 
'''Graphics RAM''' refers to varied specialized RAM designs aimed at graphics cards
: Earlier versions would e.g. allow reads and writes (almost) in parallel, making for lower-latency framebuffers
: "GDDR" is a somewhat specialized form of DDR SDRAM
 
 
 
 
'''SRAM''' - Static RAM
: Has a higher component count per cell (6 transistors) than e.g. DRAM
: Retains state as long as power is applied to the chip, no need for refresh, also making it a little lower-latency
: no external controller, so simpler to use
: e.g used in caches, due to speed, and acceptable cost for lower amounts
 
 
'''PSRAM''' - PseudoStatic RAM
: A tradeoff somewhere between SRAM and DRAM
: in that it's DRAM with built-in refresh, so functionally it's as standalone as SRAM, and slower, but you can have a bunch more of it for the same price (SRAM tends to come in far smaller sizes for that money)
: (yes, plain DRAM can also have built-in refresh, but that often points at a ''sleep'' mode that retains state without requiring an active DRAM controller)
 
 
 
 
<!--
'''Non-volatile RAM'''
 
The concept of Random Access Memory (RAM) '''only''' tells you that you can access any part of it with similar ease (contasted with e.g. tape storage, where more distance meant more time, so more storage meant more time).
 
Yet we tend to think about RAM as volatile, as entirely temporary scratchpad, only useful as an intermediate between storage and use.
This is perhaps because the simplest designs (and thereby cheapest per byte) have that property.
For example, DRAM loses its charge and has to be constantly and actively refreshed, DRAM and SRAM and many others lose their state once you remove power.
(There are also exceptions and inbetweeens, like DRAM that doesn't need its own controller and can be told to refresh itself in a low-power mode, acting a whole lot like SRAM).
 
 
Yet there are various designs that are both easily accessible ''and'' keep their state.
 
It's actually a gliding scale of various properties.
We may well call it NVM (non-volatile memory), when grouping a lot of them and don't yet care about further properties - like how often we may read or write or how difficult that is. Say, some variants of EEPROM aren't the easiest to deal with, and consider that Flash, now very common and quite convenient, is a development from EEPROM.
 
When we talk about NVRAM rather than NVM when we are often pointing at more specific designs,
often where we can fairly easily use it and it happens to stick around,
like in FRAM, MRAM, and PRAM, or nvSRAM or even BBSRAM.
 
 
FRAM - Ferroelectric RAM, which resembles DRAM but uses a ferroelectric material,
: easier to access than Flash
: seems to have a read limit rather than a write limit?, but that limit is also something like 1E14 and you are ''unlikely'' to use it so intensely to reach that any time soon.
: so it's great for things like constant logging, which would be terrible for Flash
https://electronics.stackexchange.com/questions/58297/whats-the-catch-with-fram
 
 
 
nvSRAM - SRAM and EEPROM stuck on the same chip.
: https://en.wikipedia.org/wiki/NvSRAM
 
 
BBSRAM - Battery Backed SRAM
: basically just SRAM ''alongside'' a lithium battery
: feels like cheating, but usefully so.
 
-->
 
 
 
===Memory stick types===
{{stub}}
 
 
'''ECC RAM'''
: can detect many (and correct some) hardware errors in RAM
: The rate of bit-flips is low, but they will happen. If your computations or data are very important to you, you want ECC.
: See also:
:: http://en.wikipedia.org/wiki/ECC_memory
:: {{search|DRAM Errors in the Wild: A Large-Scale Field Study}}
 
 
'''Registered RAM''' (sometimes '''buffered RAM''') basically places a buffer on the DRAM modules {{comment|(register as in [https://en.wikipedia.org/wiki/Hardware_register hardware register])}}
: offloads some electrical load from the main controller onto these buffers, making it easier for a design to stably connect ''more'' individual memory sticks/chips.
: ...at a small latency hit
: typical in servers, because they can accept more sticks
: Must be supported by the memory controller, which means it is a motherboard design choice to go for registered RAM or not
: pricier (more electronics, fewer units sold)
: because of this correlation with server use, most registered RAM is specifically registered ECC RAM
:: yet there is also unregistered ECC, and registered non-ECC, which can be good options on specific designs of simpler servers and beefy workstations.
: sometimes called RDIMM -- in the same context UDIMM is used to refer to unbuffered
: https://en.wikipedia.org/wiki/Registered_memory
 
'''FB-DIMM''', Fully Buffered DIMM
: same intent as registered RAM - more stable sticks on one controller
: the buffer is now ''between'' stick and controller [https://en.wikipedia.org/wiki/Fully_Buffered_DIMM#Technology] rather than on the stick
: physically different pinout/notching
 
 
'''SO-DIMM''' (Small Outline DIMM)
: Physically more compact. Used in laptops, some networking hardware, some Mini-ITX
 
 
EPP and XMP (Enhanced Performance Profile, Extreme Memory Profiles)
: basically, one-click overclocking for RAM, by storing overclocked timing profiles
: so you can configure faster timings (and V<sub>dimm</sub> and such) according to the modules, rather than by your own trial and error
: normally, memory timing is configured according to a table in the [https://en.wikipedia.org/wiki/Serial_presence_detect SPD], which are JEDEC-approved ratings and typically conservative.
: EPP and XMP basically mean running the modules as fast as they could go (and typically at higher voltage)
 
 
 
 
On pin count
: SO-DIMM tends to have a different pin count
: e.g. DDR3 has 240 pins, DDR3 SO-DIMM has 204
: e.g. DDR4 has 288 pins, DDR4 SO-DIMM has 260
: Registered RAM has the same pin count
: ECC RAM has the same pin count
 
 
In any case, the type of memory must be supported by the memory controller
: DDR2/3/4 - physically won't fit
: Note that while some controllers (e.g. those in CPUs) support two generations, a motherboard will typically have just one type of memory socket
: registered or not
: ECC or not
 
Historically, the RAM controller was a separate thing on the motherboard near the CPU, while nowadays it is commonly on the CPU itself.
 
==More on DRAM versus SRAM==
{{stub}}
 
<!--
'''Dynamic RAM (DRAM)''' cells are a transistor and capacitor, much simpler than various other types of RAM.
 
The transistor controls
: writes (set level on the data line, then raise cell access line long enough for charge/discharge to that level)
: and reads (raise the access line for discharge into the data line, which has something that senses whether there was charge).
 
That means reads are slowish, but more importantly, reads are destructive of the row it is in, and there is a mechanism to store it back.
 
 
Separately, capacitors slowly leak charge anyway (related to closeby cells, related to their bulk addressing, and note that the higher the memory density, the smaller the capacitor so the sooner this all happens), so DRAM only makes sense with refresh: when there is something going through reading every cell and writing it back, ''just'' to keep the state over time.
 
The DRAM controller will refresh each DRAM row within (typically) 64ms, and there are on the order of thousands to tens of thousands of rows in a DRAM chip.
 
 
Yes, this means you randomly incur some extra latency.
 
Larger chips effectively have longer refresh overhead.
 
Each chip is slower-than-ideal, which can be made irrelevant by having the same amount of RAM in more chips on a memory stick. (Seems to also be part of why servers often have more slots{{verify}})
 
 
It also means DRAM will require more power than most others, even when it's not being used.
 
 
With all these footnotes, DRAM seems clunky, so why use it?
 
Mainly because it's rather cheaper per bit (even with economy of scale in production),
and as mentioned, you can alleviate the performance part fairly easily.
 
 
 
 
The first thing you'ld compare DRAM to is often '''SRAM (Static RAM)''', or some variant of it.
 
SRAM cells are more complex per bit, but don't need refresh,
are fundamentally lower-latency than DRAM, and take less power when idle.
 
(with some variation; lower-speed SRAM can be low power, whereas at high speeds and heavy use its power can be comparable to DRAM)
 
 
The main downside is that due to their complexity, they are lower density, cost more silicon (and therefore money) per bit.
 
There are a lot of high-speed cases, or devices, where a little SRAM makes a lot of sense, like network switches,
and also L1, L2, and L3 caches in your computer.
 
 
SRAM is electrically easier to access (also means you need less of a separate controller),
so simple microcontrollers may prefer it, also because it's easier to embed on the same IC.
 
Since SRAM uses noticeably more silicon than DRAM per cell, SRAM is often under a few hundred kilobyte - in part because you'ld probably use SRAM for important bits, alongside DRAM for bulkier storage.
 
 
----
 
'''Pseudostatic RAM (PSRAM, a.k.a. PSDRAM)''' are ICs that contains both DRAM and a controller, so has DRAM speeds, but are as easy to use as SRAM, and a price somewhere inbetween.
 
 
There are even variants that are basically DRAM with an SRAM cache in front
so that well controlled access patterns can be quite fast.
 
----
 
More DRAM notes:
 
For a few reasons (including that there are a ''lot'' of bits in the address, to save dozens of pins as well as silicon on internal demultiplexing), DRAM is typically laid out as a grid, and the address is essentially sent in two parts, the row and the column, sent one after the other.
 
This is what RAS and CAS are about - the first is a strobe that signals the row address can be used, the second that the column can be.
 
And, because capacitors are not instant, there needs to be some time between RAS and CAS, and between CAS and data coming out. This, and other details (e.g. precharge) are a property of the particular hardware, and should be adhered to to be used reliably.
 
Setting these parameters by hand would be annoying, so on DDR DRAM sticks there is a small chip[https://en.wikipedia.org/wiki/Serial_presence_detect] that tells the BIOS the timing options.
 
----
 
DRAM is also so dense that it has led to some electrical issues, e.g. the [https://en.wikipedia.org/wiki/Row_hammer row hammer] exploit.
 
----
 
 
Because you still spend quite a bit of time on addressing, before the somewhat-faster readout of data,
a lot of DRAM systems do prefetch/burst (what you'd call readahead in disks).
 
That is, instead of fetching a cell, it fetches (burst_length*bus_width), with burst_length apparently linked to DDR type, but 64 bytes for DDR3 and DDR4. (also because that's a common CPU cache line size)
 
This is essentially a forced locality assumption, but it's relatively cheap and frequently useful.
 
 
 
 
 
"RAS Mode"
: lockstep mode, 1:1 ratio to DRAM clock
:: more reliable
 
: independent channel mode, a.k.a. performance mode, 2:1 to DRAM clock
:: more throughput
:: also allows more total DIMMs (if your motherboard is populated with them)
: mirror - seems to actually refer to memory mirroring.
 
Note this is about the channels, not the DRAM.
 
https://www.dell.com/support/article/nl/nl/nlbsdt1/sln155709/memory-modes-in-dual-processor-11th-generation-poweredge-servers?lang=en#Optimizer
 
https://software.intel.com/en-us/blogs/2014/07/11/independent-channel-vs-lockstep-mode-drive-you-memory-faster-or-safer
 
 
 
 
 
 
In PCs, the evolution from SDR SDRAM to DDR SDRAM to DDR2 SDRAM to DDR3 SDRAM is a fairly simple one.
 
SDR ():
* single pumped (one transfer per clocktick)
* 64-bit bus
* speed is  8 bytes per transfer * memory bus rate
 
DDR (1998):
* double pumped (two transfers per clocktick, using both the rising and falling edge)
* 64-bit bus
* speed is  8 bytes per transfer * 2 * memory bus rate
 
DDR2 (~2003):
* double pumped
* 64-bit bus {{verify}}
* effective bus to memory is clocked at twice the memory speed
* No latency reduction over DDR (at the same speed) {{verify}}
* speed is  8 bytes per transfer * 2 * 2 * memory bus rate
 
DDR3 (~2007):
* double pumped
* 64-bit bus {{verify}}
* effective bus to memory is clocked at four times the memory speed
* No latency reduction over DDR2 (at the same speed) {{verify}}
* speed is  8 bytes per transfer * 2 * 4 * memory bus rate
 
DDR4 (~2014)
* double pumped
 
DDR5 (~2020)
*
 
Each generation also lowers voltage and thereby power (per byte).
 
 
(Note: Quad pumping exists, but is only really used in CPUs)
 
 
The point of clocking the memory bus higher than the speed of individual memory cells
is that as long as you are accessing data from two distinctly accessed cells,
you can send both on the faster external (memory) bus. {{verify}}
 
It won't be twice, but depending on access patterns might sometimes get close{{verify}}.
 
 
Dual channel memory is different yet - it refers to using an additional 64-bit bus to memory in addition to the first 64 bits, so that you can theoretically transfer at twice the speed.
The effect this has on everyday usage depends a lot on what that use is, though.
It seems that even the average case is not a very noticeable improvement.
 
 
 
so four bits of data can be transferred per memory cell cycle. Thus, without changing the memory cells themselves, DDR2 can effectively operate at twice the data rate of DDR.
 
 
 
Within a type (within SDR, within DDR, within DDR2, etc.), the different speeds do not point to a different design. Like with CPUs, it just means that the memory will work under that speed. Cheap memory may fail if clocked even just a little higher, while much more tolerant memory also exists, which is interesting for overclockers.
 
Note that the bus speed a particular piece of memory will work under depends on how c
 
 
 
 
 
Transfers per clocktick:
* 1 for SDR/basic SDRAM
* 2 for DDR SDRAM
* 4 for DDR2 SDRAM
* 8 for DDR3 SDRAM
* DDR4
* DDR5
 
 
 
SDRAM was available as:
66MHz
100MHZ
133 MHz
 
DDR used to have some commonly used aliases, e.g.
alias    standard name  speed
PC-1600  DDR-200        100MHz, 200Mtransfers/s, peak 1.6GB/s
PC-2100  DDR-266        133MHz, 266Mtransfers/s, peak 2.1GB/s
PC-2700  DDR-333        166MHz, 333Mtransfers/s, peak 2.7GB/s
PC-3200  DDR-400        200MHz, 400Mtransfers/s, peak 3.2GB/s
 
As of right now (late 2008), DDR3 is not worth it money/performance-wise, but DDR2 is interesting over DDR.
 
PC2-3200  DDR2-400        100MHz, 400Mtransfers/s, peak 3.2GB/s
PC2-4200  DDR2-533        133MHz, 533Mtransfers/s, peak 4.2GB/s
PC2-5400  DDR2-667        166MHz, 667Mtransfers/s, peak 5.4GB/s
PC2-6400  DDR2-800        200MHz, 800Mtransfers/s, peak 6.4GB/s
 
-->
 
==On ECC==
<!--
 
Like disks, RAM has an error rate, for a few reasons [http://arstechnica.com/business/2009/10/dram-study-turns-assumptions-about-errors-upside-down/] [http://www.cnet.com/news/google-computer-memory-flakier-than-expected/] [http://www.zdnet.com/blog/storage/dram-error-rates-nightmare-on-dimm-street/638].
 
It's tiny, but it's there, and there's a fix via error correction methods.
These can typically fix the error if it's just one bit. {{comment|(Two bits at a time, which can be detected but not fixed, are rare unless you've got a faulty stick, or are overclocking it to the point of instability)}}
 
 
On servers there is more interest in not having a weak link that can flip some bits without you noticing,
and send a bad version of the data to disk, or generating them during lots of hard calculation.
So it's typically used in storage servers, clusters, and possibly everything important enough to
be housed in a server room, in particular when its admin want to sleep a little less nervously.
 
On workstations you may care less.
There is a decent chance errors occur in unused areas, in program code, in programs doing nothing of long-term consequence,
or data that is read but will not be written to disk. Programs may crash but the system may not. Video may merely glitch.
 
In computing clusters, it may be entirely viable (and a good idea anyway) to double-check results
and just redo the tiny part of a job that doesn't make sense.
 
 
For devices that store data you really care about, consider ECC. The tradeoffs are actually more complex,
in that there are other parts of the whole that can make mistakes for you, so this ECC is just about removing ''one'' weak link,
while you are still leaving others.
 
In theory, whenever you're ''not'' altering an authoritative store, ECC is less important.
 
 
 
Intel has a weird separation in that ECC support is ''disabled'' in consumer CPUs,
apparently to entice businesses to buy Xeons with their ECC combination.
 
If you're setting up a storage server at home,
or otherwise care about a few hundred dollar difference,
then an ECC-capable motherboard+Xeon+ECC RAM is a pricy combination,
so it's common enough to go AMD instead,
simply because it's more flexible in its combinations.
 
 
-->
<!--
 
 
 
 
-->
 
==Buffered/registered RAM==
<!--
This is mainly a detail of motherboard (and CPU) design.
 
Registered RAM places less electrical load on the controller than,
meaning you can stuff more slots/sticks on the same motherboard.
 
 
The buffer/register refers to the part stuck inbetween,
which also makes it slightly slower.
 
Buffered RAM is mostly interesting for servers that ''must'' have a lot of RAM.
 
http://en.wikipedia.org/wiki/Registered_memory
 
 
-->
 
==EPROM, EEPROM, and variants==
 
PROM is Programmable ROM
: can be written exactly once
 
EPROM is Erasable Programmable ROM.
: often implies UV-EPROM, erased with UV shone through a quartz window.
 
EEPROM's extra E means Electrically Erasable
: meaning erasing is now an electrical command, rather than UV exposure.
: early EEPROM read, wrote, and erased {{verify}} a single byte at a time. Modern EEPROM can work in larger chunks.
: you only get a limited amount of erases (much like Flash. Flash is arguably just an evolution of EEPROM)
 
 
<!--
 
EEPROMs tend to erase in chunks.
 
There is typically no separate erase, erase happens transparently on writes - it reads a page into its tiny RAM, erases, and writes the whole thing back.
 
This means it has a [[write hole]],
and erases faster than you may think.
 
That said, the erase count is relatively high for something you may not consider to be continuously alterable storage.
 
 
 
-->
 
==Flash memory (intro)==
{{stub}}
<!--
 
Flash is a type of EEPROM, and a refinement on previous 'plain' EEPROM.
The name came from marketing, helping to distinguish it as its own thing with its own properties.
 
Like EEPROM, Flash is non-volatile, erases somewhat slowly, and has a limited number of erase cycles.
 
 
One difference is making it erasable in chunks (smaller than erase-fully variants, larger than erase-bytes variants),
a tradeoff that helps speed, cost, and use for random-storage needs.
 
 
Simpler memory cards, and simpler USB sticks, have one flash chip and a simple controller, which is the cheapest setup and why they don't tend to break 10MB/s,
and don't have enough wear leveling to last very long.
 
SSDs go faster
partly because they parallelize to more chips (RAID-like layout), and
partly because of an extra layer of management that (in most practical use) hides Flash's relatively slow erase speed (by doing it at other times, not when it's needed, a plan that usually works but may not under heavy load).
 
-->
 
<!--
 
There are two types of Flash, NOR and NAND, named for the cells resembling (and working like) classical logic gates.
 
Very roughly,
NOR is faster but more expensive so has a few specialist uses,
NAND is denser, slower, and takes less power, is typically more useful and cheaper for bulk storage.
 
 
In terms of storage area, Flash is denser than platter - but that's only true if we don't count the size of IC packages.
In practice the physical overhead of both makes them comparable.
 
 
 
In comparison:
* Reading is three orders of magnitude slower than DRAM
* Writing is four orders of magnitude slower than DRAM
 
 
 
'''Flash data retention (active use)'''
 
Active use of cells wears the semiconductor, and lowers the ability to stably retain charge.
This is expressed in the amount of erases it will take - which differ between SLC, TLC, MLC.
 
It's a bad idea for SSDs to just use it until you can no longer read it, because that's just
storage failure. As such, they try to be pessimistic/conservative.
 
USB sticks and memory cards ''are'' often a bit tralala about it.
 
 
 
'''Flash data retention (idle shelf time)'''
 
State is kept as charge, so you may ask "how non-volatile is non-volatile?"
 
The short answer is that flash is not meant for archival purposes.
 
Flash producers often give a spec of ten years.
But that's a rated value, and with some assumptions.
Including extrapolation, as there are no long-term studies on NAND{{verify}}.
 
This also differs per type (TLC, MLC, SLC) because closer voltage levels imply
the same decay makes more difference.
 
 
There are some people that say just powering up flash will make its controller refresh the data.
AFAICT this is generally not true - you should ''never'' assume this on memory cards and USB sticks.
It ''may'' be true for some SSDs{{verify}}, but unless you know your model does, don't bet your data on it.
 
In other words, if you want a fresh copy of your data, read it all off, and write it back.
{{comment|(...just as you should be doing on platter disks, and are also not doing)}}.
 
-->
 
==PRAM==
 
<!--
https://en.wikipedia.org/wiki/Phase-change_memory
 
-->
 
=Flash memory=
 
==Memory card types==
{{notes}}
 
 
For different kinds of memory cards, see [[Common plugs and connectors#Memory_cards]]
 
 
===Secure Digital (SD, miniSD, microSD), and MMC details===
{{stub}}
 
 
'''Capacity types / families'''
 
* SD (now named SDSC, 'standard capacity', to distinguish it)
** size (somewhat artificially) limited to 1-4GB
 
* SDHC (high capacity), since approx 2006
** physically identical but conforming to a new standard that allows for higher capacity and speed.
** addressing limited to 32GB
 
* SDXC (eXtended-Capacity), since approx 2009
** successor to SDHC that allows for higher capacity and speed
** UHS was introduced since roughly then. Note that only ''some'' cards use UHS.
** addressing limited to 2TB
 
* Ultra-Capacity (SDUC), since approx 2018
** limited to 128TB
 
* SDIO
** allows more arbitrary communication, basically a way to plug in specific accessories ''on supporting hosts'' - not really an arbitrarily usable bus for consumers
** (supports devices like GPS, wired and wireless networking)
 
 
 
 
The above is partly about capacity, and partly about function.
It's also not entirely aligned with SD versions;
protocol-wise it's even more interesting,
particularly with the extra buses for the faster (UHS and Express) modes.

I think most people have lost track of the details by now.
 
 
 
 
 
 
'''Electrically'''
 
Power is 3.3V, though there are some lower-voltage details - in particular the LVDS being lower voltage (1.8V{{verify}}).
 
(MMC had 7 pins)
 
SD has 9 pins up until UHS-II
 
SD with UHS-II adds 8 pins for a total of 17 pins
: two more [https://en.wikipedia.org/wiki/Low-voltage_differential_signaling LVDS] pairs, and more power and ground
 
 
MicroSD has 8 pins <!--(which is one less than SD, but it's just one less ground pin)-->
 
MicroSD with UHS-II has 17 pins
 
 
 
 
'''Protocol (and DIY options)'''
 
Since there are a few card types and more transfer modes over the years,
supporting all the possible things a card can do is fairly involved.
 
Even detecting what kind of card is there is interesting.
You'd think this is part of negotiation, but for historical reasons
you need some fallback logic even in the initialisation commands.
 
 
Since you're talking to the flash controller, there is a minimal mode, namely to start talking SPI.
 
There are a handful of protocol variations, that are basically negotiated from the most basic one.
Any fancy device ''will'' want to do that for speed, but for DIY that choice of SPI is much simpler.
{{comment|(note there are some recent cards where SPI mode is optional, though{{verify}})}}
 
In SPI mode the pins are mostly just SPI's MOSI, MISO, SCLK, select, and ground, and Vcc.
 
Code-wise, you'll want to find a library. If you don't, you'll probably end up writing much of one anyway.
 
<!--
See e.g. https://electronics.stackexchange.com/questions/77417/what-is-the-correct-command-sequence-for-microsd-card-initialization-in-spi
 
{{comment|(Arduino's Sd2Card.cpp seems to follow that part of the spec well. Its type 1 and 2 refers to SDSC version 1.x and "2.0 or later". Larger capacity can be seen as part of "2 or later" or as their own thing{{verify}})}}
 
Frankly, you want to be using a library anyway,
because if you don't you'll probably end up writing much of one anyway.
 
Also, in DIY you generally won't get any of the faster modes, not without some planning.
 
 
 
MMC 1.0 ~ 3
 
MMC 4 to 4.3
 
MMC 4.4
 
 
SD version 1.0 ~ 1.01
 
SD version 1.1 ~ 2.0
: SDHC introduced in 2?
 
SD version 3
: SDXC introduced in 3?
 
 
https://www.sdcard.org/downloads/pls/index.html
 
http://uditagarwal.in/index.php/2018/03/17/understanding-sd-sdio-and-mmc-interface/
 
https://www.sdcard.org/downloads/pls/simplified_specs/archive/part1_301.pdf
 
http://elm-chan.org/docs/mmc/mmc_e.html
 
-->
<!--
 
'''Formatting'''
 
SD cards are often partitioned{{verify}},
typically MBR style, though are also seen partitionless using the entire block device for a filesystem.
 
Most tools will default to partition to a single FAT32 partition,
because that's best supported all around, but some things need more flexibility.
 
For example the typical setup for raspberry Pi has a mix of partitions, mostly for practical reasons.
 
 
Earlier SD cards often come formatted as FAT32 (and small ones occasionally FAT16 or FAT12 {{verify}}), larger and later variants as exFAT, for size reasons
 
There is no special status to this, it's just the likely default (and OS support) of the times in which they first appeared.
 
-->
 
====SD Speed rating====
 
 
 
'''Actual performance'''
 
 
There are '''two gotchas''' to speed ratings:
* due to the nature of flash, it will read faster than it will write.
: how much faster/slower depends, but it's easily a factor 2
: if marketers can get away with it, they will specify the read speed
: note that the differences vary, due to differences in controllers. E.g. external card readers tend to be cheap shit, though there are some examples of slow
 
* writes can be faster in short bursts.
: because you're actually talking to a storage controller, managing the flash
: You usually care about sustained average write instead
: And sometimes about the guaranteed speed, i.e. the minimum per second-or-so
 
 
 
[[File:SD speed labels.jpg|thumb|400px|right|This tells us it's a UHS-I card, it's video class 10, UHS class 1, and speed class 10]]
Marking-wise
* '''Speed class''' - looks like a circle with a number in it - one of 2, 4, 6, or 10
: (and a class 0, which doesn't specify performance so is meaningless)
: that figure is in MB/s
: apparently this was intended as a minimum sustained write speed, but practice proves not everyone keeps to this, so if specs look optimistic, they probably are. It seems to vary with honesty, so a good Class 6 card may well perform better than a bad Class 10 one.
: there is no larger-than-10, no matter how fast the card actually is
: those details (and the fact that these days most SD cards can sustain 10MB/s) mean this class system is no longer informative
 
* '''Video speed class''' - V with a number: V6, V10, V30, V60, or V90.
: again, it's just MB/s
: These were introduced because realtime, not-too-compressed video tends to want
:: perhaps 10MByte/s for standard definition
:: perhaps 30MByte/s for 1080p
:: perhaps 60MByte/s for 4K
:: perhaps 90MByte/s for 8K
: These are apparently required to be sustained speeds{{verify}}
 
* '''UHS speed class''' - looks like a U with a number in it (1, 2, or 3)
: 1 is 10MB/s
: 3 is 30MB/s
: ...so UHS speed class has very little to do with UHS version
 
* Packaging may try to stunt with speed, but tend to say "up to" (possibly in tiny print)
: For example, I have a card that says writes <small>up to</small> 80MB/s and reads <small>up to</small> 170MB/s, yet all the logos on it suggest it can't guarantee sustaining more than 30MB/s. Curious...
: so assume this is marketing bullshit in general
 
<!--
* Application class -
: A1: Random read  at 1500 [[IOPS]], random write at 500 IOPS
: A2: Random read  at 4000 IOPS, random write at 2000 IOPS
-->
 
 
For video, you probably want 10MB/s for standard definition,  30MB/s for 1080p,  60MB/s for 4K,  and 90MB/s for 8K
 
 
 
<!--
Some use a an '''x rating'''
 
Units of 150kByte/sec (like CDs did).
 
For ''some'' numeric idea, though:
* 13x is ''roughly'' equivalent to class 2
* 40x is ''roughly'' equivalent to class 6
* 66x is ''roughly'' equivalent to class 10
 
* 300x is ~45MB/s
* 666x is ~100MB/s
 
Apparently these are '''not required to be sustained speed ''or'' write speed'''{{verify}}, or at least practice proves not everyone keeps to this, so if it looks dubious, it probably is.
 
-->
 
 
 
 
{{zzz|
'''Bus speed:'''
 
Bus speed is how much the wiring can carry data.
Note this says nothing about whether the card actually will, so this is mostly unimportant
 
Classically there's
* Standard
: 12MB/s max
 
* High-speed - clocks SDSC and SDHC at double the rate
: 25MB/s max
 
 
* UHS-I
:: Introduced in version 3.01 (~2010, basically around when SDXC was introduced)
:: for SDHC and SDXC
:: one LVDS pair on the same row, ''bus speed'' specced at ~100MB/s max
 
* UHS-II
:: Introduced in version 4.0 (~2011)
:: for SDHC and SDXC?
:: an extra row of eight pins: two extra LVDS pairs, and more power and ground
:: ''bus speed'' specced at ~300MB/s max
 
* UHS-III is only part of SDUC{{verify}}
:: introduced in version 6.0
:: ''bus speed'' specced at ~600MB/s max
:: Also introduced "Video Speed Class" rating
: Express, which are primarily about extra speed --
 
* SD Express (introduced in version 7.0)
:: ''bus speed'' specced at ~900MB/s max
 
I'm still confused about how SD Express and UHS-III relate
}}
 
====Latency====
<!--
Read latency is often in the 1ms to 2ms range (with no hard correlation to speed class).
 
 
Write latency is not constant, because you're actually talking to a storage controller
that typically buffers your data, and is managing flash blocks.
 
Each card may act differently, and there is no direct relation to speed class.
 
Writing ~1K blocks
: you can assume at least 30ms on a longer term average
:: usually many smaller, and one much longer one every so often
: on a shorter term can take anywhere from 15ms to 200ms
:: or more, but variants of the specs say ~200, 250ms, 500ms max{{verify}}
: with some outliers around 4ms and 700ms
 
The same cards will tend to do better average latency for larger write sizes, up to some amount.
 
 
As such, things like fast continuous sensor logging
will have to be able to buffer ~0.5 second of sampling,
and sample parallel to the writing.
 
 
 
-->
<!--
 
https://jitter.company/blog/2019/07/31/microsd-performance-on-memory-constrained-devices/
-->
 
==On fake flash==
{{stub}}
 
Fake flash refers to a scam where a card's controller reports a larger size than the actual storage present.
 
 
These seem to come in roughly two variants:
: addressing storage that isn't there will fail,
: or it will wrap back on itself and write in existing area.
 
...which isn't an important distinction, in that the result is just that it appears to be broken.
It will seem to work for a little while, and in both cases it will corrupt later.
 
 
There are some tools to detect fake flash. You can e.g. read out what flash memory chips are in there and check whether that adds up.
Scammers don't go so far as to fake this.
 
But the more thorough check is a write-and-verify test, see below.
 
==Memory card health==
{{stub}}
 
While memory cards and USB sticks are flash memory much like SSDs,
most lack all wear leveling and health introspection.
 
So you should assume that they will fail completely, without warning.
 
 
At lower level, even if they are still completely readable (and that's not a guarantee),
filesystems are not made to deal with write failure, so you may need
special tools and/or a technical friend for recovery.
 
 
You can check whether it's still readable (non-destructive) with a test that reads all of the disk surface {{comment|(for chkdsk it's the 'scan for and attempt recovery of bad sectors' checkbox)}}
 
The only real test of whether it's fully writable is to write to all of it (necessarily destructive).
But this only proves that it hasn't failed already, not that it won't soon.
 
 
One useful tool is {{search|H2testw}},
which creates a file in free space (if the card is empty, that is almost all of it).

It will also tell you the ''actual'' average write and read speed, not the potential lie on the front.

And it is implicitly a fake flash test.
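
The idea behind such a write-and-verify test is simple enough to sketch in C: write a reproducible pseudo-random pattern to (most of) the free space, read it back, and compare. The path and size below are placeholders, and note that for the readback to really hit the card rather than the OS's page cache you'd want to remount the card (or use O_DIRECT) between the two passes:

<pre>
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Writes NUM_MB megabytes of a reproducible pseudo-random pattern to a file
   on the card, then reads it back and compares.
   TEST_PATH and NUM_MB are placeholders - point it at the card, and size it
   to (most of) the free space for a meaningful test. */
#define TEST_PATH "/mnt/sdcard/fill.bin"
#define NUM_MB    64
#define MB        (1024 * 1024)

static void fill_block(uint8_t *buf, uint32_t seed) {
    uint32_t x = seed * 2654435761u + 1;       /* cheap reproducible pattern */
    for (size_t i = 0; i < MB; i++) {
        x = x * 1664525u + 1013904223u;        /* LCG step */
        buf[i] = (uint8_t)(x >> 24);
    }
}

int main(void) {
    static uint8_t wbuf[MB], rbuf[MB];

    FILE *f = fopen(TEST_PATH, "wb");
    if (!f) { perror("open for write"); return 1; }
    for (uint32_t mb = 0; mb < NUM_MB; mb++) {
        fill_block(wbuf, mb);
        if (fwrite(wbuf, 1, MB, f) != MB) { perror("write"); return 1; }
    }
    fclose(f);

    /* Ideally remount the card (or use O_DIRECT) here, so the readback
       actually exercises the card rather than the OS's page cache. */

    f = fopen(TEST_PATH, "rb");
    if (!f) { perror("open for read"); return 1; }
    for (uint32_t mb = 0; mb < NUM_MB; mb++) {
        fill_block(wbuf, mb);
        if (fread(rbuf, 1, MB, f) != MB) { perror("read"); return 1; }
        if (memcmp(wbuf, rbuf, MB) != 0) {
            printf("MISMATCH in megabyte %u - fake or failing flash\n", (unsigned)mb);
            return 1;
        }
    }
    fclose(f);
    printf("all %u MB verified OK\n", (unsigned)NUM_MB);
    return 0;
}
</pre>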
 
==What's stored==
<!--
 
Embedded flash will often contain a bootloader, and store image types accordingly.
 
e.g. on linux you'll see
(e.g. uBoot, looks for uImage, which itself is mostly just a wrapper, and created by mkimage)
SquashFS (for rootfs), JFFS2 (for storage), and others.
 
http://www.isysop.com/unpacking-and-repacking-u-boot-uimage-files/
 
 
And some special cases, such as that SPIFFS was used on [[ESP]] [https://github.com/pellepl/spiffs] [https://tttapa.github.io/ESP8266/Chap11%20-%20SPIFFS.html], later switched to LittleFS[https://arduino-esp8266.readthedocs.io/en/latest/filesystem.html].
 
 
 
Portable USB things will often be FAT32, NTFS, or exFAT
 
They are often unpartitioned.
 
You can partition it, but windows will only detect the first partition, and built-in windows tools won't be the one to create such a partitioning. (this seems to have changed in Win10)
 
It seems this is behaviour tied to it being marked as removable or not. But this is not recommended as it'll buffer a lot more and leave it much more likely to leave the filesystem in a bad state.
 
 
Linux will often treat it as any other disk.
 
 
-->
 
 
=History=
==Core memory==
 
<!--
 
'''ferrite-core memory''', a.k.a. '''magnetic-core memory''' and more often just called '''core memory''',
is the read-write variant, where you magnetize each donut.
 
https://en.wikipedia.org/wiki/Magnetic-core_memory
 
 
'''rope-core memory''' / '''core rope memory''' is the simpler, non-volatile and read-only form,
where the way you put wires through cores codes what you store.
 
https://en.wikipedia.org/wiki/Core_rope_memory
 
 
 
 
The Apollo program guidance computers used both - mostly rope core memory for ROM, and some ferrite-core for RAM.
In a big chunky ,
and with redundancy to not be bothered by [[cosmic rays]] so much.
 
 
Core memory was labour-intensive and therefore expensive, but from the 50s to 70s (when semiconductor memory became a thing)
the most viable way to get on the order of kilobytes of ROM or RAM.
-->
 
 
 
 
[[Category:Glossary]]
[[Category:Hardware]]
