-->
===Copy on write===
<!--
Copy on write allows multiple ''logical''ly distinct chunks to be backed by the same ''physical'' chunk.

It's something you can do ''transparently'' when both allocation and access involve a layer of indirection that knows about this.
: ...because that's the only way it won't be easily subverted. In particular, it means that a write won't accidentally alter data underlying multiple copies.
: when a write happens, it will be un-shared, i.e. get its own physical copy (hence the name copy-on-write), in a just-in-time way
Copy-on-write is often banking on the idea that writes may be rare for a lot of data,
meaning you can transparently allocate a lot less storage,
potentially
: saving storage cost,
: saving some up-front work (e.g. a linux fork() doesn't have to copy all allocated memory)
: saving time when making the original copy
: having more fast storage (e.g. RAM) to use in general
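The fork() case can be seen directly: after a fork, parent and child each logically own the whole address space, yet writes stay private to whoever made them. A minimal Linux-only sketch (using Python's os.fork purely for illustration):

```python
import os

# Allocated before the fork; afterwards both processes logically own it,
# while the kernel can keep backing it with the same physical pages.
data = bytearray(b"shared")

pid = os.fork()
if pid == 0:
    # Child: this write triggers the copy-on-write - the child gets its own
    # physical page, and the parent's view is unaffected.
    data[0:6] = b"child!"
    os._exit(0)

os.waitpid(pid, 0)
assert data == bytearray(b"shared")  # parent still sees the original contents
```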
In linux there's even a daemon (ksmd[https://www.kernel.org/doc/Documentation/vm/ksm.txt][https://www.kernel.org/doc/html/v4.19/admin-guide/mm/ksm.html], initially made for VMs)
that goes looking through program memory (where the program has set MADV_MERGEABLE)
for identical pages, and merges them copy-on-write style.
This
* applies to RAM, because processes only ever see [[virtual memory]] addresses anyway
:: see e.g. linux [[fork()]] making a copy-on-write memory space
* applies to databases, because they do their own storage management anyway
* applies to filesystems, because they do their own storage management anyway
:: see e.g. LVM snapshots, ZFS snapshots
In these examples there is barely any way to subvert it, even if you wanted to.
Software examples can get more interesting:
* Qt uses copy-on-write, which affects the need for locking
* C++98's string class was designed to allow implementations to back it with copy-on-write, but this was messy enough that C++11 dropped it
'''Other meanings'''

ZFS's writes have been described as copy-on-write, but this has a different meaning.
There it means 'when writing to a block, we always first allocate and write a new block containing a copy of the data plus the requested alterations',
and ensure the new block is written and becomes current ''before'' retiring the old one.

While ''technically'' this is the same "share backing data for a while, until you can't",
that time is intentionally very short,
and the entire thing is ''only'' really there to avoid [[write hole]] problems (because yes, this is slower than direct alteration).
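That write pattern can be sketched with a toy in-memory 'block store' (hypothetical names and structure, not ZFS's actual implementation):

```python
# Toy sketch of a ZFS-style copy-on-write update: never modify a live block
# in place; write a new block, then switch the pointer, then free the old
# one - so a crash mid-write leaves the old block intact, avoiding
# write-hole style corruption.

blocks = {}    # block id -> contents
current = {}   # logical name -> id of its current block
next_id = 0

def alloc(contents):
    global next_id
    next_id += 1
    blocks[next_id] = contents
    return next_id

def cow_write(name, new_contents):
    old = current.get(name)
    new = alloc(new_contents)   # 1. write the full new version elsewhere
    current[name] = new         # 2. atomically make it current
    if old is not None:
        del blocks[old]         # 3. only then retire the old block

current["file"] = alloc(b"version 1")
cow_write("file", b"version 2")
assert blocks[current["file"]] == b"version 2"
assert len(blocks) == 1   # old block was retired after the switch
```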
The concept appears in programming sometimes.
For example, copy-on-write ''strings'' exist, the idea being that you can share backing and save some space.
It turns out they are often barely worth it,
and they make correctness and concurrency a lot harder (consider e.g. the effect on ongoing iterators).
C++, for example, effectively outlawed them after a while.
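A copy-on-write string can be sketched like this (a hypothetical toy class, just to show where the mechanism - and the complexity - lives):

```python
class CowStr:
    """Toy copy-on-write mutable string: copies share one backing list
    until the first write. A real implementation would also need
    reference counting and, under threads, locking - part of why C++11
    effectively disallowed COW std::string."""

    def __init__(self, text):
        self._backing = list(text)
        self._shared = False

    def copy(self):
        other = CowStr.__new__(CowStr)
        other._backing = self._backing        # share, don't copy
        other._shared = self._shared = True
        return other

    def __setitem__(self, i, ch):
        if self._shared:                      # un-share just in time
            self._backing = list(self._backing)
            self._shared = False
        self._backing[i] = ch

    def __str__(self):
        return "".join(self._backing)

a = CowStr("hello")
b = a.copy()   # cheap: no character data copied yet
b[0] = "j"     # first write gives b its own private backing
assert str(a) == "hello" and str(b) == "jello"
```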
-->
===Glossary===
Revision as of 13:26, 14 July 2023
'Virtual memory' ended up doing a number of different things,
which for the most part can be explained separately.
===On memory scarcity===

====oom_kill====
oom_kill is linux kernel code that starts killing processes when there is enough memory scarcity that memory allocations cannot happen within reasonable time - a good indication that it's gotten to the point that we are thrashing.
Killing processes sounds like a poor solution.
But consider that an OS can deal with completely running out of memory in roughly three ways:

* deny all memory allocations until the scarcity stops.
:: This isn't very useful, because
::* it will affect every program until scarcity stops
::* if the cause is one flaky program - and it usually is just one - then the scarcity may not stop
::* programs that do not actually check every memory allocation will probably crash
::* programs that do such checks well may have no option but to stop completely (or maybe pause)
:: So in the best case, random applications stop doing useful things - probably crash - and in the worst case your system crashes.

* delay memory allocations until they can be satisfied.
:: This isn't very useful, because
::* it pauses all programs that need memory (they cannot be scheduled until we can give them the memory they ask for) until scarcity stops
::* again, there is often no reason for this scarcity to stop
::* so it typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't actually do anything")

* kill the misbehaving application to end the memory scarcity.
:: This makes a bunch of assumptions that have to be true - but it lets the system recover
::* assumes there is a single misbehaving process (not always true - e.g. two programs that would individually be fine may together allocate most of RAM, which needs an admin to configure them better)
::* ...usually the process with the most allocated memory, though oom_kill logic tries to be smarter than that
::* assumes that the system has had enough memory for normal operation up to now, and that there is probably one haywire process (misbehaving or misconfigured, e.g. (pre-)allocating more memory than you have)
::* this can misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)
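The "kill the biggest consumer, but try to be smarter than that" idea can be sketched as a score function. This is a toy analogue of the kernel's badness score; the field names and weighting are made up for illustration, not the kernel's actual formula:

```python
# Toy analogue of oom_kill victim selection: score processes mostly by
# memory use, with an admin-tunable adjustment (in the spirit of
# oom_score_adj, which ranges -1000..1000).

def badness(proc):
    score = proc["rss_pages"]
    score += score * proc.get("adj", 0) // 1000  # adj=-1000 fully protects
    return max(score, 0)

def pick_victim(procs):
    return max(procs, key=badness)["pid"]

procs = [
    {"pid": 1, "rss_pages": 200},
    {"pid": 2, "rss_pages": 900_000},                # likely haywire process
    {"pid": 3, "rss_pages": 500_000, "adj": -1000},  # protected daemon
]
assert pick_victim(procs) == 2
```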
Keep in mind that
* oom_kill is sort of a worst-case fallback
* generally,
:: if you feel the need to rely on the OOM killer, don't.
:: if you feel the wish to overcommit, don't.
* oom_kill is meant to deal with pathological cases of misbehaviour
:: ...but even then it might pick some random daemon rather than the real offender, because in some cases the real offender is hard to define.
:: Tweak likely offenders, tweak your system.
* note that you can isolate likely offenders via cgroups now
:: and apparently oom_kill is now cgroups-aware
* oom_kill does not always save you.
:: It seems that if your system is thrashing heavily already, it may not be able to act fast enough
:: (and it may possibly go overboard once things do catch up)
* you may wish to disable oom_kill while you are developing
:: ...or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
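Per-process tuning is done via /proc/&lt;pid&gt;/oom_score_adj (range -1000..1000). A Linux-only sketch of volunteering the current process as a preferred victim:

```python
# Linux-only: raise our own oom_score_adj so the OOM killer prefers us.
# Raising the value needs no special privileges; lowering it below its
# current value requires CAP_SYS_RESOURCE.
with open("/proc/self/oom_score_adj", "w") as f:
    f.write("500")

with open("/proc/self/oom_score_adj") as f:
    assert f.read().strip() == "500"
```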
If you don't have oom_kill, you may still be able to get a reboot instead, by setting the following sysctls:
 vm.panic_on_oom=1
and a nonzero kernel.panic (seconds to show the message before rebooting):
 kernel.panic=10
===See also===