===Page faults===

{{zzz|For context:
Consider that in a system with a Virtual Memory Manager (VMM),
applications only ever deal in {{comment|(what in the grander scheme turns out to be)}} virtual addresses;
it is the VMM's implementation that does the translation to real backing storage.

So on any memory access, one of the tasks is checking that the access makes sense. For example, it may be that
* the page is not known
* the page is known, but not considered accessible
* the page is known, considered accessible, and in RAM
* the page is known, considered accessible, but cannot be accessed in the lightest sort of pass-through-to-RAM way.}}

A '''page fault''', widely speaking, means "instead of direct access, the kernel needs to decide what to do now" - that last case {{comment|(some of the others have their own names)}}.

That signalling is called a page fault {{comment|(Microsoft also uses the term 'hard fault')}}.

Note that it is a ''signal'' (caught by the OS kernel), and called a 'fault' only for historical, almost-electronic-level design reasons.

A page fault can still mean one of multiple things.
Most (though not all) ground is covered by the following cases:
'''Minor page fault''', a.k.a. '''soft page fault'''
: the page is actually in RAM, but not currently marked in the MMU page table (often due to its limited size{{verify}})
: resolved by the kernel updating the MMU with the knowledge it needs, ''then'' probably allowing the access
:: no memory needs to be moved around
:: little extra latency
: (can happen around shared memory, or around memory that has been unmapped from processes but that there was no cause to delete just yet - which is one way to implement a page cache)
: (the sketch after these three cases shows minor faults being counted as memory is first touched)
'''Major page fault''', a.k.a. '''hard page fault'''
: memory is mapped, but not currently in RAM
: i.e. mapping on request / loading on demand -- which is how you can overcommit and pretend there is more memory than there is (quite sensible where demand rarely happens all at once)
: resolved by the kernel finding free RAM - which it may first have to make by swapping out another page - and loading the content into it
:: adds noticeable latency, namely that of your backing storage
:: the swapping-out is sort of a "fill one hole by digging another" approach, yet this only becomes a real problem (thrashing) when demand is higher than physical RAM
'''Invalid page fault'''
: memory isn't mapped, and there cannot be memory backing it
: resolved by the kernel raising a [[segmentation fault]] or [[bus error]] signal, which typically terminates the process
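To see the minor/major distinction in practice, here is a minimal C sketch (assuming Linux; getrusage(), mmap(), and the ru_minflt/ru_majflt counters are real interfaces, the sizes are arbitrary). Mapping anonymous memory only promises it; each first touch of a page then shows up as a minor fault.

 #include <stdio.h>
 #include <string.h>
 #include <sys/mman.h>
 #include <sys/resource.h>
 
 /* print this process's page fault counters so far */
 static void report(const char *when) {
     struct rusage ru;
     getrusage(RUSAGE_SELF, &ru);
     printf("%-12s minor faults: %ld, major faults: %ld\n",
            when, ru.ru_minflt, ru.ru_majflt);
 }
 
 int main(void) {
     size_t len = 64UL * 1024 * 1024;   /* 64 MB, arbitrary */
     char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     if (p == MAP_FAILED)
         return 1;
 
     report("after mmap");  /* the mapping alone faults (almost) nothing in */
     memset(p, 1, len);     /* first touch: roughly one minor fault per page */
     report("after touch");
 
     munmap(p, len);
     return 0;
 }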
Reasons and responses include:

* '''minor page fault''' cases seem to include:{{verify}}
** the MMU was not yet aware that the page was accessible - the kernel informs it that it is, then allows the access
** writing to a copy-on-write zone - the kernel copies the page, then allows the access
** writing to a page that the allocator promised but had not yet actually backed - the kernel allocates it, then allows the access
* a [[memory mapped IO|mapped file]] - the kernel reads in the requested data, then allows the access
* '''major page fault''' refers to:
** the page was swapped out - the kernel swaps it back in, then allows the access
* an '''invalid page fault''' is basically
** a [[segmentation fault]] - the kernel sends SIGSEGV (the default SIGSEGV handler kills the process; the sketch below shows catching it instead)
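A minimal sketch of that last case (assuming Linux/POSIX; sigaction() and siginfo_t are real interfaces; note that fprintf() is not strictly async-signal-safe, which is fine for a demo but not for production):

 #include <signal.h>
 #include <stdio.h>
 #include <unistd.h>
 
 /* called by the kernel when our invalid access becomes a SIGSEGV */
 static void on_segv(int sig, siginfo_t *info, void *ctx) {
     (void)sig; (void)ctx;
     fprintf(stderr, "invalid access at address %p\n", info->si_addr);
     _exit(1);   /* async-signal-safe way to bail out */
 }
 
 int main(void) {
     struct sigaction sa = {0};
     sa.sa_sigaction = on_segv;
     sa.sa_flags = SA_SIGINFO;
     sigemptyset(&sa.sa_mask);
     sigaction(SIGSEGV, &sa, NULL);
 
     int *bad = (int *)0x10;   /* not mapped, so this is an invalid page fault */
     *bad = 42;
     return 0;
 }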
Note that most of these are not errors.
In the case of [[memory mapped IO]], this is the designed behaviour.

Minor page faults will often happen regularly, because they include mechanisms that are cheap, save memory, and thereby postpone major page faults.

Major page faults ideally happen as little as possible, because the memory access is delayed by disk IO. One way to see which accesses ''would'' major-fault is to ask how much of a mapping is currently resident, as in the sketch below.
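A minimal sketch of inspecting residency (assuming Linux; mincore() is a real interface that reports, per page of a mapping, whether it is currently in RAM - i.e. a page cache hit versus a future major fault):

 #include <fcntl.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
 
 int main(int argc, char **argv) {
     if (argc < 2) {
         fprintf(stderr, "usage: %s file\n", argv[0]);
         return 1;
     }
 
     int fd = open(argv[1], O_RDONLY);
     struct stat st;
     if (fd < 0 || fstat(fd, &st) < 0 || st.st_size == 0)
         return 1;
 
     char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
     if (p == MAP_FAILED)
         return 1;
 
     long pagesize = sysconf(_SC_PAGESIZE);
     size_t pages = ((size_t)st.st_size + pagesize - 1) / pagesize;
     unsigned char *vec = malloc(pages);   /* one residency byte per page */
     if (vec != NULL && mincore(p, st.st_size, vec) == 0) {
         size_t resident = 0;
         for (size_t i = 0; i < pages; i++)
             resident += vec[i] & 1;   /* bit 0 means "resident in RAM" */
         printf("%zu of %zu pages resident\n", resident, pages);
     }
     return 0;
 }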
See also
* http://en.wikipedia.org/wiki/Paging
* http://en.wikipedia.org/wiki/Page_fault
* http://en.wikipedia.org/wiki/Demand_paging
===Copy on write===

===Glossary===
===On memory scarcity===

====oom_kill====
oom_kill is Linux kernel code that starts killing processes when there is enough memory scarcity that memory allocations cannot happen within reasonable time - as this is a good indication that things have gotten to the point of thrashing.
Killing processes sounds like a poor solution.

But consider that an OS can deal with completely running out of memory in roughly three ways:

- deny all memory allocations until the scarcity stops.
  - This isn't very useful, because
    - it will affect every program until the scarcity stops
    - if the cause is one flaky program - and it usually is just one - then the scarcity may not stop
    - programs that do not actually check every memory allocation will probably crash (the sketch after this list shows the check that is so often skipped)
    - programs that do such checks well may have no option but to stop completely (or maybe pause)
  - So in the best case, random applications stop doing useful things - probably crash - and in the worst case your system crashes.
- delay memory allocations until they can be satisfied.
  - This isn't very useful either, because
    - it pauses all programs that need memory (they cannot be scheduled until we can give them the memory they ask for) until the scarcity stops
    - again, there is often no reason for the scarcity to stop
  - so this typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't actually do anything")
- kill the misbehaving application to end the memory scarcity.
  - This makes a bunch of assumptions that have to be true -- but it lets the system recover:
    - it assumes there is a single haywire process (misbehaving or misconfigured, e.g. (pre-)allocating more memory than you have) - not always true: two programs each allocating most of RAM would be fine individually, and need an admin to configure them better
    - it assumes the system has had enough memory for normal operation up to now
    - it can misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)
    - the victim is usually the process with the most allocated memory, though oom_kill logic tries to be smarter than that
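To illustrate the first option: denying allocations only helps programs that actually handle the failure. A minimal C sketch (the chunk size is arbitrary; note that under Linux's default overcommit heuristic malloc() may keep succeeding and the process may instead be oom_killed on first ''use'', while with vm.overcommit_memory=2 you are more likely to actually see NULL):

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 
 int main(void) {
     size_t chunk = 100UL * 1024 * 1024;   /* 100 MB per allocation, arbitrary */
     size_t total = 0;
     for (;;) {
         void *p = malloc(chunk);
         if (p == NULL) {                  /* the check many programs skip */
             fprintf(stderr, "malloc failed after %zu MB\n", total >> 20);
             return 1;
         }
         memset(p, 1, chunk);              /* touch it, forcing real backing */
         total += chunk;
         printf("allocated %zu MB\n", total >> 20);
     }
 }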
Keep in mind that:

- oom_kill is sort of a worst-case fallback
  - generally:
    - if you feel the need to rely on the OOM killer, don't.
    - if you feel the wish to overcommit, don't.
- oom_kill is meant to deal with pathological cases of misbehaviour
  - but even then it might pick some random daemon rather than the real offender, because in some cases the real offender is hard to define (you can nudge the selection per process, as in the sketch below)
  - note that you can isolate likely offenders via cgroups now (which also means that swapping happens per cgroup)
    - and apparently oom_kill is now cgroups-aware
- oom_kill does not always save you.
  - It seems that if your system is thrashing heavily already, it may not be able to act fast enough.
  - (and it may go overboard once things do catch up)
- You may wish to disable oom_kill while you are developing
  - ...or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
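Victim selection can be nudged per process through /proc/&lt;pid&gt;/oom_score_adj (a real interface, range -1000 to 1000; raising it is unprivileged, lowering it below 0 needs CAP_SYS_RESOURCE). A minimal C sketch that volunteers the current process as a preferred victim (the value 500 is arbitrary):

 #include <stdio.h>
 
 int main(void) {
     FILE *f = fopen("/proc/self/oom_score_adj", "w");
     if (f == NULL) {
         perror("fopen /proc/self/oom_score_adj");
         return 1;
     }
     fputs("500\n", f);   /* positive values make this process a likelier victim */
     fclose(f);
 
     /* ... memory-hungry work would go here ... */
     return 0;
 }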
- If you would rather get a reboot than have oom_kill pick victims, you can set the following sysctls:

 vm.panic_on_oom=1

and a nonzero kernel.panic (the number of seconds to show the panic message before rebooting):

 kernel.panic=10
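These can also be set at runtime through their /proc/sys counterparts - a small C sketch (assumes Linux and root; the /proc/sys paths correspond directly to the dotted sysctl names):

 #include <stdio.h>
 
 /* write a value to a sysctl via its /proc/sys path */
 static int write_sysctl(const char *path, const char *value) {
     FILE *f = fopen(path, "w");
     if (f == NULL) {
         perror(path);
         return -1;
     }
     fputs(value, f);
     fclose(f);
     return 0;
 }
 
 int main(void) {
     write_sysctl("/proc/sys/vm/panic_on_oom", "1\n");   /* panic instead of oom_kill */
     write_sysctl("/proc/sys/kernel/panic", "10\n");     /* reboot 10 s after a panic */
     return 0;
 }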