Virtual memory



-->
===Page faults===
<!--
{{zzz|For context:
Consider that in a system with a Virtual Memory Manager (VMM),
applications only ever deal in {{comment|(what in the grander scheme turns out to be)}} virtual addresses;
it is the VMM system's implementation that implies/does translation to real backing storage.
So when doing any memory access, one of the tasks is making sure this access makes sense. For example, it may be that
* the page is not known
* the page is known, but not considered accessible
* the page is known, considered accessible, and in RAM
* the page is known, considered accessible, but cannot be accessed in the lightest sort of pass-through-to-RAM way.}}
A '''page fault''', widely speaking, means "instead of direct access, the kernel needs to decide what to do now" - that last case {{comment|(some of the others have their own names)}}.
That signalling is called a page fault {{comment|(Microsoft also uses the term 'hard fault')}}.
Note that it's a ''signal'' (caught by the OS kernel), and called a 'fault' only for historical, almost-electronic-level design reasons.
A page fault can still mean one of multiple things.
Most (but not all) ground is covered by the following cases:
'''Minor page fault''', a.k.a. '''soft page fault'''
: Page is actually in RAM, but not currently marked in the MMU page table (often due to its limited size{{verify}})
: resolved by the kernel updating the MMU with the knowledge it needs, ''then'' probably allowing the access.
:: No memory needs to be moved around.
:: little extra latency
: (can happen around shared memory, or around memory that has been unmapped from processes but there had been no cause to delete it just yet - which is one way to implement a page cache)
'''Major page fault''', a.k.a. '''hard page fault'''
: memory is mapped, but not currently in RAM
: i.e. mapping on request, or loading on demand -- which is how you can do overcommit and pretend there is more memory (which is quite sensible where demand rarely happens)
: resolved by the kernel finding free RAM (which may be made by swapping out another page), and loading the content there.
:: adds noticeable latency, namely that of your backing storage
:: the swapping-out is sort of a "fill one hole by digging another" approach, yet this is only a real problem (thrashing) when demand is higher than physical RAM
'''Invalid page fault'''
: memory isn't mapped, and there cannot be memory backing it
: resolved by the kernel raising a [[segmentation fault]] or [[bus error]] signal, which terminates the process
DEDUPE WITH ABOVE
A page fault also occurs when the page is mapped but not currently in main memory (often meaning swapped to disk),
or does not currently have the backing memory mapped{{verify}}.
Depending on case, these are typically resolved either by
* mapping the region and loading the content.
: which makes the specific memory access significantly slower than usual, but otherwise fine
* terminating the process
: when the kernel cannot actually fetch the content
Reasons and responses include:
* '''minor page fault''' seems to include:{{verify}}
** MMU was not aware that the page was accessible - kernel informs it that it is, then allows access
** writing to a copy-on-write memory zone - kernel copies the page, then allows access
** writing to a page that was promised by the allocator but not yet actually backed - kernel allocates, then allows access
* '''major page fault''' refers to:
** swapped out - kernel swaps it back in, then allows access
** accessing a page of a mapped file not yet read in - kernel reads in the requested data, then allows access
* '''invalid page fault''' is basically
** a [[segmentation fault]] - send SIGSEGV (the default SIGSEGV handler kills the process)
Note that most are not errors.
In the case of [[memory mapped IO]], this is the designed behaviour.
Minor page faults will often happen regularly, because they include mechanisms that are cheap, save memory, and thereby postpone major page faults.
Major page faults ideally happen as little as possible, because the memory access is delayed by disk IO.
-->
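The minor/major distinction can be observed directly: on Unix-like systems, `getrusage()` reports cumulative page fault counts for the calling process. A minimal sketch (Linux/macOS only; exact counts vary by system and allocator):

```python
# Observe minor (soft) page faults by touching newly allocated memory.
# Unix-only: the resource module does not exist on Windows.
import resource

def fault_counts():
    """Return (minor, major) cumulative page fault counts for this process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_minflt, ru.ru_majflt

minor_before, major_before = fault_counts()
buf = bytearray(64 * 1024 * 1024)   # 64 MB; zero-filling touches every page
minor_after, major_after = fault_counts()

print("minor faults added:", minor_after - minor_before)
print("major faults added:", major_after - major_before)  # usually 0: nothing had to come from disk
```

Touching fresh anonymous memory shows up as minor faults (the kernel just has to supply zeroed pages); you would only see major faults here if the system were already swapping.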
See also
* http://en.wikipedia.org/wiki/Paging
* http://en.wikipedia.org/wiki/Page_fault
* http://en.wikipedia.org/wiki/Demand_paging
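Memory-mapped file access is demand paging in action: mapping the file is cheap, and each page is only read in when first touched, resolved via a page fault. A small sketch using Python's `mmap` module (the file path here is just a throwaway temp file for illustration):

```python
# Demand paging via a memory-mapped file: no read() calls are issued up front;
# the first access to each page triggers a page fault that the kernel resolves
# by reading that page of the file into RAM.
import mmap
import os
import tempfile

# Throwaway scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"x" * (1024 * 1024))  # 1 MB of data

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # Only now, on access, are the touched pages brought into RAM.
        first = m[0]
        last = m[len(m) - 1]
        print(first, last)
```

Note that only the pages actually touched (here, the first and last) get faulted in; the megabyte in between need never leave the disk.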


===Overcommitting RAM with disk===

Revision as of 20:31, 20 January 2024




This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.




Practical notes

Linux

"How large should my page/swap space be?"

On memory scarcity

oom_kill

oom_kill is Linux kernel code that starts killing processes when there is enough memory scarcity that memory allocations cannot happen within reasonable time - as this is a good indication that it's gotten to the point that we are thrashing.


Killing processes sounds like a poor solution.

But consider that an OS can deal with completely running out of memory in roughly three ways:

  • deny all memory allocations until the scarcity stops.
This isn't very useful because
  - it will affect every program until scarcity stops
  - if the cause is one flaky program - and it usually is just one - then the scarcity may not stop
  - programs that do not actually check every memory allocation will probably crash
  - programs that do such checks well may have no option but to stop completely (or pause)
So in the best case, random applications will stop doing useful things - probably crash - and in the worst case your system will crash.
  • delay memory allocations until they can be satisfied
This isn't very useful because
  - this pauses all programs that need memory (they cannot be scheduled until we can give them the memory they ask for) until scarcity stops
  - again, there is often no reason for this scarcity to stop
  - so this typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't actually do anything")
  • killing the misbehaving application to end the memory scarcity.
This makes a bunch of assumptions that all have to be true -- but it lets the system recover
  - assumes there is a single misbehaving process (not always true - e.g. two programs each allocating half of RAM would be fine individually, and need an admin to configure them better)
  - ...usually the process with the most allocated memory, though oom_kill logic tries to be smarter than that
  - assumes the system has had enough memory for normal operation up to now, and that there is probably one haywire process (misbehaving or misconfigured, e.g. (pre-)allocating more memory than you have)
  - this could misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)
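On Linux, the score this victim-selection logic assigns is visible per process in /proc/&lt;pid&gt;/oom_score (higher means more likely to be killed; oom_score_adj lets you bias it). A small sketch that reads it, and returns None on systems without procfs:

```python
# The kernel's OOM victim-selection heuristic is visible per process:
# /proc/<pid>/oom_score is the "badness" score oom_kill would consult
# (higher = more likely to be killed). Linux-only; None elsewhere.
import os

def oom_score(pid="self"):
    """Return the current oom_score for pid, or None if unavailable."""
    path = "/proc/%s/oom_score" % pid
    if not os.path.exists(path):
        return None  # not Linux, or /proc not mounted
    with open(path) as f:
        return int(f.read().strip())

score = oom_score()
print("this process's oom_score:", score)
```

Comparing scores across your processes is a quick way to see who oom_kill would go after first.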


Keep in mind that

  • oom_kill is sort of a worst-case fallback
generally:
  - if you feel the need to rely on the OOM killer, don't
  - if you feel the wish to overcommit, don't
  - oom_kill is meant to deal with pathological cases of misbehaviour
  - but even then it might pick some random daemon rather than the real offender, because in some cases the real offender is hard to define
  - note that you can isolate likely offenders via cgroups now (also meaning that swapping happens per cgroup)
  - and apparently oom_kill is now cgroups-aware
  • oom_kill does not always save you.
It seems that if your system is thrashing heavily already, it may not be able to act fast enough.
(and it may go overboard once things do catch up)
  • You may wish to disable oom_kill when you are developing
...or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
  • If you don't have oom_kill, you may still be able to get a reboot instead, by setting the following sysctls:
vm.panic_on_oom=1

and a nonzero kernel.panic (seconds to show the message before rebooting)

kernel.panic=10
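These sysctls can also be inspected and set through /proc/sys, where the dots in a sysctl name become path separators. A small helper sketching that mapping (read-only here; writing the files requires root):

```python
# sysctl names map onto /proc/sys paths: dots become slashes,
# so vm.panic_on_oom lives at /proc/sys/vm/panic_on_oom.
import os

def sysctl_path(name):
    """Translate a dotted sysctl name to its /proc/sys path."""
    return "/proc/sys/" + name.replace(".", "/")

def read_sysctl(name):
    """Return the current value as a string, or None off Linux."""
    path = sysctl_path(name)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()

print(sysctl_path("vm.panic_on_oom"))
print(read_sysctl("kernel.panic"))
```

So `sysctl vm.panic_on_oom=1` and `echo 1 > /proc/sys/vm/panic_on_oom` are two spellings of the same change.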


See also



Copy on write

Glossary