Virtual memory

===Intro===
<!--
{{comment|(Note: this is a broad-strokes introduction that simplifies and ignores a lot of historical evolution of how we got where we are and ''why'' - plus a bunch of which ''I know I don't know yet'')}}.




'Virtual memory' describes an abstraction that we ended up using for a number of different things.


For the ''most'' part, you can explain those uses separately,
though they got entangled over time (in ways that ''mostly'' operating system programmers need to worry about).






At low level, memory access is "ask for an address, do request, get back result".
At low level, memory access is "set an address, do a request, get back result".  


In olden times, every program could access all memory:
the hardware did nothing more than that {{comment|(in some cases you even needed to do that yourself: set a value on address pins, flip the pin that meant a request, and read out data on some other pins)}},
and the point is that there was ''nothing'' to keep you from doing any request you wanted.


Because you all used the same memory space, memory management was a... cooperative thing, where everything needed to play nice.

But that was hard, and beyond conventions about which parts were the operating system's and shouldn't be touched,
there were no standards for multiple processes running concurrently, unless they actively knew about each other.


Which was fine, because multitasking wasn't a buzzword yet.
We ran one thing at a time, and the exceptions to that were clever about how they did it.




To skip a ''lot'' of history {{comment|(the variants on the way are a mess to actually get into)}},
what we have now is a '''virtual memory system''',
where
* each task gets its own address space.
* there is something managing the assignment of parts of memory to tasks
* our running code ''never'' deals ''directly'' with physical addresses.
* and when a request is made, ''something'' does the translation between the addresses that the program sees and the physical memory that it actually goes to.


{{comment|(The low level implementation is also interesting, in that there is hardware assisting this setup - things would be terribly slow if it weren't. At the same time, these details are also largely irrelevant, in that it's always there, and fully transparent even to programmers.)}}
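To make that translation idea a bit more concrete, here is a deliberately toy sketch in C - a single flat table per task, which is ''not'' how real MMUs or page tables are organized (they are multi-level, hardware-assisted, and cached), but it shows the lookup-and-maybe-deny idea:

<pre>
/* Toy sketch of the translation step: one flat table per task.
   Real page tables are multi-level and walked by hardware, but the
   basic idea is this lookup, with a refusal when there is no entry. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

/* this task's table: virtual page number -> physical frame number (-1 = not ours) */
static const int page_table[] = { 3, 7, -1, 0 };
#define NUM_PAGES (sizeof page_table / sizeof page_table[0])

/* returns 0 and fills *paddr on success, -1 when the page was never given to us */
static int translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t page   = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    if (page >= NUM_PAGES || page_table[page] < 0)
        return -1;                                  /* "denied" - a page fault */
    *paddr = (uint32_t)page_table[page] * PAGE_SIZE + offset;
    return 0;
}

int main(void) {
    uint32_t p;
    if (translate(0x1042, &p) == 0)     /* virtual page 1, offset 0x42 */
        printf("virtual 0x1042 -> physical 0x%x\n", p);
    if (translate(0x2000, &p) != 0)     /* virtual page 2 was never mapped */
        printf("virtual 0x2000 -> not mapped for this task\n");
    return 0;
}
</pre>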


There are a handful of reasons this addresses-per-task idea is useful.


One of them is just convenience.
If the OS tells you where to go,
you avoid overwriting other tasks accidentally.




Arguably the more important one is '''protected memory''':
that lookup can easily say "that was never allocated to you, ''denied''",
meaning a task can never accidentally ''or'' intentionally access memory it doesn't own.
{{comment|(There is no overlap in ownership unless this is intentional: you specifically ask for it, and the OS specifically allows it - a.k.a. [[shared memory]].)}}


This is useful for stability,
in that a user task can't bring down a system task accidentally,
as was easy in the "everyone can trample over everyone" days.

Misbehaving tasks will ''probably'' fail in isolation instead.


It's also great for security,
in that tasks can't ''intentionally'' access what any other task is doing.
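For a sense of what that "denied" looks like from inside a program: a minimal sketch in C (the page is deliberately mapped with no access rights just to force the situation; on Unix-ish systems the kernel answers the access with SIGSEGV):

<pre>
/* Sketch: ask for a page but with no access rights, then touch it anyway.
   The virtual memory system has no valid mapping to satisfy the write,
   so (on Unix-ish systems) the process gets SIGSEGV. */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    char *p = mmap(NULL, 4096, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    printf("about to touch a page we have no rights to...\n");
    p[0] = 1;   /* refused by the MMU/kernel: typically killed with SIGSEGV here */
    printf("this line is normally never reached\n");
    return 0;
}
</pre>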




<!--


As mentioned, swapping/paging has the effect that
the VMM can have a pool of virtual memory that could be backed from RAM ''and'' from disk.




''"Can you choose to map or allocate more total memory than would all fit into RAM at the same time?"''
''"Can you choose to map or allocate more total memory than would all fit into RAM at the same time?"''


Yes.

And a small degree of this is even ''common''.


Using disk for memory seems like a bad idea, because disks are significantly slower than RAM in both bandwidth and latency.
This was ''especially'' true in the platter days, but is still true in the SSD days.

Which is why the VMM will always prefer to use RAM when it has it.


This "...and also disk" can be considered overcommit of RAM,
though note this is ''not'' the only meaning of the term overcommit (or even the usual one), see below.
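As a small illustration of the "more address space than RAM" part: the sketch below (C on Linux; the 64 GiB figure and the MAP_NORESERVE choice are just assumptions for the illustration) reserves a large anonymous mapping, which only starts costing physical pages or swap once pages are actually touched. Whether such requests are accepted at all also depends on the overcommit settings discussed below.

<pre>
/* Sketch: reserve far more *address space* than this machine likely has RAM.
   Physical pages (or swap) only get involved once pages are actually touched. */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = (size_t)64 << 30;   /* 64 GiB of virtual address space */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("mapped %zu GiB at %p without using that much RAM\n", len >> 30, p);
    munmap(p, len);
    return 0;
}
</pre>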






* https://serverfault.com/questions/362589/effects-of-configuring-vm-overcommit-memory
-->
====On memory scarcity====
<!--
On a RAM-only system, you will find that at some point you cannot find free pages.

When you've added swap and similar features,
you may find your bookkeeping says an allocation can be satisfied,
but in practice it will happen very slowly.
Also, having disconnected programs from the backing store,
only the kernel can even guess at how bad that is.

The most obvious case is more pages being actively used than there is physical RAM (which can happen without overcommit, and more easily with it), but there are others. Apparently things like hot database backups may create so many [[dirty pages]] so quickly that the kernel decides it can't free anywhere near fast enough.

In a few cases it's due to a sudden (and reasonable) influx of dirty pages, and otherwise transient.
But in most cases scarcity is more permanent, and means we've started swapping and probably [[trashing]], making everything slow.

Such scarcity ''usually'' comes from a single careless / runaway program,
sometimes one that is just badly configured (e.g. you told more than one program that it could take 80% of RAM),
sometimes from a slew of (probably related) programs.
-->
<!--
=====SLUB: Unable to allocate memory on node=====
SLUB is [[slab allocation]], i.e. about dynamic allocation of kernel memory.

This particular warning seems most related to a bug in memory accounting.
It seems more likely to happen around containers with cgroup kmem accounting
(not yet stable in 3.x, and apparently there are still footnotes in 4.x),
but happens outside of that as well?

There was a kernel memory leak


-->
<!--


There used to be advice like "Your swap file needs to be 1.5x RAM size", and tables to go along with it.

The tables' values varying wildly shows just how arbitrary this is.
That they are usually 20 years old, more so.


It also depends significantly on the amount of RAM you have.






But also, it just depends on use.


Generally, the better answer is to consider:
: less than a GB in most cases, and a few GB in a few


* unused swap space doesn't really hurt (other than taking disk space)
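Related to considering what you actually use: a minimal sketch reading SwapTotal and SwapFree from /proc/meminfo (Linux; those field names are what the kernel reports there):

<pre>
/* Sketch: how much swap is configured, and how much is currently in use,
   read from /proc/meminfo (values are reported in kB). */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) { perror("fopen /proc/meminfo"); return 1; }
    char line[256];
    long total = -1, freekb = -1;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "SwapTotal:", 10) == 0) sscanf(line + 10, "%ld", &total);
        if (strncmp(line, "SwapFree:", 9) == 0)  sscanf(line + 9, "%ld", &freekb);
    }
    fclose(f);
    if (total >= 0 && freekb >= 0)
        printf("swap used: %ld kB of %ld kB\n", total - freekb, total);
    return 0;
}
</pre>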






<!--
There is an aggressiveness with which an OS will swap out allocated-but-inactive pages to disk.


Linux calls this ''swappiness''.


Higher swappiness means the general tendency to swap out is higher.

This general swappiness is combined with other (often more volatile) information,
including the system's currently mapped ratio,
a measure of how much trouble the kernel has recently had freeing up memory,
and some per-process (per-page) statistics.
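On Linux the current value can be read from /proc (it is the same number that sysctl reports as vm.swappiness); a minimal sketch:

<pre>
/* Sketch: read the current swappiness setting. Assumes Linux with /proc mounted. */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/sys/vm/swappiness", "r");
    if (!f) { perror("fopen /proc/sys/vm/swappiness"); return 1; }
    int value;
    if (fscanf(f, "%d", &value) == 1)
        printf("vm.swappiness = %d\n", value);
    fclose(f);
    return 0;
}
</pre>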




=====oom_kill=====


<tt>oom_kill</tt> is linux kernel code that starts killing processes when there is enough memory scarcity that memory allocations cannot happen within reasonable time (because that usually means we are already [[trashing]]).





Killing processes sounds like a poor solution.

But consider that an OS can deal with completely running out of memory in roughly three ways:

* deny all memory allocations until the scarcity stops.
** This isn't very useful because
*** it will affect every program until scarcity stops
*** if the cause is one flaky program - and it usually is just one - then the scarcity may not stop
*** programs that do not actually check every memory allocation will probably crash.
*** programs that do such checks well may have no option but to stop completely (maybe pause)
** So in the best case, random applications will stop doing useful things - probably crash, and in the worst case your system will crash.
* delay memory allocations until they can be satisfied
** This isn't very useful because
*** this pauses all programs that need memory (they cannot be scheduled until we can give them the memory they ask for) until scarcity stops
*** again, there is often no reason for this scarcity to stop
*** so typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't actually do anything")
* killing the misbehaving application to end the memory scarcity.
** This makes a bunch of assumptions that have to be true - but it lets the system recover
*** assumes there is a single misbehaving process (not always true, e.g. two programs allocating most of RAM would be fine individually, and needs an admin to configure them better)
*** ...usually the process with the most allocated memory, though oom_kill logic tries to be smarter than that (see the sketch after this list)
*** assumes that the system has had enough memory for normal operation up to now, and that there is probably one haywire process (misbehaving or misconfigured, e.g. (pre-)allocates more memory than you have)
*** this could misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)
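The score mentioned above is visible per process: Linux exposes the badness value the OOM killer would currently use as /proc/&lt;pid&gt;/oom_score. A minimal sketch reading it for the current process (assumes /proc is mounted):

<pre>
/* Sketch: read the score the OOM killer would currently assign to this
   process (higher means more likely to be picked). */
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/self/oom_score", "r");
    if (!f) { perror("fopen /proc/self/oom_score"); return 1; }
    int score;
    if (fscanf(f, "%d", &score) == 1)
        printf("current oom_score: %d\n", score);
    fclose(f);
    return 0;
}
</pre>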


Keep in mind that

* oom_kill is sort of a worst-case fallback
** generally:
*** if you feel the need to rely on the OOM killer, don't.
*** if you feel the wish to overcommit, don't.
** oom_kill is meant to deal with pathological cases of misbehaviour
*** but even then it might pick some random daemon rather than the real offender, because in some cases the real offender is hard to define
** note that you can isolate likely offenders via cgroups now (also meaning that swapping happens per cgroup)
*** and apparently oom_kill is now cgroups-aware
* oom_kill does not always save you.
** It seems that if your system is trashing heavily already, it may not be able to act fast enough.
** (and it may possibly go overboard once things do catch up)
* You may wish to disable oom_kill when you are developing
** ...or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
* If you don't have oom_kill, you may still be able to get a reboot instead, by setting the following sysctls:
*: <tt>vm.panic_on_oom=1</tt>
*: and a nonzero kernel.panic (seconds to show the message before rebooting)
*: <tt>kernel.panic=10</tt>


See also
* http://mirsi.home.cern.ch/mirsi/oom_kill/index.html

===Page faults===
