On computer memory
 
====Overcommit or not====

Various programs ask for more memory than they immediately, or ever, use.

Some a ''lot'' more.

tl;dr:
* Overcommit means allowing allocation of address space, without allocating memory to back it.
: Windows makes you do both of those explicitly, implying it does not allow overcommit.
: Linux allows that separation, basically by having the kernel only allocate pages on their first use.
* In practice, on almost all modern systems you want swap.
* Having enough swap space makes overcommit largely unnecessary.
: You can argue e.g. windows's strict accounting is more predictable than e.g. linux's reliance on oom_killer.
: You can also argue that being in either situation is a case of bad planning or bad behaviour.
* Windows doesn't overcommit, linux does.
* Some interesting cases exist:
: VM hosts often ''do'' allow RAM overcommit towards their VMs.
: Having your own memory allocator (e.g. Java) may do something sort of overcommit-ish.


As to why programs over-allocate:

Usually it's a moderate but not large amount, and just because they're lazy, and aware that any modern OS's VMM (virtual memory manager) will not actually allocate memory until it is used (a minimal demonstration of this is sketched below).

Sometimes it's because of the nature of the problem (e.g. highly sparse arrays), and it being so much simpler to count on a VMM than to, essentially, write one yourself.

Sometimes it's because of features and cleverness, like fork()ed processes' memory pages being shared and copy-on-write, meaning that while you need to assume it uses twice the memory, the actual use may sometimes be much smaller.
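As a minimal sketch of that laziness in action (Linux-specific: it assumes /proc/self/statm is available, and the sizes are arbitrary): a large malloc() barely changes the process's resident memory until the pages are actually touched.

 /* A sketch: allocate a lot, touch only part, and watch resident memory (RSS).
    Linux-specific - it reads /proc/self/statm. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
 static long resident_kb(void) {
     long size_pages = 0, resident_pages = 0;
     FILE *f = fopen("/proc/self/statm", "r");
     if (!f) return -1;
     if (fscanf(f, "%ld %ld", &size_pages, &resident_pages) != 2) resident_pages = -1;
     fclose(f);
     return resident_pages < 0 ? -1 : resident_pages * (sysconf(_SC_PAGESIZE) / 1024);
 }
 
 int main(void) {
     size_t size = 1UL << 30;                  /* ask for 1 GiB */
     char *buf = malloc(size);
     if (!buf) return 1;
     printf("after malloc:        RSS = %ld kB\n", resident_kb());
     memset(buf, 1, size / 2);                 /* actually use half of it */
     printf("after touching half: RSS = %ld kB\n", resident_kb());
     free(buf);
     return 0;
 }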
  
  
Assuming that many programs do this, the virtual memory system can choose to promise more than we have,
which will run comfortably as long as less than roughly the RAM amount gets actively used (and sluggishly, or into the ground, above that).

It's usually fine because we know typical behaviour, and far outside typical behaviour you're screwed ''anyway''.
In various ways it's sort of like [[fractional reserve]] banking.

The commit limit (= maximum allocation limit) is still a fixed number, and still ''related'' to the amount of RAM, but the relation can be more interesting.

On windows it's:
 swap space + RAM
and linux's is essentially:
 swap space + (RAM * (overcommit_ratio/100))
(a small worked example of this arithmetic follows below)


The commit still needs to be remembered, of course.

Often this is counted towards swapped memory.

Note that even without overcommit, you can still count allocated-but-unused pages towards swap.

----

But also, it's easy to remember that committable memory, if not exactly RAM+swap, should still roughly relate to RAM+swap.

Also because pages we used once and never again may ''actually'' end up there, so it's a sensible mechanic to present to the interested user/admin, probably better than adding another disembodied figure to our system stat apps.

While the relation to swapping and caching is indirect, it is relevant in a slightly wider view.


You want swap space on both windows and linux (and in both cases have it be roughly similar to your RAM size),
because both will count it towards their commit limit - and it will allow rarely/never used pages to be in swap rather than RAM (in the case of ''never''-used, without any actual work).

In part this is because both will swap out rarely(/never) used things to it, in favour of actively using RAM for caches.
And you want this regardless of whether you overcommit or not.

(Windows is actually more aggressive about swapping things out - it seems to do so in favour of IO caches - while linux is less so, and a little more tweakable (swappiness); presumably windows's "background services / programs" setting plays into this too.)
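For concreteness, a tiny sketch of that arithmetic. The helper functions and example figures are purely illustrative, not an OS API, and the linux variant mirrors the formula above while ignoring details such as hugepage reservations.

 /* Illustrative helpers only; values in bytes, ratio as a percentage. */
 #include <stdio.h>
 #include <stdint.h>
 
 static uint64_t windows_commit_limit(uint64_t swap, uint64_t ram) {
     return swap + ram;
 }
 static uint64_t linux_commit_limit(uint64_t swap, uint64_t ram, unsigned ratio) {
     return swap + (ram * ratio) / 100;
 }
 
 int main(void) {
     const uint64_t GB = 1ULL << 30;
     printf("linux, 2GB swap, 4GB RAM, ratio 50:  %llu GB\n",
            (unsigned long long)(linux_commit_limit(2 * GB, 4 * GB, 50) / GB));
     printf("linux, 2GB swap, 16GB RAM, ratio 50: %llu GB\n",
            (unsigned long long)(linux_commit_limit(2 * GB, 16 * GB, 50) / GB));
     printf("windows, 2GB swap, 16GB RAM:         %llu GB\n",
            (unsigned long long)(windows_commit_limit(2 * GB, 16 * GB) / GB));
     return 0;
 }

Note that the second case comes out below the installed RAM - the "commit limit lower than RAM" situation mentioned further down.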
 
Basically,
* swap makes sense if there is a fair amount of memory you use rarely, e.g. once and never again
: because we can then use the RAM for other things, like later actively-used memory (or caches)

* overcommit makes sense if you have significant memory you reserve but ''never'' use
: which is, in some views, entirely unnecessary
: it should probably be seen as a minor optimization, and not a feature you should (ab)use

* with or without overcommit, swap space counts towards your commit limit

* note that this is not about whether it will swap, how much, or how quickly


Note that
* that's how much it can allocate, not where it allocates from
* windows puts more importance on the swap file
* you don't really want to go without swap file/space on either windows or linux
: (more so if you turn overcommit off on linux)
* look again at that linux equation. That's ''not'' "swap plus more-than-100%-of-RAM"
: and note that if you have very little swap and/or tons of RAM (think >100GB), it can mean your commit limit is lower than RAM

----

There are some specific cases where overcommit helps. For example, when you fork() a process in linux,
it will consider the pages copy-on-write, meaning that while it needs to map all memory, it doesn't need to duplicate it.

So a large program doing fork() to exec() something will not suddenly need commit for twice the memory.

But fork-and-exec is sort of a stupid mechanism (windows doesn't have fork(), so doesn't have this particular problem).
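A rough illustration of that, assuming Linux and /proc/meminfo (MemAvailable is an estimate and a bit noisy, so treat this as a sketch rather than a measurement): right after fork() the child's pages are shared copy-on-write, and system-wide memory use only really grows once the child writes to them.

 /* A sketch of fork() plus copy-on-write, assuming Linux and /proc/meminfo. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 
 static long mem_available_kb(void) {
     FILE *f = fopen("/proc/meminfo", "r");
     char line[256];
     long kb = -1;
     if (!f) return -1;
     while (fgets(line, sizeof line, f))
         if (sscanf(line, "MemAvailable: %ld", &kb) == 1) break;
     fclose(f);
     return kb;
 }
 
 int main(void) {
     size_t size = 512UL * 1024 * 1024;        /* 512 MiB */
     char *buf = malloc(size);
     if (!buf) return 1;
     memset(buf, 1, size);                     /* parent actually uses it */
     printf("parent touched buffer: MemAvailable = %ld kB\n", mem_available_kb());
     pid_t pid = fork();
     if (pid == 0) {
         /* right after fork the pages are shared copy-on-write... */
         printf("child after fork:      MemAvailable = %ld kB\n", mem_available_kb());
         memset(buf, 2, size);                 /* ...until the child writes to them */
         printf("child after writing:   MemAvailable = %ld kB\n", mem_available_kb());
         _exit(0);
     }
     waitpid(pid, NULL, 0);
     free(buf);
     return 0;
 }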
https://serverfault.com/questions/362589/effects-of-configuring-vm-overcommit-memory

----

Counterarguments to overcommit include that system stability should not be based on bets,
that it is (ab)use of an optimization that you should not be counting on,
that programs should not be so lazy,
and that we are actively enabling them to be lazy and behave less predictably -
and now sysadmins have to frequently figure out why that [[#oom_kill|oom_kill]] happened.
  
  
Line 700: Line 774:
 
Neither are ideal.  
 
Neither are ideal.  
  
Arguabvly oom_kill is typically smarter, usually killing only an actually misbehaving program.
+
Arguably oom_kill is typically smarter, usually killing only an actually misbehaving program.
 
Rather than a denial probably killing the next program (more random).
 
Rather than a denial probably killing the next program (more random).
  
Line 718: Line 792:
  
  
Linux's behaviour is controlled by vm.overcommit_memory (default is 0, may vary with distro), which has three modes:
* overcommit_memory=0: overcommit, with heuristic checks estimating whether it's fine (the default)
* overcommit_memory=1: overcommit without checks/limits. More likely to swap, more likely to OOM; fine for programs that use lazy allocation or that fork things which will mostly read; sometimes necessary.
* overcommit_memory=2 (since kernel 2.6): don't overcommit; the commit limit is swap + RAM * (overcommit_ratio/100)

While you may like the idea of disabling overcommit, modes 1 and 2 are, effectively, less careful than the default heuristics.
It's like running windows without a swap file which, if you've ever tried it, you'll know you don't really want. It'll ''work'', but not do what you want (or think, necessarily), and the same is true of linux.

(You may instead want low [[swappiness]])

The current limit and the amount committed so far are reported in /proc/meminfo as CommitLimit and Committed_AS.


vm.overcommit_ratio (default is 50)
: that default 50% is why people suggest the amount of swap space be at least 50% of your physical RAM (if it's less, a single program can allocate less than all of physical RAM - in overcommit mode 2)


So, overcommit_ratio can be set above 100. It's not really overcommit until you do, because up to that point the commit limit stays at or below swap plus RAM.

In fact, the default 50 may mean your commit limit is ''less'' than your RAM.

Consider e.g.:
* 2GB swap, 4GB RAM, 50%: commit limit is 4GB
* 2GB swap, 16GB RAM, 50%: commit limit is 10GB


overcommit_ratio < 100 means applications cannot use all RAM (collectively or individually).

From the view of applications it is a sort of ''under''commit,
but it's effectively a way to reserve the rest for caches{{verify}},
which you usually want. Applications that by design want near-100% of memory are usually bad news,
potentially ''even'' if they use that largely for their own caching.


See also:
* https://serverfault.com/questions/362589/effects-of-configuring-vm-overcommit-memory
* https://www.win.tue.nl/~aeb/linux/lk/lk-9.html
* https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
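To see where a particular machine stands, a small sketch (assuming Linux and /proc) that prints the current overcommit settings plus the kernel's own commit accounting from /proc/meminfo:

 /* A sketch, assuming Linux and /proc: print the overcommit settings
    and the kernel's commit accounting. */
 #include <stdio.h>
 #include <string.h>
 
 static void print_first_line(const char *path) {
     char line[256];
     FILE *f = fopen(path, "r");
     if (f && fgets(line, sizeof line, f)) printf("%-32s %s", path, line);
     if (f) fclose(f);
 }
 
 int main(void) {
     print_first_line("/proc/sys/vm/overcommit_memory");
     print_first_line("/proc/sys/vm/overcommit_ratio");
     FILE *f = fopen("/proc/meminfo", "r");
     char line[256];
     while (f && fgets(line, sizeof line, f))
         if (strncmp(line, "CommitLimit:", 12) == 0 || strncmp(line, "Committed_AS:", 13) == 0)
             printf("%s", line);
     if (f) fclose(f);
     return 0;
 }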
  
  


CPU cache notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The basic idea is to put a little fast-but-costly memory between CPU and main RAM. If reads from main memory happen to be already mirrored in the cache, that read can be served much faster. This has been worth it since CPUs ran at a dozen MHz or so.


The thing about CPU caches is that they are entirely transparent: you should not have to care about how they do their thing.

As a programmer you may still want a rough idea. Designing for specific hardware is pointless a few years later, but designing for caches at all can help speed - as can avoiding caches getting flushed more than necessary, and avoiding cache contention. So it helps to know what these are, why they happen, and how to see that your weird speed issue is one of them.
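As a small illustration of "designing for caches at all" (the matrix size is arbitrary, and the exact difference varies with hardware and compiler flags): summing a matrix row by row walks memory sequentially and is cache friendly, while summing column by column jumps around and is usually noticeably slower for the same amount of work.

 /* Same work, different memory order: row-by-row is sequential (cache friendly),
    column-by-column jumps N*8 bytes every step. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <time.h>
 
 #define N 4096
 
 static double sum_rows(const double *m) {
     double s = 0;
     for (int i = 0; i < N; i++)
         for (int j = 0; j < N; j++)
             s += m[i * N + j];
     return s;
 }
 static double sum_cols(const double *m) {
     double s = 0;
     for (int j = 0; j < N; j++)
         for (int i = 0; i < N; i++)
             s += m[i * N + j];
     return s;
 }
 
 int main(void) {
     double *m = calloc((size_t)N * N, sizeof *m);
     if (!m) return 1;
     clock_t t0 = clock();
     double a = sum_rows(m);
     clock_t t1 = clock();
     double b = sum_cols(m);
     clock_t t2 = clock();
     printf("row-major:    %.3fs (sum %.0f)\n", (double)(t1 - t0) / CLOCKS_PER_SEC, a);
     printf("column-major: %.3fs (sum %.0f)\n", (double)(t2 - t1) / CLOCKS_PER_SEC, b);
     free(m);
     return 0;
 }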


Virtual Memory systems

On virtual memory

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Glossary

See also:



On swapping

Swapping / paging

These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.

Windows calls it paging, *nix usually calls it swapping.

The terms paging and swapping aren't always synonymous, but they have been close enough for a while now. (Specific cases have more specific names, and border cases are odd enough that they tend to get their own names too.)


The terms refer to OSes allowing more virtual memory in use than there is physical RAM, backed by disk where necessary. This lets the memory manager say 'yes' to memory allocations somewhat beyond what you actually have.

This seems like a bad idea, as disks are orders slower than RAM, but there are reasons to do this:

  • there may be RAM that is typically inactive, think "not accessed in days, possibly won't ever be".
    • so the VMM may adaptively move that to disk, freeing up RAM for active use
  • there are programs that blanket-allocate RAM, and will never even use it once.
    • so the VMM may choose to back this with anything (physical RAM or disk) once it starts being used
    • it's typical that it counts this towards the use of swap/page area (without actually initializing space there) until the first use, so that the bookkeeping still makes sense
    • which is why you'd then want swap/page area, even when it's never used


Doing the two things above won't impact active use much: moving inactive memory to disk rarely slows anything down, and frees fast RAM for programs that actively use it.

The net effect is often actually positive, because the former means more programs can use fast RAM without hitting disk, and the latter saves a lot of IO at allocation time.


tl;dr: The idea is that

  • all of the actively used pages are in RAM
  • infrequent pages are on disk, and
  • never-used pages are nowhere.


However...



For example, a program's virtual memory space is often made up of some parts that are never used, (often small) bits that are continuously used, and everything in between. Sections that are never used need not be committed, and sections that are rarely used may be paged out if that means other active programs can fit more comfortably in main memory.
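If you want to poke at this yourself, here is a sketch assuming Linux (or another system where mincore() is available) that checks which pages of an anonymous mapping are actually resident - pages that were never touched simply are not backed by RAM yet:

 /* A sketch using mincore() to see which pages of an anonymous
    mapping are resident: only the quarter we touched shows up. */
 #define _DEFAULT_SOURCE
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/mman.h>
 
 int main(void) {
     long page = sysconf(_SC_PAGESIZE);
     size_t npages = 1024;
     size_t len = npages * (size_t)page;
     unsigned char vec[1024];
     char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     if (buf == MAP_FAILED) return 1;
     memset(buf, 1, len / 4);                  /* touch only the first quarter */
     if (mincore(buf, len, vec) != 0) return 1;
     size_t resident = 0;
     for (size_t i = 0; i < npages; i++)
         if (vec[i] & 1) resident++;
     printf("%zu of %zu pages resident\n", resident, npages);
     munmap(buf, len);
     return 0;
 }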


You can argue about how exactly you would implement this. For example, do you want the OS to page things out on demand, or pre-emptively (basically: when do you want it to churn - when you need the RAM, or in idle time before it, assuming there will be such idle time).


It seems that *nix swapping logic is smart enough to do a RAID-like spread among its swap devices, meaning that a swap partition on every disk that isn't used by something important (like a database) can be useful.



Swapping and thrashing

When you get an allocation you cannot serve from free RAM, it still makes sense to move the least-used pages to disk.

In the long run, some memory has to come from and go to disk, but only sporadically.


However, chances are you are now near a point where you are actively using more allocated memory than you have RAM.

Thrashing is what happens when programs actively combine to use more memory than there is physical memory to back it.

Some of the memory has to come from and go to disk, and continuously instead of sporadically. Intuitively, some of it is actually disk, so everything starts going at disk speed.

Or, with enough active processes, worse, because each task switch to a new process potentially comes along with a "move someone else out so I can move in".


Page faults
"How large should my page file be?"

There is a lot of misinformation about what the page file's / swap space should be.

  • There is the suggestion that too large slows down your system
    • Nope. It's just a waste of space
  • You see suggestions like "1.5 times RAM size"
    • but this is arbitrary, and actually makes less sense the more RAM you have
  • Page file of size zero is possible
    • and yes, everything will always be in fast RAM
    • ...but that includes stuff that never gets used, taking up RAM that then cannot be put to actual use, say, various caching


The better answer is to consider:

  • that last bit: the ability/usefulness to swap out inactive stuff (usually a few GB worth at most)
  • the usefulness of allowing/backing overcommit (usually a few GB worth at most)
  • your workload, and what that may add in the worst case (amount varies)

...so servers may sometimes be unusual cases, but workstations can often do fine with a handful of GB.

See also


Swappiness

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The aggressiveness with which an OS swaps out allocated-but-inactive pages to disk is often controllable. Linux dubs this swappiness.


Higher swappiness values mean the tendency to swap out is higher. (other information is used too, including the currently mapped ratio, and a measure of how much trouble the kernel has recently had freeing up memory)(verify)


In linux you can use proc or sysctl to check and set swappiness

cat /proc/sys/vm/swappiness
sysctl vm.swappiness

...shows you the current swappiness (a number between 0 and 100), and you can set it with something like:

echo 60 >  /proc/sys/vm/swappiness
sysctl -w vm.swappiness=60


Note that the meaning of the value was never very settled, and has changed between kernel versions (for example, later 2.6 kernels swap out more easily at the same value than 2.4 did). Some kernels do little swapping for values in the range 0-60 (or 0-80, but 60 seems the more common tipping point). A value of 100, or something near it, tends to make for very aggressive swapping.


There are many discussions on the subject because the kernel implementations, implications, and applicability to different use cases varies, and some of the interactions are not trivial.


Generally:

  • If you have a lot more memory than you use, there won't be much swapping anyway, so it doesn't matter so much. Having a lot more RAM than programs will use is arguably a case for low swappiness, since you might as well keep everything in memory (unless you specifically value the system's file cache over lesser-used program memory)


Arguments for lower swappiness:

  • Avoids swapping until it's necessary for something else
    • ...also avoiding IO (also lets drives spin down, which can matter to laptop users)
    • (on the flipside, when you want to allocate memory and the system needs to swap out things first to provide that memory, it means more work, IO, and sluggishness at that time)
  • apps are more likely to stay in memory (particularly larger ones). Over-aggressive swapout (e.g. inactivity because you went for coffee) is less likely, meaning it is slightly less likely that you have to wait for a few seconds of churning swap-in when you continue working
  • When your computer has more memory than you actively use, there will be less IO caused by swapping inactive pages out and in again (but there are other factors that also make swapping less likely in such cases)


Arguments for higher swappiness seem to include(verify):

  • keeps memory free
    • which values (possible) future use of memory, over keeping (possibly never-used) things in memory
    • free memory is usable by the OS page cache
    • (free memory being truly unused is almost completely pointless, though)
  • swapping out rarely used pages means new applications and new allocations are served more immediately by RAM, rather than having to wait for swapping in
  • allocation-greedy apps will not cause swapping so quickly, and are served more quickly themselves



From a perspective of data caching, you can see swappiness as something that indirectly controls where cache data sits - process, OS cache, or swapped out.

Swappiness applies mostly to process's memory, not to kernel constructs like the OS page cache, dentry cache, and inode cache. This means that more aggressive swapping puts program memory on disk, and makes more space to cache filesystem data.


Consider for example the case of large databases (often following some 80/20-ish locality patterns). If you can make the database cache data in process memory, you may want lower swappiness, since that makes it more likely that needed data is still in memory.

If you disable such in-process caching of tables, then you may get almost the same effect, because the freed space is used by the page cache to hold the most common data; you may then want higher swappiness so that the OS cache gets more space - which may actually mean less OS work in the end (depending on how the tables are used).


In some cases, having a lot of memory and relying on the OS cache can take all the bother out of caching, and avoid duplication when multiple processes are backed by the same filesystem data. Consider serving a search system with large indices, set up so that multiple processes serve the same data - in this case, relying on the page cache is a better idea than each process loading the data into process memory. This effect often also applies to database systems to some degree.


It should be said that the OS cache's logic isn't as complex or smart as swapping logic, and neither is as smart as a program theoretically could be. Even a simple LRU cache (including memcaches) may work more efficiently than relying on the page cache. This also starts to matter when you fit a lot of different programs onto the same server so they start vying for limited memory.



See also:

Page fault

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

When a memory access touches something that is mapped into the address space but not currently present in physical memory, the hardware informs the software. That signalling is called a page fault (Microsoft uses the term 'hard fault' for the ones that have to go to disk).


A page fault, widely speaking, means "instead of direct access, the kernel needs to decide what to do now".

Reasons and responses include:

  • minor page fault seems to include:(verify)
    • the MMU was not aware that the page was accessible - the kernel informs it that it is, then allows access
    • writing to a copy-on-write memory zone - the kernel copies the page, then allows access
    • writing to a page that was promised by the allocator but not yet actually backed - the kernel allocates it, then allows access
  • a mapped file being accessed - the kernel reads in the requested data, then allows access
  • major page fault refers to:
    • the page being swapped out - the kernel swaps it back in, then allows access
  • invalid page fault is basically an access violation: the address isn't validly mapped at all, and the process typically gets a segmentation fault


Note that most of these are not errors.

In the case of memory-mapped IO, this is the designed behaviour.

Minor faults will often happen regularly, because they include mechanisms that are cheap, save memory, and thereby postpone major page faults.

Major faults ideally happen as little as possible, because the memory access is delayed by disk IO.
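To watch this from inside a program, a small sketch using getrusage() (POSIX; the allocation size is arbitrary): first-touching freshly allocated memory shows up as minor faults, while major faults only appear when something has to be brought back from disk.

 /* A sketch using getrusage() to watch the process's own fault counters. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/resource.h>
 
 static void report(const char *label) {
     struct rusage ru;
     getrusage(RUSAGE_SELF, &ru);
     printf("%-20s minor=%ld major=%ld\n", label, ru.ru_minflt, ru.ru_majflt);
 }
 
 int main(void) {
     report("at start:");
     size_t size = 64UL * 1024 * 1024;
     char *buf = malloc(size);
     if (!buf) return 1;
     report("after malloc:");
     memset(buf, 1, size);                     /* first touch: many minor faults */
     report("after first touch:");
     free(buf);
     return 0;
 }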

On memory scarcity

Overcommit or not

oom_kill

oom_kill is linux kernel code that starts killing processes when there is memory scarcity.

(Scarcity meaning it cannot find/free pages fast enough to satisfy a program's memory allocation. The most obvious trigger is overcommit to the point of more pages being actively used than there is physical RAM, but there are others. Apparently things like hot database backups may create so many dirty pages so quickly that the kernel decides it can't free anywhere near fast enough.)


Killing sounds bad, but consider that an OS can deal with completely running out of memory in roughly three ways:

  • block memory allocations until they can be satisfied - but under scarcity this is likely to be very long-term and will effectively halt most programs in the system, and easily be an effective deadlock
  • deny all memory allocations until the scarcity stops. This isn't very useful because
    • if the cause is one flaky program - and it usually is - then the scarcity isn't going to stop
    • most programs do not actually check every memory allocation for success, which means half your programs will misbehave or crash anyway. Even if they do such checks, there usually is no useful reaction other than stopping the process.
    • so in the best case, random applications will stop doing useful things, or crash. In the worst case, your system will crash.
  • killing the memory-hungriest application to end the memory scarcity.
    • this works on the assumption that the system has had enough memory for normal operation up to now, and that there is probably one process that is misbehaving or just misconfigured (e.g. pre-allocates more memory than you have).
    • ...usually this means simply the process with the most allocated memory, though oom_kill tries to be smarter than that.
    • this could misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)


Keep in mind that

  • scarcity
    • usually comes from a single careless/runaway program
    • occasionally comes from a few programs that assume they can take most physical memory, or are configured to
    • in some cases it's the dirty-pages thing
  • scarcity also means it's swapping, probably thrashing, making everything slow


  • You may wish to disable oom_kill when you are developing -- or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
  • oom_kill isn't really a feature you ever want to rely on.
It is meant to deal with pathological cases of misbehaviour - if it kills your useful processes, then you probably need to tweak either those processes or your system.
  • oom_kill does not always save you.
It seems that if it's thrashing heavily already, it may not be able to act fast enough.
(and possibly go overboard later)
  • If you don't have oom_kill, you may still be able to get reboot instead, by setting the following sysctls:
vm.panic_on_oom=1
kernel.panic=10


See also

Slab allocation

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


The slab allocator allows creation of caches that hold objects of a fixed-size type.

Slab allocation is mainly used in kernel modules/drivers that need to allocate uniform-sized non-persistent and potentially short-lived structures (think device structures, task structures, filesystem internals, network buffers).

Aside from caches for specific types, there are also generic caches for fixed allocation sizes like 4K, 8K, 32K, 64K, 128K, etc.


Upsides:

  • each such cache is easy to handle
  • it avoids fragmentation, because all holes are of the same size - something the otherwise-typical buddy system still has
  • the work for most calls is lighter than a full malloc/free would be
  • in some cases it works well with hardware caches

Limits:

It still deals with the page allocator under the covers, so deallocation patterns can still mean that pages of the same cache become sparsely filled - which wastes space.


SLAB, SLOB, SLUB:

  • SLOB: K&R allocator (1991-1999), aims to allocate as compact as possible. Fragments fairly quickly.
  • SLAB: Solaris type allocator (1999-2008), as cache-friendly as possible.
  • SLUB: Unqueued allocator (2008-today): Execution-time friendly, not always as cache friendly, does defragmentation (mostly just of pages with few objects)


slabtop

slabinfo


See also:



There are some similar higher-level things that follow the "I will handle things of the same type" allocation, from some custom allocators in C, to object allocators in certain languages, arguably even just the implementation of certain data structures.
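As a rough sketch of that same idea at application level (a toy for illustration, not a replacement for a real allocator): one size class, a free list of slots, and constant-time alloc/free where every hole is exactly object-sized.

 /* A toy fixed-size object pool in the same spirit as slab allocation. */
 #include <stdio.h>
 #include <stdlib.h>
 
 typedef struct Obj { double x, y, z; struct Obj *next_free; } Obj;
 
 typedef struct { Obj *slots; Obj *free_list; } Pool;
 
 static int pool_init(Pool *p, size_t count) {
     p->slots = malloc(count * sizeof(Obj));
     if (!p->slots) return -1;
     p->free_list = NULL;
     for (size_t i = 0; i < count; i++) {      /* thread every slot onto the free list */
         p->slots[i].next_free = p->free_list;
         p->free_list = &p->slots[i];
     }
     return 0;
 }
 static Obj *pool_alloc(Pool *p) {
     Obj *o = p->free_list;
     if (o) p->free_list = o->next_free;       /* O(1), no searching or splitting */
     return o;
 }
 static void pool_free(Pool *p, Obj *o) {
     o->next_free = p->free_list;              /* O(1), the hole fits the next object exactly */
     p->free_list = o;
 }
 
 int main(void) {
     Pool p;
     if (pool_init(&p, 1024) != 0) return 1;
     Obj *a = pool_alloc(&p), *b = pool_alloc(&p);
     pool_free(&p, a);
     Obj *c = pool_alloc(&p);                  /* reuses a's slot */
     printf("a=%p b=%p c=%p\n", (void *)a, (void *)b, (void *)c);
     free(p.slots);
     return 0;
 }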

Lower level

See also:



Memory limits on 32-bit and 64-bit machines

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


tl;dr:

  • If you want to use significantly more than 4GB of RAM, you want a 64-bit OS.
  • ...and since that is now typical, most of the details below are irrelevant


TODO: the distinction between (effects from) physical and virtual memory addressing should be made clearer.


Overall factoids

OS-level and hardware-level details:

From the I want my processes to map as much as possible angle:

  • the amount of memory a single process could hope to map is limited by its pointer size, so ~4GB on 32-bit OS, 64-bit (lots) on a 64-bit OS.
    • Technically this could be entirely about the OS, but in reality it is tied intimately to what the hardware natively does, because anything else would be slooow.
  • Most OS kernels have a split (for their own ease) that means the practical limit of a single 32-bit userspace process is lower - 3GB, 2GB, sometimes even 1GB
    • this is mostly a pragmatic implementation detail from back when 32 megabytes was a lot of memory, and leftover ever since


  • Each process is mapped to memory separately, so in theory you can host multiple 32-bit processes that together use more than 4GB
...even on 32-bit OSes: you can for example compile the 32-bit linux kernel to use up to 64GB this way
a 32-bit OS can only do this through PAE, which has to be supported and enabled in motherboard, and supported and enabled in the OS. (Note: Disabled in windows XP since SP2; see details below)
Note: both 32-bit and 64-bit PAE-supporting motherboards may have somewhat strange limitations, e.g. the amount of memory they will actually allow/support (mostly a problem in old, early PAE motherboards)
and PAE was problematic anyway - it's a nasty hack in nature, and e.g. drivers had to support it. In the end it was mostly seen in servers, where the details were easier to oversee.
  • device memory maps would take mappable memory away from within each process, which for 32-bit OSes would often mean that you couldn't use all of that installed 4GB



On 32-bit systems:

Process-level details:

  • No single 32-bit process can ever map more than 4GB as addresses are 32-bit byte-addressing things.
  • A process's address space has reserved parts, to map things like shared libraries, which means a single app can actually allocate less (often by at most a few hundred MBs) than what it can map(verify). Usually no more than ~3GB can be allocated, sometimes less.


On 64-bit systems:

  • none of the potentially annoying limitations that 32-bit systems have apply
(assuming you are using a 64-bit OS, and not a 32-bit OS on a 64-bit system).
  • The architecture lets you map 64-bit addresses -- or, in practice, more than you can currently physically put in any system.
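A trivial way to check which situation a given build is in (a sketch; as noted above, the amount a process can actually allocate is lower than what it can address):

 /* Pointer size bounds what a single process can even address. */
 #include <stdio.h>
 
 int main(void) {
     int bits = 8 * (int)sizeof(void *);
     printf("pointer size: %d bits\n", bits);
     if (bits == 32)
         printf("addressable per process: 4GB, minus reserved ranges\n");
     else
         printf("addressable per process: far more than any current RAM\n");
     return 0;
 }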


On both 32-bit (PAE) and 64-bit systems:

  • Your motherboard may have assumptions/limitations that impose some lower limits than the theoretical one.
  • Some OSes may artificially impose limits (particularly the more basic versions of Vista seem to do this(verify))


Windows-specific limitations:

  • 32-bit Windows XP (since SP2) gives you no PAE memory benefits. You may still be using the PAE version of the kernel if you have DEP enabled (no-execute page protection) since that requires PAE to work(verify), but PAE's memory upsides are disabled (to avoid problems with certain buggy PAE-unaware drivers, possibly for other reasons)
  • 64-bit Windows XP: ?
  • the /3GB switch moves the user/kernel split, but for a single process to map more than 2GB it must be 3GB aware
  • Vista: different versions have memory limits that seem to be purely artificial (8GB, 16GB, 32GB, etc.) (almost certainly out of market segregation)

Longer story / more background information

A 32-bit machine implies memory addresses are 32-bit, as is the memory address bus to go along with it. It's more complex than that, but the net effect is still that you can ask for 2^32 bytes of memory at byte resolution, so it technically allows you to access up to 4GB.


The 'but' you hear coming is that 4GB of address space doesn't mean 4GB of memory use.


The device hole (32-bit setup)

One of the reasons the limit actually lies lower is devices. The top of the 4GB memory space (usually directly under the 4GB position) is used by devices. If your physical memory is less than that, this means using addresses you wouldn't use anyway, but if you have 4GB of memory, this means part of your memory is now effectively missing. The size of this hole depends on chipset, BIOS configuration, video card, and more(verify).

Assume for a moment you have a setup with a 512MB device hole - that would mean the 32-bit address range between 3.5GB and 4GB addresses devices instead of memory. If you have 4GB of memory plugged in, that last 512MB remains unassigned, and is effectively useless as it is entirely invisible to the CPU and everything else.


The BIOS settles the memory address map at boot time(verify), and you can inspect the effective map (Device Manager in windows, /proc/iomem in linux) in case you want to know whether it's hardware actively using the space (The hungriest devices tend to be video cards - the worst current case is probably two 768MB nVidia 8800s in SLI) or whether your motherboard just doesn't support more than, say, 3GB at all. Both these things can be the reason some people report seeing as little as 2.5GB out of 4GB you plugged in.


This is not a problem when using a 64-bit OS on a 64-bit processor -- unless, of course, your motherboard makes it one; there are various reported cases of this too.

Problems caused purely by limited address space on 32-bit systems can also be alleviated, using PAE:

PAE

In most computers today, memory management refers to cooperation between the motherboard and the operating system.

Applications are isolated from each other via virtual memory mapping. A memory map is kept for each process, meaning each can pretend it is alone on the computer. Each application has its own 4GB memory map without interfering with anything else (virtual mapping practice allowing).

Physical Address Extension (PAE), a hardware feature, is a memory mapping extension (not a hack, as some people think) that uses the fact that this memory map is a low-level thing. Since application memory locations are virtualised anyway, the OS can map various 32-bit application memory spaces into the 36-bit hardware address space that PAE allows, which allows for 64GB (though most motherboards have a lower limit, for somewhat arbitrary reasons). This also solves the device hole problem, since the previously unmappable-and-therefore-unused RAM can now be comfortably mapped again (until the point where you can, and actually do, place 64GB in your computer).

PAE doesn't need to break the 32-bit model as applications get to see it. Each process can only see 4GB, but a 32-bit OS's processes can collectively use more real memory.


PAE implies some extra work on each memory operation, which in the worst case seems to kick a few percent off memory access speed. (so if without PAE you see 3.7GB out of the 4GB you actually have, it can be worth it to leave PAE off)

In the (relatively rare) cases where a program wants to handle so much memory that the normal allocation methods would deny it, that program can add PAE code (working with the OS).


All newish linux and windows versions support PAE, at least technically; windows in particular may disable it for you. Everything since the last generation or two of 32-bit hardware, and all 64-bit processors, can be assumed to support PAE -- to some degree. However:

  • The CPU isn't the only thing that accesses memory. Although many descriptions I've read seem kludgey, I easily believe that any device driver that does DMA and is not aware of PAE may break things -- such drivers are broken in that they are not PAE-aware: they do not know that the 64-bit pointers used internally should be limited to 36-bit use.
  • PAE was disabled in WinXP's SP2 to increase stability related to such issues, while server windowses are less likely to have problems since they tend to use more standard hardware and thereby drivers.

Kernel/user split

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The kernel/user split, specific to 32-bit OSes, refers to an OS-enforced formalism splitting the mappable process space between kernel and each process.


It looks like windows by default gives 2GB to both, while (modern) linuces apparently split into 1GB kernel, 3GB application by default (which is apparently rather tight on AGP and a few other things).

(Note: '3GB for apps' means that any single process is limited to map 3GB. Multiple processes may sum up to whatever space you have free.)


In practice you may want to shift the split, particularly in Windows since almost everything that would want >2GB memory runs in user space - mostly databases. The exception is Terminal Services (Remote Desktop), that seems to be kernel space.

It seems that:

  • linuxes tend to allow 1/3, 2/2 and 3/1,
  • BSDs allow the split to be set to whatever you want(verify).
  • It seems(verify) windows can only shift its default 2/2 split to 1GB kernel, 3GB application, using the /3GB boot option (the feature is somewhat confusingly called 4GT), but windows applications are normally compiled with the 2/2 assumption and will not be helped unless coded for it. Exceptions seem to primarily include database servers.
  • You may be able to work around it with a 4G/4G split patch, combined with PAE - with some overhead.

See also


Memory card types

These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.

See also http://en.wikipedia.org/wiki/Comparison_of_memory_cards


Secure Digital (SD, miniSD, microSD)

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
On size

There's SD, MiniSD and microSD: pin-compatible, but in varied physical sizes.

MiniSD never really took off, though, so it's mostly SD and microSD.

MicroSD is often seen in smartphones, SD and miniSD regularly in cameras, basic SD in a number of laptops.

Adapters to larger sizes exist (which are no more than plastic to hold the smaller card and wires to connect it). Card reader devices tend to have just the SD slot, assuming you have an adapter for smaller sizes to basic SD.


MicroSD was previously called TransFlash, and some areas of the world still prefer that name.



Families
  • SD (now also SDSC, standard capacity)
    • size artificially limited to 1-4GB
  • SDHC (high capacity)
    • physically identical but conforming to a new standard that allows for higher capacity and speed.
    • addressing limited to 32GB
  • SDXC (eXtended-Capacity )
    • successor to SDHC (2009) that allows for higher capacity and speed
    • addressing limited to 2TB
  • Ultra-Capacity (SDUC)
  • SDIO
    • allows more arbitrary communication, basically a way to plug in specific accessories, at least on supporting hosts - not really an arbitrarily usable bus for consumers
    • (supports devices like GPS, wired and wireless networking)


The above is more about capacity (and function), which isn't entirely aligned with SD versions.

Which means that protocolwise it's even more interesting - I at least have lost track.









Speed rating

There are two gotchas to speed ratings:

  • due to the nature of flash, it will read faster than it will write.
how much faster/slower depends, but it's easily a factor 2
if marketers can get away with it, they will specify the read speed
note that the differences vary, due to differences in controllers - e.g. external card readers tend to be cheap shit, though there are exceptions
  • writes can be faster in short bursts.
You usually care about sustained average write instead


Class

Should specify minimum sustained write speed.

  • Class 0 doesn't specify performance
  • Class 2 means ≥2 MB/s
  • Class 4 means ≥4 MB/s
  • Class 6 means ≥6 MB/s
  • Class 10 means ≥10 MB/s

It seems that in practice, this rating is a little approximate, mostly varying with honesty. A good Class 6 card may well perform better than a bad Class 10 one.


These days, most SDs can sustain 10MB/s writes (and not necessarily that much more), so the class system is no longer a useful indication of actual speed.


x rating

Units of 0.15MByte/sec.

Apparently not required to be sustained speed or write speed.(verify)

For some numeric idea:

  • 13x is roughly equivalent to class 2
  • 40x is roughly equivalent to class 6
  • 66x is roughly equivalent to class 10
  • 300x is ~45MB/s
  • 666x is ~100MB/s

MultiMediaCard (MMC)

The predecessor for SD, and regularly compatible in that SD hosts tend to also support MMC cards.


CompactFlash (CF)

Type I:

  • 3.3mm thick

Type II:

  • 5mm thick


Memory Stick (Duo, Pro, Micro (M2), etc.)

xD

A few types, sizes up to 512MB and 2GB varying with them.

Apparently quite similar to SmartMedia


SmartMedia (SM)

  • Very thin (for its length/width)
  • capacity limited to 128MB (...so not seen much anymore)


On fake flash

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Fake flash refers to a scam where a card's controller reports a larger size than there is actual storage.


These seem to come in roughly two variants:

  • addressing storage that isn't there will fail,
  • or it will wrap back on itself and write in existing areas.


In both cases it will seem to work for a little while, and in both cases it will corrupt later. Exactly how and when depends on the type, and also on how it was formatted (FAT32 is likelier to fail a little sooner than NTFS due to where it places important filesystem data)


There are some tools to detect fake flash. You can e.g. read out what flash memory chips are in there and check whether that adds up - scammers don't go so far as to fake this.

But the more thorough check is a write-and-verify test, see below.

Memory card health

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

While memory cards and USB sticks are flash memory, most lack the wear leveling and health introspection that e.g. SSDs have.

So generally they will work perfectly, until they start misbehaving, and without warning.


Things like windows chkdsk will by default only check the filesystem structure. While such tools can often be made to read the whole disk surface (for chkdsk it's the "scan for and attempt recovery of bad sectors" checkbox), this only guarantees that everything currently on there can be read.

This is a decent check for overall failure, but not a check of whether any of it is entirely worn (and will not take new data), and also not a test for fake flash.


The better test is writing data to all blocks, then checking back its contents. Depending a little on how this is done, this is either destructive, or only checks free space.


Yes, you can do these checks relatively manually, but it's a little finicky to do right.
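The write-and-verify idea itself is simple; a rough sketch follows (the path and size are placeholders, and a real test must also defeat the OS cache, e.g. by remounting or ejecting the card between the write and the read-back):

 /* A sketch of write-and-verify: fill a file with a deterministic pattern,
    read it back, compare. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <stdint.h>
 
 #define BLOCK (1024 * 1024)
 
 static void fill(uint8_t *buf, size_t n, uint64_t seed) {
     for (size_t i = 0; i < n; i++)
         buf[i] = (uint8_t)(((seed + i) * 2654435761u) >> 24);   /* cheap deterministic pattern */
 }
 
 int main(void) {
     const char *path = "/mnt/sdcard/testfile.bin";   /* placeholder path */
     size_t blocks = 256;                             /* placeholder size: 256 MB */
     uint8_t *w = malloc(BLOCK), *r = malloc(BLOCK);
     if (!w || !r) return 1;
     FILE *f = fopen(path, "wb");
     if (!f) return 1;
     for (size_t b = 0; b < blocks; b++) {            /* write phase */
         fill(w, BLOCK, b);
         if (fwrite(w, 1, BLOCK, f) != BLOCK) return 1;
     }
     fclose(f);
     f = fopen(path, "rb");
     if (!f) return 1;
     for (size_t b = 0; b < blocks; b++) {            /* read-back and compare */
         fill(w, BLOCK, b);
         if (fread(r, 1, BLOCK, f) != BLOCK || memcmp(w, r, BLOCK) != 0) {
             printf("mismatch in block %zu - possible fake or failing flash\n", b);
             return 1;
         }
     }
     fclose(f);
     printf("all %zu blocks verified\n", blocks);
     return 0;
 }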


One useful tool is H2testw, which creates a file in free space (if the card is empty, that's almost all of it), tells you write and read speed, and implicitly acts as a fake flash test.