On computer memory


CPU cache notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The basic idea behind CPU caches is to put a little fast-but-costly memory between the CPU and main RAM.

When reads from main memory happen to be mirrored in the cache, they can be served much faster. This has been worth it since CPUs ran at a dozen MHz or so.


These caches are entirely transparent, in that you should not have to care about how they do their thing.

As a programmer you may want a rough idea - and only rough, as optimizing for specific hardware is pointless a few years later, yet designing for caches in general can help speed for longer. So can avoiding caches getting flushed more than necessary, and avoiding cache contention - so it helps to know what those are, why they happen, and when.
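As a rough illustration of why access patterns matter, the sketch below sums the same array once sequentially and once with a large stride. In a lower-level language the difference is almost purely cache misses; in Python the interpreter overhead blunts the effect, so treat the numbers as indicative only. All names and sizes here are just example choices.

# cache_stride_demo.py - rough sketch of cache-friendly vs cache-hostile access.
# (CPython's per-iteration overhead dominates; the stride-dependent slowdown is
# usually still visible, but much less dramatic than in C.)
import array
import time

N = 1 << 24                          # 16M 8-byte ints, ~128MB, far larger than any CPU cache
data = array.array('q', range(N))

def sum_with_stride(arr, stride):
    """Sum every element, visiting them 'stride' apart (wrapping around per pass)."""
    total = 0
    for start in range(stride):                       # 'stride' interleaved passes...
        for i in range(start, len(arr), stride):      # ...each skipping through memory
            total += arr[i]
    return total

for stride in (1, 4096):
    t0 = time.perf_counter()
    s = sum_with_stride(data, stride)
    print("stride %5d: %.2fs (sum=%d)" % (stride, time.perf_counter() - t0, s))

# Sequential access (stride 1) streams through memory and lets the cache and prefetcher
# help; a 4096-element (32KB) stride touches a new cache line (and often a new page)
# on every access, so much more time is spent waiting on RAM.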


On virtual memory

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Intro

Overcommitting RAM with disk: Swapping / paging; thrashing

Page faults

See also

Swappiness

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Practical notes

Linux

"How large should my page/swap space be?"

On memory scarcity

oom_kill

oom_kill is Linux kernel code that starts killing processes when memory is so scarce that allocations cannot be satisfied within reasonable time - a good indication that things have gotten to the point of thrashing.


Killing sounds like a poor solution, but consider that an OS can deal with completely running out of memory in roughly three ways:

  • deny all memory allocations until the scarcity stops. This isn't very useful because:
    • it will affect every program until the scarcity stops
    • if the cause is one flaky program - and it usually is - then the scarcity probably isn't going to stop
    • programs that do not actually check every memory allocation will probably crash
    • programs that do such checks well may still have to decide to pause work, or stop completely
    So in the best case, random applications stop doing useful things, or crash. In the worst case, your system crashes.
  • delay memory allocations until they can be satisfied
    • this pauses all programs until the scarcity stops
    • again, there is often no reason for that scarcity to stop
    • so this typically means a large-scale system freeze (indistinguishable from a system crash in the practical sense of "it doesn't do anything")
  • kill the misbehaving application to end the memory scarcity
    • this works on the assumption that the system has had enough memory for normal operation up to now, and that there is probably one process that is misbehaving or just misconfigured (e.g. pre-allocates more memory than you have)
    • it assumes there is a single misbehaving process (not always true)
    • the victim is usually the process with the most allocated memory, though oom_kill tries to be smarter than that
    • this can misfire on badly configured systems (e.g. multiple daemons all configured to use all RAM, or having no swap, leaving nothing to catch incidental variation)


Keep in mind that

  • You may wish to disable oom_kill when you are developing -- or at least treat an oom_kill in your logs as a fatal bug in the software that caused it.
  • oom_kill isn't really a feature you ever want to rely on.
    It is meant to deal with pathological cases of misbehaviour - if it kills your useful processes, then you probably need to tweak either those processes or your system (a per-process sketch follows below).
  • oom_kill does not always save you.
    It seems that if the system is already thrashing heavily, it may not be able to act fast enough.
    (and it may go overboard later)
  • If you don't have oom_kill, you may still be able to get a reboot instead, by setting the following sysctls:
    vm.panic_on_oom=1
    kernel.panic=10
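Relatedly, Linux lets you bias the OOM killer per process via /proc/<pid>/oom_score_adj, which ranges from -1000 (never kill) to 1000 (kill first); lowering it below zero typically requires elevated privileges. A minimal sketch of a process volunteering itself as a preferred victim - the helper names are just for this example:

# oom_adj.py - bias the Linux OOM killer for the current process.
# /proc/<pid>/oom_score_adj ranges from -1000 (never kill) to 1000 (kill first).

def set_oom_score_adj(value, pid="self"):
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    with open("/proc/%s/oom_score_adj" % pid, "w") as f:
        f.write(str(value))

def get_oom_score(pid="self"):
    # The kernel's current 'badness' score for this process (higher = likelier victim).
    with open("/proc/%s/oom_score" % pid) as f:
        return int(f.read())

if __name__ == "__main__":
    set_oom_score_adj(500)             # volunteer this process as an early victim
    print("current oom_score:", get_oom_score())

Daemons that should survive memory pressure (or test runs that should be sacrificed first) can set this at startup instead of relying on the kernel's heuristics.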


See also



Glossary

On memory fragmentation

Slab allocation

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


The slab allocator manages caches of fixed-size objects.

Slab allocation is often used in kernel modules/drivers that need to allocate uniform-sized and potentially short-lived structures - think task structures, filesystem internals, network buffers. There may be a cache for each specific type.

There is also more arbitrary allocation from caches of fixed sizes like 4K, 8K, 32K, 64K, 128K, etc., used for things that have known bounds but not precise sizes.


Upsides:

  • each such cache is easy to handle
  • it avoids the fragmentation that the otherwise-typical buddy system still has, because all holes are of the same size
  • this makes slab allocation/free simpler, and thereby a little faster
  • it is easier to fit objects to hardware caches

Limits:

It still deals with the page allocator under the covers, so deallocation patterns can still mean that pages for the same cache become sparsely filled - which wastes space.


SLAB, SLOB, SLUB:

  • SLOB: K&R allocator (1991-1999), aims to allocate as compactly as possible. Fragments fairly quickly.
  • SLAB: Solaris type allocator (1999-2008), as cache-friendly as possible.
  • SLUB: Unqueued allocator (2008-today): Execution-time friendly, not always as cache friendly, does defragmentation (mostly just of pages with few objects)


For some indication of what's happening, look at slabtop and slabinfo (or read /proc/slabinfo directly).
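For a quick look without slabtop, you can parse /proc/slabinfo yourself. A sketch, assuming the common "slabinfo - version: 2.x" layout; note that reading this file usually requires root on modern kernels:

# top_slabs.py - show the slab caches using the most memory, from /proc/slabinfo.
# Assumed columns: name active_objs num_objs objsize objperslab pagesperslab : tunables ... : slabdata ...

def read_slabinfo(path="/proc/slabinfo"):
    caches = []
    with open(path) as f:
        for line in f:
            if line.startswith(("slabinfo", "#")):     # skip version and header lines
                continue
            fields = line.split()
            name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
            caches.append((name, num_objs * objsize, num_objs, objsize))
    return caches

if __name__ == "__main__":
    caches = sorted(read_slabinfo(), key=lambda c: c[1], reverse=True)
    print("%-30s %12s %10s %8s" % ("cache", "approx size", "objects", "objsize"))
    for name, size, num, objsize in caches[:15]:
        print("%-30s %10.1fMB %10d %8d" % (name, size / 1048576.0, num, objsize))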

See also:


There are some similar higher-level takes on "I will handle things of the same type" allocation, from custom allocators in C, to object allocators in certain languages, arguably even just the implementation of certain data structures.

Memory mapped IO and files

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Note that memory mapped IO is a hardware-level construction, while memory mapped files are a software construction (...because files are).


Memory mapped files

Memory mapping of files is a technique (OS feature, system call) that pretends a file is accessible at some address in memory. At first, no file data has been fetched at all; when the process accesses those memory locations, the OS fetches the actual contents from disk.

(and since this is handled in the OS's kernel, this interacts well with its filesystem cache).
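A minimal sketch of what this looks like through Python's mmap module - the file name is just an example. Only the pages you actually touch get read from disk:

# mmap_read.py - map a file and read a few bytes; data is paged in on access.
import mmap

with open("big_datafile.bin", "rb") as f:                  # example file name
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:   # 0 = map the whole file
        print(len(mm))          # size of the mapping = file size; nothing read yet
        print(mm[:16])          # touching these offsets makes the OS page them in
        print(mm[-16:])         # ...and only the pages involved, not the whole file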


For caching

Because of that interaction with the page cache, the data is and stays cached as long as there is RAM for it.


This can also save memory - without memory mapping, you'll often have to allocate memory for the data on the process's heap, while the OS also keeps a copy in the filesystem cache for a bit.

With mmapping, you avoid that duplication, which also means more data can be cached in the OS caches.


The fact that the OS can flush most or all of this data can be seen as a limitation or a feature - it's not always predictable, but it does mean you can deal with large data sets without having to think about very large allocations, and how those aren't nice to other apps.

Shared memory via memory mapped files

Most kernel implementations allow multiple processes to mmap the same file -- which effectively shares memory, and is probably one of the simplest ways to do that in a protected-mode system. (Some methods of inter-process communication work via mmapping.)

You still need some way of not clobbering each other's work, of course.

The implementation, limitations, and method of use varies per OS / kernel.

Often relies on demand paging to work.
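A minimal sketch of two processes sharing state through the same mapped file, using multiprocessing; the file name and message are just for this example, and real uses need proper synchronization (locks, semaphores), which this skips:

# mmap_share.py - parent and child communicate through the same memory-mapped file.
# No synchronization here beyond join(); real code needs locking of some kind.
import mmap
import multiprocessing

SIZE = 4096

def child(path):
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), SIZE) as mm:   # default access is shared read/write
            mm[:5] = b"hello"                     # visible to any process mapping this file

if __name__ == "__main__":
    path = "shared_region.bin"                    # example file name
    with open(path, "wb") as f:
        f.write(b"\0" * SIZE)                     # pre-size the file before mapping

    p = multiprocessing.Process(target=child, args=(path,))
    p.start()
    p.join()

    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), SIZE) as mm:
            print(mm[:5])                         # b'hello', written by the other process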


Memory mapped IO

Map devices into memory space (statically or dynamically), meaning that memory accesses to those areas are actually backed by IO accesses (...that you can typically also do directly).

This mapping is made and resolved at hardware level, and only works for DMA-capable devices (which is many of them).

It seems to often be done to have a simple generic interface (verify) - it means drivers and software can avoid many hardware-specific details.


See also:

Memory limits on 32-bit and 64-bit machines

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


tl;dr:

  • If you want to use significantly more than 4GB of RAM, you want a 64-bit OS.
  • ...and since that is now typical, most of the details below are irrelevant


TODO: the distinction between (effects from) physical and virtual memory addressing should be made clearer.


Overall factoids

OS-level and hardware-level details:

From the I want my processes to map as much as possible angle:

  • the amount of memory a single process could hope to map is limited by its pointer size, so ~4GB on a 32-bit OS, and lots (64 bits' worth) on a 64-bit OS.
    Technically this could be entirely about the OS, but in reality it is tied intimately to what the hardware natively does, because anything else would be slooow.
  • Most OS kernels have a split (for their own ease) that means the practical limit for a single 32-bit userspace process is lower - 3GB, 2GB, sometimes even 1GB
    this is mostly a pragmatic implementation detail from back when 32 megabytes was a lot of memory, and has been left over ever since


  • Each process is mapped to memory separately, so in theory you can host multiple 32-bit processes that together use more than 4GB
    ...even on 32-bit OSes: you can for example compile the 32-bit linux kernel to use up to 64GB this way
    a 32-bit OS can only do this through PAE, which has to be supported and enabled in the motherboard, and supported and enabled in the OS. (Note: disabled in Windows XP since SP2; see details below)
    Note: both 32-bit and 64-bit PAE-supporting motherboards may have somewhat strange limitations, e.g. in the amount of memory they will actually allow/support (mostly a problem in old, early PAE motherboards)
    and PAE was problematic anyway - it's a nasty hack in nature, and e.g. drivers had to support it. In the end it was mostly seen in servers, where the details were easier to oversee.
  • device memory maps take mappable memory away from within each process, which for 32-bit OSes would often mean that you couldn't use all of an installed 4GB



On 32-bit systems:

Process-level details:

  • No single 32-bit process can ever map more than 4GB as addresses are 32-bit byte-addressing things.
  • A process's address space has reserved parts, to map things like shared libraries, which means a single app can actually allocate less (often by at most a few hundred MBs) than what it can map(verify). Usually no more than ~3GB can be allocated, sometimes less.


On 64-bit systems:

  • none of the potentially annoying limitations that 32-bit systems have apply
(assuming you are using a 64-bit OS, and not a 32-bit OS on a 64-bit system).
  • The architecture lets you map 64-bit addresses -- or, in practice, more than you can currently physically put in any system.


On both 32-bit (PAE) and 64-bit systems:

  • Your motherboard may have assumptions/limitations that impose some lower limits than the theoretical one.
  • Some OSes may artificially impose limits (particularly the more basic versions of Vista seem to do this(verify))


Windows-specific limitations:

  • 32-bit Windows XP (since SP2) gives you no PAE memory benefits. You may still be using the PAE version of the kernel if you have DEP enabled (no-execute page protection) since that requires PAE to work(verify), but PAE's memory upsides are disabled (to avoid problems with certain buggy PAE-unaware drivers, possibly for other reasons)
  • 64-bit Windows XP: ?
  • The /3GB switch moves the user/kernel split, but for a single process to map more than 2GB it must be 3GB-aware
  • Vista: different versions have memory limits that seem to be purely artificial (8GB, 16GB, 32GB, etc.) (almost certainly market segmentation)

Longer story / more background information

A 32-bit machine implies memory addresses are 32-bit, as is the memory address bus to go along. It's more complex than that, but the net effect is still that you can ask for 2^32 bytes of memory at byte resolution, so technically you can access up to 4GB.


The 'but' you hear coming is that 4GB of address space doesn't mean 4GB of memory use.


The device hole (32-bit setup)

One of the reasons the limit actually lies lower is devices. The top of the 4GB memory space (usually directly under the 4GB position) is used to map devices.

If you have close to 4GB of memory, this means part of your memory is now effectively missing - it is not addressable by the CPU. The size of this hole depends on chipset, BIOS configuration, the actual devices, and more(verify).


The BIOS settles the memory address map at boot time(verify), and you can inspect the effective map (Device Manager in Windows, /proc/iomem in Linux) in case you want to know whether it's hardware actively using the space (the hungriest devices tend to be video cards - at the time, having two 768MB nVidia 8800s in SLI was one of the worst cases) or whether your motherboard just doesn't support more than, say, 3GB at all. Either can be the reason some people report seeing as little as 2.5GB of the 4GB they plugged in.


This problem goes away once you run a 64-bit OS on a 64-bit processor -- though there were some earlier motherboards that still had old-style addressing leftovers and hence some issues.


Note that the subset of these issues caused purely by limited address space on 32-bit systems could also be alleviated, using PAE:

PAE

It is very typical to use virtual memory systems. While the prime upside is probably the isolation of memory, the fact that a memory map is kept for each process also means that on 32-bit, each application has its own 4GB memory map without interfering with anything else (virtual mapping practice allowing).

Which means that while each process could use 4GB at the very best, if the OS could see more memory, it might map distinct 4GBs to each process so that collectively you can use more than 4GB (or just your full 4GB even with device holes).


Physical Address Extension is a memory mapping extension (not a hack, as some people think) that does roughly that. PAE needs specific OS support, but doesn't need to break the 32-bit model as applications see it.

It allowed mapping 32-bit virtual memory into the 36-bit hardware address space, which allows for 64GB (though most motherboards had a lower limit).
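On Linux you can check whether the CPU advertises PAE, and what the real physical address width is, from /proc/cpuinfo. A small sketch; the exact "address sizes" wording is what recent kernels print, so treat the parsing as approximate and Linux-specific:

# pae_check.py - report PAE support and physical address width from /proc/cpuinfo (Linux).
def cpuinfo_summary(path="/proc/cpuinfo"):
    flags, address_sizes = set(), None
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
            elif line.startswith("address sizes"):
                address_sizes = line.split(":", 1)[1].strip()
    return flags, address_sizes

if __name__ == "__main__":
    flags, address_sizes = cpuinfo_summary()
    print("PAE supported by CPU:", "pae" in flags)
    print("Address sizes:", address_sizes)   # e.g. "39 bits physical, 48 bits virtual"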


PAE implies some extra work on each memory operation, but because there's hardware support it only kicked a few percent off memory access speed.


All newish Linux and Windows versions support PAE, at least technically. However:

  • The CPU isn't the only thing that accesses memory. Although many descriptions I've read seem kludgey, I easily believe that any device driver that does DMA and is not aware of PAE may break things -- such drivers are broken in that they are not PAE-aware: they do not know that the 64-bit pointers used internally should be limited to 36-bit use.
  • PAE was disabled in WinXP's SP2 to increase stability related to such issues, while server Windowses are less likely to have problems since they tend to use more standard hardware and thereby drivers.

Kernel/user split

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The kernel/user split, specific to 32-bit OSes, refers to an OS-enforced formalism splitting the mappable process space between kernel and each process.


It looks like windows by default gives 2GB to both, while (modern) linuces apparently split into 1GB kernel, 3GB application by default (which is apparently rather tight on AGP and a few other things).

(Note: '3GB for apps' means that any single process is limited to map 3GB. Multiple processes may sum up to whatever space you have free.)


In practice you may want to shift the split, particularly in Windows since almost everything that would want >2GB memory runs in user space - mostly databases. The exception is Terminal Services (Remote Desktop), that seems to be kernel space.

It seems that:

  • linuxes tend to allow 1/3, 2/2 and 3/1,
  • BSDs allow the split to be set to whatever you want(verify).
  • It seems(verify) windows can only shift its default 2/2 split to 1GB kernel, 3GB application, using the /3GB boot option (the feature is somewhat confusingly called 4GT), but windows applications are normally compiled with the 2/2 assumption and will not be helped unless coded to take advantage of it. Exceptions seem to primarily include database servers.
  • You may be able to work around it with a 4G/4G split patch, combined with PAE - with some overhead.

See also



Some understanding of memory hardware

"What Every Programmer Should Know About Memory" is a good overview of memory architectures, RAM types, reasons bandwidth and access speeds vary.


DRAM and SRAM

EPROM, EEPROM, and variants

Flash memory

PRAM

Flash memory

Memory card types

These are primarily notes
It won't be complete in any sense.
It exists to contain fragments of useful information.

See also http://en.wikipedia.org/wiki/Comparison_of_memory_cards


Secure Digital (SD, miniSD, microSD)

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
On size

There's SD, MiniSD and microSD, which are pin-compatible but different sizes.

MiniSD never really took off, though, so it's mostly SD and microSD.

MicroSD is often seen in smartphones, SD and miniSD regularly in cameras, basic SD in a number of laptops.

Adapters to larger sizes exist (which are no more than plastic to hold the smaller card and wires to connect it). Card reader devices tend to have just the SD slot, assuming you have an adapter for smaller sizes to basic SD.


MicroSD was previously called TransFlash, and some areas of the world still prefer that name.



Families
  • SD (now also SDSC, standard capacity)
    • size artificially limited to 1-4GB
  • SDHC (high capacity)
    • physically identical but conforming to a new standard that allows for higher capacity and speed
    • addressing limited to 32GB
  • SDXC (eXtended Capacity)
    • successor to SDHC (2009) that allows for higher capacity and speed
    • addressing limited to 2TB
  • SDUC (Ultra Capacity)
  • SDIO
    • allows more arbitrary communication, basically a way to plug in specific accessories, at least on supporting hosts - not really an arbitrarily usable bus for consumers
    • (supports devices like GPS, wired and wireless networking)


The above is more about capacity (and function), which isn't entirely aligned with SD versions.

Which means that protocolwise it's even more interesting - I at least have lost track.

Speed rating

There are two gotchas to speed ratings:

  • due to the nature of flash, it will read faster than it will write.
how much faster/slower depends, but it's easily a factor 2
if marketers can get away with it, they will specify the read speed
note that the differences also vary with the controllers involved - e.g. external card readers tend to be cheap shit, and some are notably slow
  • writes can be faster in short bursts.
You usually care about sustained average write instead


Class

Should specify minimum sustained write speed.

  • Class 0 doesn't specify performance
  • Class 2 means ≥2 MB/s
  • Class 4 means ≥4 MB/s
  • Class 6 means ≥6 MB/s
  • Class 10 means ≥10 MB/s

It seems that in practice, this rating is a little approximate, mostly varying with honesty. A good Class 6 card may well perform better than a bad Class 10 one.


These days, most SDs can sustain 10MB/s writes (and not necessarily that much more), so the class system is no longer a useful indication of actual speed.


x rating

Units of 0.15MByte/sec.

Apparently not required to be sustained speed or write speed.(verify)

For some numeric idea:

  • 13x is roughly equivalent to class 2
  • 40x is roughly equivalent to class 6
  • 66x is roughly equivalent to class 10
  • 300x is ~45MB/s
  • 666x is ~100MB/s
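The conversion is just multiplication by 0.15 MB/s (the old CD-ROM 1x unit). A tiny helper to sanity-check a marketing number, assuming the rating refers to read speed, as it often does:

# x_rating.py - convert SD 'x' speed ratings to MB/s (1x = 0.15 MB/s).
def x_to_mbps(x_rating):
    return x_rating * 0.15

for x in (13, 40, 66, 300, 666):
    print("%4dx ~ %6.1f MB/s" % (x, x_to_mbps(x)))
# 13x ~ 2 MB/s, 66x ~ 10 MB/s, 666x ~ 100 MB/s - matching the class comparisons above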

MultiMediaCard (MMC)

The predecessor for SD, and regularly compatible in that SD hosts tend to also support MMC cards.


CompactFlash (CF)

Type I:

  • 3.3mm thick

Type II:

  • 5mm thick



Memory Stick (Duo, Pro, Micro (M2), etc.)

xD

A few types exist, with maximum sizes (between 512MB and 2GB) varying with them.

Apparently quite similar to SmartMedia


SmartMedia (SM)

  • Very thin (for its length/width)
  • capacity limited to 128MB (...so not seen much anymore)

On fake flash

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Fake flash refers to a scam where a card's controller reports a larger size than the storage that is actually present.


These seem to come in roughly two variants:

  • addressing storage that isn't there will simply fail, or
  • it will wrap back on itself and write over existing areas.


In both cases it will seem to work for a little while, and in both cases it will corrupt data later. Exactly how and when depends on the type, and also on how it was formatted (FAT32 is likely to fail a little sooner than NTFS due to where it places important filesystem data).


There are some tools to detect fake flash. You can e.g. read out what flash memory chips are in there and check whether that adds up - scammers don't go so far as to fake this.

But the more thorough check is a write-and-verify test, see below.

Memory card health

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

While memory cards and USB sticks are flash memory, most lack the wear leveling and health introspection that e.g. SSDs have.

So generally they will work perfectly, until they start misbehaving, and without warning.


Things like Windows chkdsk will by default only check the filesystem structure. While such tools can often be made to read the whole disk surface (for chkdsk it's the 'scan for and attempt recovery of bad sectors' checkbox), this only guarantees that everything currently on there can be read.

This is a decent check for overall failure, but not a check of whether any of it is entirely worn (and will not take new data), and also not a test for fake flash.


The better test is writing data to all blocks, then reading it back to check the contents. Depending a little on how this is done, this is either destructive, or only checks free space.


Yes, you can do these checks relatively manually, but it's a little finicky to do right.
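A rough sketch of such a write-and-verify pass over free space, in the spirit of H2testw (mentioned below). It fills free space with deterministic, position-dependent data, reads it back, and deletes the test files; because the data encodes its own position, wrap-around writes from fake flash show up as mismatches. The mount point and sizes are just example choices - and it will fill the free space of whatever you point it at, so be careful:

# flash_verify.py - fill free space with known data, read it back, report mismatches.
import os
import errno

MOUNT = "/media/sdcard"            # example mount point - change to your card
CHUNK = 1 << 20                    # 1 MiB
FILE_CHUNKS = 256                  # 256 MiB per test file

def pattern(file_no, chunk_no):
    # Deterministic, position-dependent filler so wrap-around writes are detected.
    tag = b"%08d:%08d|" % (file_no, chunk_no)
    return (tag * (CHUNK // len(tag) + 1))[:CHUNK]

def fill(mount):
    written = []
    file_no = 0
    while True:
        path = os.path.join(mount, "fill_%05d.bin" % file_no)
        try:
            with open(path, "wb") as f:
                for chunk_no in range(FILE_CHUNKS):
                    f.write(pattern(file_no, chunk_no))
                    f.flush()
                    os.fsync(f.fileno())       # force it onto the card, not just the cache
        except OSError as e:
            if e.errno != errno.ENOSPC:        # "disk full" ends the filling phase
                raise
            written.append(path)
            return written
        written.append(path)
        file_no += 1

def verify(paths):
    bad = 0
    for file_no, path in enumerate(paths):
        with open(path, "rb") as f:
            for chunk_no in range(FILE_CHUNKS):
                data = f.read(CHUNK)
                if not data:
                    break
                if data != pattern(file_no, chunk_no)[:len(data)]:
                    bad += 1
    return bad

if __name__ == "__main__":
    paths = fill(MOUNT)
    print("bad chunks:", verify(paths))        # nonzero suggests fake or worn flash
    for p in paths:
        os.remove(p)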


One useful tool is H2testw, which creates files in free space (if the card is empty, that's almost all of it).

It will also tell you write and read speed.

And implicitly be a fake flash test.