Some understanding of memory hardware





"What Every Programmer Should Know About Memory" is a good overview of memory architectures, RAM types, and the reasons bandwidth and access speeds vary.


RAM types

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


DRAM - Dynamic RAM

lower component count per cell than most (transistor+capacitor mainly), so high-density and cheaper per storage size
yet capacitor leakage means it forgets its state, so this has to be refreshed regularly,
also meaning you need a DRAM controller, more complexity (not something you'd DIY), and higher latency than some
(...some latency is less of an issue when you have multiple chips)
this or a variant is typical as main RAM, due to low cost per bit


SDRAM - Synchronous DRAM - is mostly a practical design consideration

...that of coordinating the DRAM via an external clock signal (previous DRAM was asynchronous, manipulating state as soon as lines changed)
This allows the interface to that RAM to be a predictable state machine, which allows easier buffering, and easier interleaving of internal banks
which makes higher data rates a bunch simpler (though not necessarily lower latency)
SDR/DDR:
DDR doubled the bus rate by widening the (minimum) unit read or written internally (a 2n prefetch, double that of SDR), which it can do from a single DRAM bank(verify)
similarly, DDR2 fetches units 4x larger than SDR (4n prefetch) and DDR3 units 8x larger than SDR (8n prefetch)
DDR4 keeps the same prefetch width as DDR3, instead doubling the bus rate by interleaving between bank groups
none of this lowers latency; the bus frequency increased over time, but the access time of the cells themselves changed little.



SRAM - Static RAM

Has a higher component count per cell (6 transistors) than e.g. DRAM
Retains state as long as power is applied to the chip, no need for refresh, also making it a little lower-latency
no external controller, so simpler to use
the higher component count per cell makes it more expensive per storage size
e.g. used in caches, due to speed, and acceptable cost for smaller amounts


PSRAM - PseudoStatic RAM

A design tradeoff, somewhere between SRAM and DRAM
it's like DRAM with built-in refresh, so functionally it's as "don't think about it" as SRAM
(yes, DRAM technically can have built-in refresh, but that often refers to a sleep mode that retains state without requiring an active DRAM controller, not something for active use)
it's slower than DRAM, and cheaper than SRAM
SRAM makes sense for internal RAM, PSRAM makes sense for extended RAM in situations DRAM is not necessary


Non-volatile RAM

While the concept of Random Access Memory (RAM) only tells you that you can access any part of it with comparable ease (contrasted with e.g. tape storage, where more distance meant more time, so more storage meant more time)...

...we tend to think about RAM as volatile, only useful as an intermediate scratchpad between storage and use, and will lose its contents as soon as it is unpowered. Probably because the commonly chosen designs have that property.


Yet there are various designs that are both easily accessible and keep their state.

And there is a gliding scale of various properties in that area as well.


We may well call it NVM (non-volatile memory) when we haven't yet gotten to some more specific properties, like how often we may read or write, or how difficult that is.

Say, some variants of EEPROM aren't the easiest to deal with. We like Flash more, even though it's basically a development from EEPROM. But both wear out.

NVRAM on the other hand tends to be easier and more reusable, like FRAM, MRAM, and PRAM, or nvSRAM or even BBSRAM.


nvSRAM - SRAM and EEPROM stuck on the same chip.

seems intended as a practical improvement on BBSRAM
and/or an "access common stuff quickly, occasionally write a chunk to EEPROM" style of data logging, black boxes, that sort of thing
https://en.wikipedia.org/wiki/NvSRAM


BBSRAM - Battery Backed SRAM

basically just SRAM alongside a lithium battery, so that it'll live a good while
is sort of cheating, but usefully so.


FRAM - Ferroelectric RAM

functions more like flash, also limited in amount of use (but with many more cycles)
read process is destructive (like e.g. DRAM), so you need a write-after-read to keep data around
so it's great for things like round-robin logging (which would be pretty bad for Flash)
https://electronics.stackexchange.com/questions/58297/whats-the-catch-with-fram
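
As a rough illustration of the round-robin logging mentioned above: a fixed number of slots gets overwritten in a circle, so every slot is rewritten again and again. That is fine for memory with a very high write endurance, and unkind to Flash. The class and attribute names below are made up for this sketch, and a plain Python list stands in for the memory.

```python
class RingLog:
    """Fixed-size circular log; the oldest record is overwritten first."""

    def __init__(self, slots: int):
        self.records = [None] * slots        # stand-in for the memory cells
        self.writes_per_slot = [0] * slots   # how often each slot was rewritten
        self.next_slot = 0

    def append(self, record) -> None:
        self.records[self.next_slot] = record
        self.writes_per_slot[self.next_slot] += 1
        # Wrap around to the start once the end is reached.
        self.next_slot = (self.next_slot + 1) % len(self.records)


log = RingLog(slots=4)
for i in range(10):
    log.append(f"event {i}")

print(log.records)          # ['event 8', 'event 9', 'event 6', 'event 7']
print(log.writes_per_slot)  # [3, 3, 2, 2]
```

The write count per slot grows with total log volume divided by the number of slots, which is why the endurance of the underlying memory matters for this pattern.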


PRAM

https://en.wikipedia.org/wiki/Phase-change_memory



DRAM stick types



ECC RAM ('Error correction code')

can detect many (and correct some) hardware errors in RAM
The rate of bit-flips is low, but they will happen. If your computations or data are very important to you, you want ECC rather than the regular, non-ECC type.
See also:
http://en.wikipedia.org/wiki/ECC_memory
DRAM Errors in the Wild: A Large-Scale Field Study
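
To give a feel for how an error-correcting code can fix a flipped bit, here is a toy Hamming(7,4) sketch. This is not what ECC modules actually run: they typically use a wider code (commonly 8 check bits per 64 data bits) that also detects double-bit errors. The function names are made up for illustration.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the fixed bit, or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1          # flip the bad bit back
    return c, position


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                        # simulate a bit flip at position 5
fixed, position = hamming74_correct(damaged)
print(position, hamming74_data(fixed))  # 5 [1, 0, 1, 1]
```

With only three check bits this code corrects any single flipped bit but would miscorrect a double flip, which is where the "correct some, detect many" distinction comes from.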


Registered RAM (sometimes buffered RAM) basically places a buffer on the DRAM modules (register as in hardware register)

offloads some electrical load from the main controller onto these buffers, making it easier for designs to stably connect more individual memory sticks/chips.
...at a small latency hit
typical in servers, because they can accept more sticks
Must be supported by the memory controller, which means it is a motherboard design choice to go for registered RAM or not
pricier (more electronics, fewer units sold)
because of this correlation with server use, most registered RAM is specifically registered ECC RAM
yet there is also unregistered ECC, and registered non-ECC, which can be good options on specific designs of simpler servers and beefy workstations.
sometimes called RDIMM -- in the same context UDIMM is used to refer to unbuffered
https://en.wikipedia.org/wiki/Registered_memory

FB-DIMM, Fully Buffered DIMM

same intent as registered RAM - more stable sticks on one controller
the buffer is now between stick and controller [1] rather than on the stick
physically different pinout/notching


SO-DIMM (Small Outline DIMM)

Physically more compact. Used in laptops, some networking hardware, some Mini-ITX


EPP and XMP (Enhanced Performance Profile, Extreme Memory Profiles)

basically, one-click overclocking for RAM, by storing overclocked timing profiles
so you can configure faster timings (and Vdimm and such) according to the modules, rather than your trial and error
normally, memory timing is configured according to a table in the SPD, which are JEDEC-approved ratings and typically conservative.
EPP and XMP basically means running them as fast as they could go (and typically higher voltage)


In any case, the type of memory must be supported by the memory controller

DDR generation (DDR2/3/4) - a stick of a different generation physically won't fit the socket
Note that while some controllers (e.g. those in CPUs) support two generations, a motherboard will typically have just one type of memory socket
registered or not
ECC or not

Historically, RAM controllers were a thing on the motherboard near the CPU, while there are now various cases where the controller is on the CPU.

More on...

DRAM versus SRAM



Separately, capacitors slowly leak charge anyway (related to closeby cells, related to their bulk addressing, and note that the higher the memory density, the smaller the capacitor so the sooner this all happens), so DRAM only makes sense with refresh: when there is something going through reading every cell and writing it back, just to keep the state over time.

The DRAM controller will refresh each DRAM row within (typically) 64ms, and there are on the order of thousands to tens of thousands of rows in a DRAM chip.
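
To put numbers on that, a small sketch of the arithmetic. The 64 ms window is from the text above; the 8192 refresh operations per window and the 350 ns per refresh operation are assumed example values (real figures depend on the chip), and the function names are made up.

```python
def refresh_interval_us(window_ms: float = 64.0, rows: int = 8192) -> float:
    """Average time between two refresh operations, in microseconds."""
    return window_ms * 1000.0 / rows


def refresh_overhead(window_ms: float, rows: int, refresh_time_ns: float) -> float:
    """Fraction of time the chip is busy refreshing instead of serving accesses."""
    interval_ns = window_ms * 1_000_000.0 / rows
    return refresh_time_ns / interval_ns


print(refresh_interval_us())                  # 7.8125 (microseconds)
# 350 ns per refresh operation is a hypothetical figure for illustration.
print(f"{refresh_overhead(64.0, 8192, 350.0):.1%}")
```

So under these assumptions a refresh is due roughly every 8 microseconds, and an access that arrives during one has to wait it out.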


Yes, this means you randomly incur some extra latency.

Larger chips effectively have longer refresh overhead.

Each chip is slower-than-ideal, which can be made irrelevant by having the same amount of RAM in more chips on a memory stick. (Seems to also be part of why servers often have more slots(verify))


It also means DRAM will require more power than most others, even when it's not being used.


With all these footnotes, DRAM seems clunky, so why use it?

Mainly because it's rather cheaper per bit (even with economy of scale in production), and as mentioned, you can alleviate the performance part fairly easily.



The first thing you'd compare DRAM to is often SRAM (Static RAM), or some variant of it.

SRAM cells are more complex per bit, but don't need refresh, are fundamentally lower-latency than DRAM, and take less power when idle.

(with some variation; lower-speed SRAM can be low power, whereas at high speeds and heavy use, power can be comparable to DRAM)


The main downside is that due to their complexity, they are lower density and cost more silicon (and therefore money) per bit.

There are a lot of high-speed cases, or devices, where a little SRAM makes a lot of sense, like network switches, and also L1, L2, and L3 caches in your computer.


SRAM is electrically easier to access (also means you need less of a separate controller), so simple microcontrollers may prefer it, also because it's easier to embed on the same IC.


Since SRAM uses noticeably more silicon than DRAM per cell, SRAM is often under a few hundred kilobytes - in part because you'd probably use SRAM for important bits, alongside DRAM for bulkier storage.



Pseudostatic RAM (PSRAM, a.k.a. PSDRAM) is an IC that contains both DRAM and a controller, so it has DRAM speeds, but is as easy to use as SRAM, at a price somewhere in between.


There are even variants that are basically DRAM with an SRAM cache in front so that well controlled access patterns can be quite fast.


More DRAM notes:

For a few reasons (including that there are a lot of bits in the address, to save dozens of pins as well as silicon on internal demultiplexing), DRAM is typically laid out as a grid, and the address is essentially sent in two parts, the row and the column, sent one after the other.

This is what RAS and CAS are about - the first is a strobe that signals the row address can be used, the second that the column can be.
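
A minimal sketch of that row/column split, assuming a toy chip with 10 column bits. Real controllers also fold bank, rank and channel bits into the mapping, which is ignored here; the function name is made up.

```python
def split_address(cell_index: int, column_bits: int = 10):
    """Split a flat cell index into the (row, column) halves sent over the same pins."""
    row = cell_index >> column_bits                  # sent first, latched on RAS
    column = cell_index & ((1 << column_bits) - 1)   # sent second, latched on CAS
    return row, column


print(split_address(0x12345))   # (72, 837)

# Pin saving: a 2**24-cell array needs 24 address pins if addressed flat,
# but only 12 if the row and column halves share the same pins.
flat_pins = 24
multiplexed_pins = flat_pins // 2
print(flat_pins, multiplexed_pins)   # 24 12
```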

And, because capacitors are not instant, there needs to be some time between RAS and CAS, and between CAS and data coming out. This, and other details (e.g. precharge) are a property of the particular hardware, and should be adhered to to be used reliably.
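
These timings are specified in clock cycles, so comparing them across modules means converting to time first. A sketch of that conversion for the CAS latency, using DDR3-1600 CL9 and DDR4-3200 CL16 as assumed example modules (they are not from the notes above); the function name is made up.

```python
def cas_latency_ns(cl_cycles: int, mega_transfers_per_s: float) -> float:
    """Convert a CAS latency in bus clock cycles to nanoseconds."""
    # DDR is double pumped, so the bus clock is half the transfer rate.
    bus_clock_mhz = mega_transfers_per_s / 2
    return cl_cycles / bus_clock_mhz * 1000.0


# Assumed example modules, for illustration only.
print(round(cas_latency_ns(9, 1600), 2))    # DDR3-1600 CL9:  11.25
print(round(cas_latency_ns(16, 3200), 2))   # DDR4-3200 CL16: 10.0
```

The cycle count nearly doubles between these two, but so does the clock, so the latency in nanoseconds barely moves. That is the "higher data rates, not necessarily lower latency" point in numbers.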

Setting these parameters by hand would be annoying, so on DDR DRAM sticks there is a small chip[2] that tells the BIOS the timing options.


DRAM is also so dense that it has led to some electrical issues, e.g. the row hammer exploit.



Because you still spend quite a bit of time on addressing, before the somewhat-faster readout of data, a lot of DRAM systems do prefetch/burst (what you'd call readahead in disks).

That is, instead of fetching a single cell, it fetches (burst_length*bus_width). The burst_length is apparently linked to DDR type; for DDR3 and DDR4 it is 8 transfers on a 64-bit bus, so 64 bytes (also because that's a common CPU cache line size).

This is essentially a forced locality assumption, but it's relatively cheap and frequently useful.
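
The burst arithmetic can be sketched as follows, assuming a burst length of 8 on a 64-bit bus. The function names are made up for illustration.

```python
def burst_bytes(burst_length: int = 8, bus_width_bits: int = 64) -> int:
    """Bytes moved by one burst: burst_length transfers of one bus width each."""
    return burst_length * bus_width_bits // 8


def bursts_touched(address: int, size: int, burst: int = 64) -> int:
    """How many aligned bursts a read of `size` bytes at `address` needs."""
    first = address // burst
    last = (address + size - 1) // burst
    return last - first + 1


print(burst_bytes())           # 64
print(bursts_touched(0, 1))    # 1: a single byte still costs a whole 64-byte burst
print(bursts_touched(60, 8))   # 2: an unaligned read straddles two bursts
```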



"RAS Mode"

lockstep mode, 1:1 ratio to DRAM clock
more reliable
independent channel mode, a.k.a. performance mode, 2:1 to DRAM clock
more throughput
also allows more total DIMMs (if your motherboard is populated with them)
mirror - seems to actually refer to memory mirroring.

Note this is about the channels, not the DRAM.

https://www.dell.com/support/article/nl/nl/nlbsdt1/sln155709/memory-modes-in-dual-processor-11th-generation-poweredge-servers?lang=en#Optimizer

https://software.intel.com/en-us/blogs/2014/07/11/independent-channel-vs-lockstep-mode-drive-you-memory-faster-or-safer




In PCs, the evolution from SDR SDRAM to DDR SDRAM to DDR2 SDRAM to DDR3 SDRAM is a fairly simple one.

SDR ():

  • single pumped (one transfer per clocktick)
  • 64-bit bus
  • speed is 8 bytes per transfer * memory bus rate

DDR (1998):

  • double pumped (two transfers per clocktick, using both the rising and falling edge)
  • 64-bit bus
  • speed is 8 bytes per transfer * 2 * memory bus rate

DDR2 (~2003):

  • double pumped
  • 64-bit bus (verify)
  • effective bus to memory is clocked at twice the memory speed
  • No latency reduction over DDR (at the same speed) (verify)
  • speed is 8 bytes per transfer * 2 * 2 * memory bus rate

DDR3 (~2007):

  • double pumped
  • 64-bit bus (verify)
  • effective bus to memory is clocked at four times the memory speed
  • No latency reduction over DDR2 (at the same speed) (verify)
  • speed is 8 bytes per transfer * 2 * 4 * memory bus rate

DDR4 (~2014)

  • double pumped

DDR5 (~2020)

Each generation also lowers voltage and thereby power (per byte).
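
The per-generation speed formulas listed above (SDR through DDR3) can be written out as a small calculation. The table and function names are made up for this sketch; the factors are the ones from the lists above, and the 8-byte (64-bit) bus is the one assumed there.

```python
# Transfers per bus clock tick (1 = single pumped, 2 = double pumped).
PUMPING = {"SDR": 1, "DDR": 2, "DDR2": 2, "DDR3": 2}
# How much faster the external bus is clocked than the memory cells.
BUS_MULTIPLIER = {"SDR": 1, "DDR": 1, "DDR2": 2, "DDR3": 4}


def peak_mb_per_s(generation: str, memory_clock_mhz: int, bus_bytes: int = 8) -> int:
    """Theoretical peak transfer rate in MB/s for one 64-bit channel."""
    return (bus_bytes * PUMPING[generation]
            * BUS_MULTIPLIER[generation] * memory_clock_mhz)


for generation in ("SDR", "DDR", "DDR2", "DDR3"):
    print(generation, peak_mb_per_s(generation, 200))
# SDR 1600
# DDR 3200
# DDR2 6400
# DDR3 12800
```

At the same 200 MHz memory clock, each generation doubles the peak rate without the cells themselves getting any faster. These are peak figures; real throughput depends on access patterns.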


(Note: Quad pumping exists, but is only really used in CPUs)


The point of clocking the memory bus higher than the speed of individual memory cells is that as long as you are accessing data from two distinctly accessed cells, you can send both on the faster external (memory) bus. (verify)

It won't be twice, but depending on access patterns might sometimes get close(verify).


Dual channel memory is different yet - it refers to using an additional 64-bit bus to memory in addition to the first 64 bits, so that you can theoretically transfer at twice the speed. The effect this has on everyday usage depends a lot on what that use is, though. It seems that even in the average case the improvement is not very noticeable.


In DDR2, the external bus is clocked at twice the rate of the memory cells and is double pumped, so four bits of data can be transferred per memory cell cycle. Thus, without changing the memory cells themselves, DDR2 can effectively operate at twice the data rate of DDR.


Within a type (within SDR, within DDR, within DDR2, etc.), the different speeds do not point to a different design. Like with CPUs, it just means that the memory will work at that speed. Cheap memory may fail if clocked even just a little higher, while much more tolerant memory also exists, which is interesting for overclockers.

Note that the bus speed a particular piece of memory will work under depends on how c



Transfers per clocktick:

  • 1 for SDR/basic SDRAM
  • 2 for DDR SDRAM
  • 4 for DDR2 SDRAM
  • 8 for DDR3 SDRAM
  • DDR4
  • DDR5


SDRAM was available as:

66 MHz
100 MHz
133 MHz

DDR used to have some commonly used aliases, e.g.

alias     standard name   speed
PC-1600   DDR-200         100MHz, 200Mtransfers/s, peak 1.6GB/s
PC-2100   DDR-266         133MHz, 266Mtransfers/s, peak 2.1GB/s
PC-2700   DDR-333         166MHz, 333Mtransfers/s, peak 2.7GB/s
PC-3200   DDR-400         200MHz, 400Mtransfers/s, peak 3.2GB/s

As of right now (late 2008), DDR3 is not worth it money/performance-wise, but DDR2 is interesting over DDR.

PC2-3200  DDR2-400        100MHz, 400Mtransfers/s, peak 3.2GB/s
PC2-4200  DDR2-533        133MHz, 533Mtransfers/s, peak 4.2GB/s
PC2-5400  DDR2-667        166MHz, 667Mtransfers/s, peak 5.4GB/s
PC2-6400  DDR2-800        200MHz, 800Mtransfers/s, peak 6.4GB/s
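
The number in these aliases is just the peak rate in MB/s: transfers per second times 8 bytes per transfer on a 64-bit bus. A quick check against the tables above (function and variable names made up for this sketch):

```python
def peak_mb_per_s(mega_transfers_per_s: int, bus_bytes: int = 8) -> int:
    """Peak MB/s for a given transfer rate on a 64-bit (8-byte) bus."""
    return mega_transfers_per_s * bus_bytes


# (alias, Mtransfers/s) pairs taken from the tables above.
modules = [
    ("PC-1600", 200), ("PC-2100", 266), ("PC-2700", 333), ("PC-3200", 400),
    ("PC2-3200", 400), ("PC2-4200", 533), ("PC2-5400", 667), ("PC2-6400", 800),
]
for alias, rate in modules:
    print(f"{alias:9} {rate:4} MT/s -> {peak_mb_per_s(rate)} MB/s")
```

The results only match the alias exactly for the round rates (200, 400, 800 MT/s); for the others the marketed number is a loosely rounded version of the computed one.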


ECC

Buffered/registered RAM

EPROM, EEPROM, and variants

PROM is Programmable ROM

can be written exactly once

EPROM is Erasable Programmable ROM.

often implies UV-EPROM, erased with UV light shone through a quartz window. You would e.g. tape over that window to avoid accidental erasure later.

EEPROM's extra E means Electrically Erasable

meaning it's now a command, and not a window.
early EEPROM read, wrote, and erased (verify) a single byte at a time. Modern EEPROM can work in larger chunks.
you only get a limited number of erases (much like Flash, which is arguably just an evolution of EEPROM)


Flash memory (intro)


Flash, eMMC, UFS