Copy-on-write: Difference between revisions

From Helpful
m (Redirected page to Virtual memory#Copy on write)
<!--
#redirect [[Virtual_memory#Copy_on_write]]
 
Copy on write allows multiple ''logical''ly distinct chunks to be backed by the same ''physical'' chunk.
 
 
It's something you can do ''transparently'' when both allocation and access involve a layer of indirection that knows about this.
: ...because that's the only way it won't be easily subverted. In particular, it means that a write won't accidentally alter data underlying multiple copies.
: when a write happens, the chunk is un-shared, a.k.a. gets its own physical copy (hence the name copy-on-write), in a just-in-time way
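
As a rough illustration of that layer of indirection, here is a minimal sketch in Python. The names (SharedBlock, CowBuffer, shares_storage_with) are made up for this example and do not come from any library; it only shows the bookkeeping, and ignores thread safety.

```python
class SharedBlock:
    """A physical chunk, plus a count of the logical chunks pointing at it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class CowBuffer:
    """A logical chunk. All reads and writes go through this indirection."""

    def __init__(self, data=b""):
        self._block = SharedBlock(data)

    def copy(self):
        # Cheap copy: the new logical chunk points at the same physical chunk.
        other = CowBuffer.__new__(CowBuffer)
        other._block = self._block
        self._block.refs += 1
        return other

    def read(self):
        return bytes(self._block.data)

    def write(self, offset, payload):
        if self._block.refs > 1:
            # Un-share just in time: this writer gets its own physical copy,
            # so the other logical chunks never see the change.
            self._block.refs -= 1
            self._block = SharedBlock(self._block.data)
        self._block.data[offset:offset + len(payload)] = payload

    def shares_storage_with(self, other):
        return self._block is other._block


a = CowBuffer(b"hello world")
b = a.copy()
shared_before = a.shares_storage_with(b)  # still one physical chunk
b.write(0, b"J")                          # first write triggers the copy
shared_after = a.shares_storage_with(b)   # now two physical chunks
```

Note that this only holds up because nothing can reach the bytearray except through read() and write(), which is the 'not easily subverted' point above.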
 
 
 
Copy-on-write often banks on the idea that writes will be rare for a lot of data,
meaning you can transparently allocate a lot less storage,
potentially
: saving storage cost,
: saving some up-front work (e.g. a linux fork() doesn't have to copy all allocated memory)
: saving time when making the original copy.
: having more fast storage (e.g. RAM) to use in general
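
The fork() case can be poked at from Python on a Unix-like system. This is a sketch, with fork_demo being a made-up name. The copy-on-write itself is invisible to the program; all you can observe is that parent and child behave as if each had a full copy, even though the kernel only copies the pages that get written to.

```python
import os


def fork_demo():
    data = bytearray(b"original")
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # the child starts with a copy-on-write view of our memory
    if pid == 0:
        # Child: this write lands in the child's own copy of the page.
        os.close(read_fd)
        data[:] = b"modified"
        os.write(write_fd, bytes(data))
        os._exit(0)
    # Parent: collect what the child saw, then look at our own data.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view


parent_view, child_view = fork_demo()
```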
 
 
In linux there's even a daemon (ksmd[https://www.kernel.org/doc/Documentation/vm/ksm.txt][https://www.kernel.org/doc/html/v4.19/admin-guide/mm/ksm.html], actually a kernel thread, initially made for VMs)
that goes looking through program memory (only where the program has set MADV_MERGEABLE via madvise())
for identical pages, to merge them into a single copy-on-write page.
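
Opting in could look roughly like the following sketch, assuming Linux and Python 3.8 or later (which exposes madvise() on mmap objects). The function name make_mergeable_region is made up. Whether anything actually gets merged depends on the kernel being built with KSM and on ksmd being switched on (via /sys/kernel/mm/ksm/run), so the code treats a failing madvise() as 'not available' rather than as an error.

```python
import mmap


def make_mergeable_region(num_pages=4):
    """Map private anonymous memory and ask KSM to consider it for merging."""
    length = num_pages * mmap.PAGESIZE
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    # Fill every page with the same bytes, so the pages are identical.
    region.write(b"A" * length)
    region.seek(0)

    opted_in = False
    if hasattr(mmap, "MADV_MERGEABLE") and hasattr(region, "madvise"):
        try:
            region.madvise(mmap.MADV_MERGEABLE)
            opted_in = True
        except OSError:
            # e.g. a kernel without KSM support; carry on without it
            pass
    return region, opted_in


region, opted_in = make_mergeable_region()
```

The program keeps using the region as usual either way; if pages are merged, a later write to one of them just un-shares it again.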
 
 
 
 
 
This
* applies to RAM because processes always see only [[virtual memory]] addresses anyway
:: see e.g. linux [[fork()]] making a copy-on-write memory space
 
* applies to databases because they do their own management anyway
 
* applies to filesystems because they do their own management anyway
:: see e.g. LVM snapshots, ZFS snapshots
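
For the virtual memory case, a small way to see this from a program is a private file mapping. The sketch below uses Python's mmap with ACCESS_COPY (on Unix this is a MAP_PRIVATE mapping): writes go to the process's own copy of the touched pages and never reach the file. The function name private_mapping_demo is made up.

```python
import mmap
import os
import tempfile


def private_mapping_demo():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"on-disk contents")
        path = f.name
    try:
        with open(path, "r+b") as f:
            # ACCESS_COPY: writes affect memory only, not the underlying file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
                view[0:2] = b"IN"
                in_memory = bytes(view)
        with open(path, "rb") as f:
            on_disk = f.read()
    finally:
        os.unlink(path)
    return in_memory, on_disk


in_memory, on_disk = private_mapping_demo()
```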
 
 
In these examples, there is barely a way to subvert it even if you wanted to.
 
 
Software examples can get more interesting
* Qt uses copy-on-write (it calls this implicit sharing) for many of its value classes, e.g. QString, which affects the need for locking
 
* C++98's string class was designed to allow implementations to back it with copy on write, but this was messy enough that C++11 changed the requirements so that it is effectively no longer allowed
 
 
 
 
 
 
 
'''Other meanings'''
 
ZFS's writes have been described as copy-on-write, but this has a different meaning.
 
 
They mean 'when writing to a block, we always first allocate and write a new block with a copy of the data plus the alterations you asked for',
and ensure that the new block is fully written and becomes current ''before'' the old one is retired.
 
While ''technically'' this is the same "sharing backed data for a while, until you can't",
that time is actually intentionally very short,
and the entire thing is ''only'' really there to avoid [[write hole]] problems (because yeah, this is slower than direct alteration).
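
The ordering described here can be sketched with a toy in-memory store. This is an illustration of 'write new, switch over, then retire old' only, not of how ZFS is implemented, and AlwaysNewBlockStore and its methods are made-up names.

```python
class AlwaysNewBlockStore:
    def __init__(self):
        self.blocks = {}    # block id to bytes; a block is never altered in place
        self.current = {}   # logical name to the block id that is current
        self._next_id = 0

    def _allocate(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = bytes(data)
        return block_id

    def write(self, name, offset, payload):
        old_id = self.current.get(name)
        old = self.blocks[old_id] if old_id is not None else b""
        # 1. Build the new contents: a copy of the old data plus the change.
        new = bytearray(old.ljust(offset + len(payload), b"\x00"))
        new[offset:offset + len(payload)] = payload
        # 2. Write it to a freshly allocated block.
        new_id = self._allocate(new)
        # 3. Only now make the new block current...
        self.current[name] = new_id
        # 4. ...and only after that retire the old one.
        if old_id is not None:
            del self.blocks[old_id]

    def read(self, name):
        return self.blocks[self.current[name]]


store = AlwaysNewBlockStore()
store.write("f", 0, b"hello")
first_id = store.current["f"]
store.write("f", 0, b"J")
second_id = store.current["f"]
```

If things stop anywhere before step 3, the old block is still current and intact, which is the point of doing it in this order.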
 
 
 
The concept appears in programming sometimes.
 
 
For example, copy-on-write ''strings'' exist, the idea being you can share backing and save some space.
It turns out they are often barely worth it,
and they make correctness and concurrency a lot harder (consider e.g. affecting ongoing iterators).
 
e.g. C++ basically outlawed them for its standard string class as of C++11.
 
 
-->

Latest revision as of 13:26, 14 July 2023