Data consistency and versioning, and its concepts

Database related

More theoretical - thinking about databases:

Broad qualifiers, a.k.a. thinking about what you want from your database ·
Describing database systems
- Data consistency and versioning, and its concepts
- Database model types · Some databases, sorted by those types

Everyday-use notes

ACID

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

ACID (Atomic, Consistent, Isolated, Durable) is a set of properties you want to guarantee that the data it shows you is correct and consistent and up to date at all times. 'ACID compliant' an be intuited as 'a guarantee that my data is safe and correct'.

For a large part, that is about how a sequence of interactions is handled, in particular

dealing with sequences of statements that must be applied all together, or not at all: transactions

parallel execution

because most real practice has more than one user

because it make most parts a lot harder (arguably, there should be a P in the acronym, because parallelism is usually the part that introduces the most complexity, edge cases, and tradeoffs, and is sort of important to understood well)

The parts:

Durability

roughly: when the backend acknowledges a request, this change will survive, and you'll get it back as-is, even around various typical malfunctions

in part a 'guaranteed to be actually written to disk', in part about recovery, in part about ensuring other ACID parts during recovery

and implies implementation details like "write a new copy, then retire the old one" style journaling (see also e.g. WAL) is a popular solution - instead of e.g. "update/overwrite the existing data as you go" because while that can often be a little faster, failure halfway through might make that unrecoverable)

Consistency

roughly: data respects all imposed integrity rules (constraints, cascades, triggers, etc.) at all times

e.g.

attempting a operation that isn't allowed will be rejected, and will never lead to showing invalid state to clients at any time

if the transaction turns out to violate a constraint by one of its later actions, it must be rolled back

this is basically impossible to do without atomicity and/or isolation

Atomicity

roughly: a set of changes either happens fully, or not at all

since many operations in a transaction aren't even individually atomic, this certainly requires extra bookkeeping (e.g. of before and after states)

the easiest method is locking, but this also puts a very hard limit on concurrency. Concurrent methods are more involved.

Isolation

roughly:

transactions should not interfere with each other when concurrent

ongoing interactions are invisible to each other; even when actually handled in parallel, they should look to any client as if they were done sequentially; what one transaction alters is hidden from all others

largely means they cannot react to new state from another transaction

required for consistency

this conflicts with concurrency (more fundamentally than some think - varying somewhat with how complex the data structures are)

(at the very least, things become more complex when the data they deal with overlaps in any way)

The original explanation of ACID remains a fairly open-ended description, not a very string set of constraints, and certainly not rules of how to implement that.

If some of these descriptions seem to overlap, that's because they do.

Which leads to arguments like "is what isolation says already covered by atomicity?", so it's probably there because we planned to relax it from day two.

In fact, it seems to have been added later indeed;

ACD from J Gray (1981) "The Transaction Concept: Virtues and Limitations"'

ACID comes from T Haerder and A Reuter (1983), "Principles of transaction-oriented database recovery"

One way you can see the A/I difference is that

atomicity is about all of a transaction's writes becoming visible atomically as table state

isolation about different reads in another transaction not seeing different versions at different times within it

one focuses more on correctness, the other more on ordering

consider you could write a system that switches in one transactions's new state perfectly atomically, and have another transaction read from the earlier state at one time, and the new state later (Nonrepeatable reads, see below)

Such details is also one reason SQL defines isolation levels, which actually relax from strict ACIDity.

So yes, real-world relational databases are typically not configured to be fully ACID (and a few can't even be).

This is also roughly why a lot of NoSQL often does not care about strict transactional correctness.

http://itu.dk/~mogel/SIDD2011/lectures/SIDD.2011.13.pdf

http://jimgray.azurewebsites.net/papers/thetransactionconcept.pdf

https://www.cs.umb.edu/~poneil/iso.pdf

https://www.xaprb.com/blog/2014/12/08/eventual-consistency-simpler-than-mvcc/#fn:2

ACIDity of RDBMSes versus NoSQL

ACID is historically largely related to RDBMSes.

While ACID is just a set of principles that any system could follow, maby RDBMs are designed with the idea that without considering ACID in your design, you'll make some sort of mess.

NoSQL is somewhat different.

Most NoSQL simplifies the data model it can store (and usually also removes transactions) so that it needs fewer guarantees for it to work.

This makes ACID almost irrelevant. If you meant ACID to mean "operations probably not incorrect", you mostly have that:

Atomicity may be simpler - particularly when you are restricted to only 'get' and 'put' single items.

and it is generally not guaranteed on bulk operations as a whole. They'll say something like 'best effort'.

Isolation is often irrelevant (no transactions)

Consistency has almost no rules to adhere to (beyond rejecting nonsense requests), so is fine

(Durability is a separate conversation)

Sure, from the context of relational, NoSQL breaks relational, but that was almost the point.

It can be useful to dig a little deeper.

Consider

ACIDity of single-row (or key) alterations, versus

ACIDity of multi-row (or key) alterations.

A RDBMSes engine needs to give multi-row ACIDity to be taken remotely seriously.

Most NoSQL isn't multi-row ACID because there are no multi-row alterations in the first place. You often don't get to group alterations in transactions that get applied atomically.

Are they single-row ACID. Usually easily.

The issue comes in when they call themselves ACID because they're single-row ACID, but they also happen to allow multi-row updates.

If you do that from your code, the result is not ACID overall. You don't get that guarantee.

And we would really like multi-row updates. Particularly around denormalization, it would be great if we can say "Update this related set of denormalized calculated fields, either all or none, because that means the app won't show something inconsistent". By and large, we cannot. We are running either on 'hope for the best', or something inbetween because some BASE implementations are pretty neat.

But technically we weren't lying.

Also, Good luck knowing for sure.

The D is sort of a separate case.

Durability doesn't mention time.

In RDBMSes, it often means that 'if I acknowedge, it means it's guaranteed to be actually written to disk'. Be it tablespace or recovery log, you'll see this data back unless something goes unusually wrong.

NoSQL is split between

Write-Ahead Logging like that, and

"sure, we save our in-memory state to disk every few minutes"

(and a few "neh, we're in-memory only", which is fair)

And in a lot of use, it's a fine tradeoff. Just know that you're making it, and that this might be a bit of a stretch of D.

Aside from NewSQL, there are few NoSQL cases that try to give you multi-row ACIDity (or detail the cases).

VoltDB [1]

FoundationDB [2]

You can ask "Is S3 ACID?" and get completely different answers, including

basically yes, the individual operations are very predictable

It only does get, put, list, and delete.

get and put of completely independent items barely gets into any consistency.

list may not reflect the objects present a second later, but you would only expect that if there were transactions, which there are not.

No. It's not a database, it's an object store. Your question doesn't make sense.

No. It does not give read-after-write consistency[3]

(this changed in 2021, basically after 15 years, and still seems slightly qualified)

Maybe; you could add something like transactions on top,

but it'll probably work out as batches and fallback cleanup. Close enough, right?

You still don't get multiple such interactions working in a serialized way. No I or A, really.

But hey, you generally wouldn't expect to.

Also, if you use S3 as a backend to a metadata store elsewhere - which is common and convenient - that's even less likely to stay consistent with each other over time.

Isolation levels

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

ACID is great theory, but it turns out that most parallelism will easily be faster only under some conditions, and in various real-world ones things can still come down to "do things largely in series, or it becomes way too complex to handle".

So another question pops up: Do you want a tradeoff between which guarantees we make, for extra speed?

Isolation levels are a controlled way to loosen strictly ACID behaviour in RDBMSes.

Roughly speaking, isolation levels control what sorts of non-ACID behaviour is allowed whenever you don't do explicit, fairly complete locking.(verify)

Three things you want to avoid (at least, the most mentioned and relevant):

Dirty reads - refers to fetches that ignore any transaction or locking

Means we don't have to wait at all - great for speed!

May e.g. mean one connection may read data that is not yet committed by another

when multiple clients are writing relational data that has to make coherent sense, you can assume this will go wrong at some point

Nonrepeatable reads: getting different values for the same row when trying to read it multiple times in the same transaction

...e.g. because of another transaction's committed alterations. Reads can be non-dirty, but still non-repeatable.

We usually care about nonrepeatable reads refer to those within the same transaction. There is also the concept of "nonrepeatable within the same query". The fact that we use the same word for both is confusing, yes.

Phantom behavior / phantom reads refers to transactions 'magically' seeing new rows

i.e, seeing that the result set changes (rather than that specifically identificable row values change)

e.g. caused by WHERE clauses that select from another table at different times, which can find new matching rows after they are committed by something else.

Approximately speaking,

dirty reads are reading non-committed data from another transaction

nonrepeatable reads are often because another transaction COMMITted an UPDATE

phantoms are often because another transaction COMMITted an INSERT or DELETE)

'Approximately', because with more scrutiny there are a few more specific cases .

You don't have to think about those terms directly, because SQL-92 defines isolation levels:

READ UNCOMMITTED
- Basically 'anything you find will do', does not use/observe read locks
- allows dirty read
- allows nonrepeatable reads
- allows phantom behaviour

READ COMMITTED
- Won't read uncommitted data, doesn't ensure that read rows won't change within the length of this transaction
- avoids dirty reads (only)
- allows nonrepeatable reads
- allows phantom behaviour

"Won't read uncomitted data" and "allows nonrepeatable reads" seem to conflict. The point lies in the difference between a query and a transaction. Nonrepeatable reads in a query means it will not see different values if it needs to read a row twice. But two identical queries in the same transaction can still see data if another transaction has completely committed data to the relevant table. If you want all state to be repeatable within your transaction, you want either locks, or the following:

REPEATABLE READ
- Won't read uncommitted data, and does ensure read rows will not change within the transaction. Put another way, you're essentially reading from the same snapshot each time.
- avoids dirty reads
- avoids nonrepeatable reads (the main point, since it's the name of the thing)
- still allows phantom behaviour (...wider locking may avoid it, without being quite the same as SERIALIZABLE. Depends.)
- usually implemented by placing locks on all data relevant to the query

SERIALIZABLE
- strictest -- no transaction interaction possible at all
- Tends to be noticeably slower, because it is often implemented with zealous locking, or sometimes even by never handling two transactions on the same object(s) at once(verify).
- avoids dirty reads
- avoids nonrepeatable reads
- avoids phantom behaviour

Notes:

Some database engines may offer their own additional levels

e.g. SNAPSHOT[4]

oracle and postgres say serializable when they mean snapshot(verify)

most of these isolations aren't ACID - that's sort of the point.

Most database engines can be asked to behave to different levels, e.g.

MySQL

MyISAM doesn't support transactions at all, so is probably best described as READ UNCOMITTED always

InnoDB supports all four(verify), and defaults to REPEATABLE READ

Postgresql:

its MVCC nature means it internally has the equivalents of READ COMMITTED and SERIALIZABLE.

Default is READ COMITTED(verify)

The other two don't exist, and asking for them actually gives you the next-strictest behaviour:

READ UNCOMMITTED gets you READ COMMITTED

REPEATABLE READ gets you SERIALIZABLE

Note that given its MVCC design, reads tend to spend less time waiting for locks, so looser settings often aren't as useful

DB2 has its own terms and goes into a few more distinctions and may be interesting reading.

Considerations to allowing dirty reads

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Using SET TRANSACTION ISOLATION LEVEL to READ UNCOMMITTED (or e.g. MSSQL's per-query WITH(NOLOCK)), can be used to allow dirty reads to happen, which means the value is read out without locking considerations.

This can be useful when any value currently in there will do. This may be true in the case of social-site profile information, continually updated statistics tables, and such. Any value from now to a few seconds ago will be perfectly good - using locks and listening to transactional accuracy will only really mean more waiting.

Of course, dirty reads may at any time return uncomitted, old, or duplicate data (because you are asking the database to violate ACIDity for the read), meaning you should only do this when you know you'll do no harm, such as when you only read the value to show it and no write/update will even be based on it. (Much the same goes for non-repeatable reads, really)

Note that MVCC-style engines, such as postgresql and MySQL's InnoDB, avoid a lot of classical style locking by isolating data versions, which means there is little speed increase in allowing requests to give you dirty versions.

Serializability, Linearizability

Journaling

CAP

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

There are various formalizing takes on what happens when systems become scaled up and distributed, in terms of guarantees, and also things like bandwidth and job rate and the things that hold them back, and how such details interrelate.

One of the better known ones (which, spoilers, deals with fewer guarantees than you really want, or probably think), is Brewer's CAP theorem[5], which hints at that of the three properties it makes, no matter your answer, you are always choosing some tradeoff.

CAP refers to:

Consistency usually means "do all nodes see the same data at the same time",

Consistency in CAP means is largely about ordering and timing

(CAP consistency mostly describes linearizability. This has little to do with ACID's consistency, which is more about "does it follow all the rules and restraints imposed (typing, relations, bad values, duplicates, etc.)")

partly "once new state is successfully accepted, we never send older state back"

partly agreeing which version is current

partly "we need to put things in the right order to interpret them properly"(verify)

Availability - you will get a response with reasonable time

a response that is data and not an error

but there is no guarantee that it is the most recent data

(note: an answer "regardless of things like node failure" implies that node failure does not halt the system)

Partition tolerance - system continues even with imperfect communication

no, 'partition tolerance' is not the best name for "what if messages go wrong?"

partitioning refers to "parts of a system that don't agree on state"

dealing with issues like

"some message got lost"

"some message arrived out of order"(verify)

"one host fell over due to hardware and then came back" (CAP doesn't mention this, or other real-world faults but you squint you can see this as it having missed a bunch of messages)

and often means the partitioned system needs to be resolved to become one whole again somehow.

...automatically if at all possible

When something partitions, we would like to qualify its behaviour, and CAP names are arguably quite useful to describe their tendencies in terms of the above properties.

But this can easily be misleading -- it makes it really tempting to equivocate the tendency with their more technical, stricter meaning (incorrectly, because any real-world system is going to be due to actual workings and fail in ways you now do not expect).

Some potentially misleading things

CAP is actually a much narrower thing than people think - technically it models a single piece of memory.
- it says nothing about latency, which everyone NoSQL absolutely does care about.
- it doesn't even talk about things that touch multiple objects

let alone a system as a whole

These three are entangled to start with, more so when you add "...within reasonable time" -- which CAP ignores, and you cannot.

The larger the scale and/or the lower the required latency, the more you lean towards accepting you trade away one of these things, in imperfect conditions or at all.

The way we usually explain these to each other, C and A are about properties that might stick around under perfect conditions, P is specifically about reasonable expectations under imperfect condition - partitioning.

which leads to this "trying to strongly guarantee two of these comes at the cost of the ability to guarantee the third so well" view

Some introductions suggest it's a "pick any combination of two". It's really not.

It doesn't become exactly two either. In terms of misbehaviour that you may want, all are a matter of degree.

The partitioning may be an exceptional thing, or may be a regular thing

a system may try and manage to be CAP almost always. If it never partitions, there is no good reason to give up CA in regular operation.

CAP only really considers the fault we call partitioning - and ignores other kinds of faults or how to deal with them (kind of a huge thing to leave up to implementer)

The way Availability is technically defined in CAP is actually not met by a lot of real world systems.

The way Consistency is defined is not the same as ACID's Consistency.

The tendencies mentioned above include:

focus on C+A is tends to be for consistency management, lock management, fastish transactioning

usually doesn't like partitioning - many barely handle re-syncing after a partition.(verify)

focus on C+P will under regular circumstances also provides Availability, but when partitioned does not provide Availability

but often basically says "pause until a partition stops happening, e.g. going read-only or unavailable"

so that it stays deterministic with a simpler model, and more easily rejoinable

there has been research to make that more flexible

can be useful for large things that act as a primary data store(verify)

focus on A+P is probably for speed at scale

under regular circumstances can provides C (or close enough)

during a partition, this will often let parts operate, with less care that nodes's data might go out of sync

to clients fetching things from different parts, it will be inconsistent

whether inconsistencies can be resolved, or whether this is just a hope, depends on a lot of details

BASE

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

BASE is a wink at ACID.

BAsically available - roughly meaning there will be a fairly quick response to all requests -- because that response is allowed to be

known-current state

recent-but-old state (so, when data interrelates, logically inconsistent state).

'failure'

'try again later'

Soft state - you should assume the system might be catching up, even while there is no input.

Eventual consistency - The system will tend towards well-propagated consistency once it stops receiving input.

Eventual consistency

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

For context, RDBMSes do what you could now call immediate consistency in the sense of "if I acknowledge an alteration, it 's there in my store". Though this does not necessarily imply linearizability in the sense of "another operation will pick it up as soon as technically possible", it's usually very soon after, and well defined how soon after and why.

Eventual consistency usually means "I acknowledge I've received that -- I'll get to actually processing it soon."

Which may be very soon after indeed - it may propagate consistency whenever it's not busy receiving input.

In some cases, it is not very important how much later a thing is processed. In these cases, this makes it easier to speed things up and/or scale things up.

Yet even if we don't care about when, we may still care about what happens when we hit an issue.

If we hit a conflict (in the data sense), and we do not have a reasonable way to resolve a particular conflict, the only sensible decision may be to forget the data.

And if the acknowledgement was "I store it in RAM" and not journaled to disk, might be lost in a crash.

The benefit is that we can say yes earlier, not holding up this or other clients. The cost is fewer guarantees.

This might be a very reasonable cost/benefit thing e.g. in things used primarily as a caches , in that you usually don't have any issue, and it's perfectly sensible if updates appear slightly later.

It is less ideal in primary stores, in that while it means data is usually available in the short run, it may only be consistent (enough to not give weird results) in some longer run.

Data consistency and versioning, and its concepts

Contents

ACID

ACIDity of RDBMSes versus NoSQL

Isolation levels

Considerations to allowing dirty reads

Serializability, Linearizability

Journaling

CAP

BASE

Eventual consistency

Navigation menu