Non-relational database notes


For other database related things, see:

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)



NoSQL

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Most broadly, NoSQL usually means "a data store that chooses a different specialization than the handful of dominant RDBMSes".


Some common properties of NoSQL:

  • specializing in something that just doesn't fit a relational model very efficiently without a lot of extra work, like graphs, timeseries, arguably things like fulltext search


  • storing data which may be relational, but usually not queried as such, in particular...
  • distancing from schema'd data models
e.g. for the reason that in an RDBMS you would have to bake in such structure, and in schemaless you can be more flexible
note that schemaless often just means implied app-managed schema, and more than one
...gets you started faster
...only keeps working if your data requirements are simple, and if referential integrity is not that important, and usually you want to migrate data along with the schema
effectively puts any and all validation, and change validation, on the app, rather than the database.
Which is what makes it easier to change your effective schema (in large RDBMSes schema changes would often involve hour-long locks)
but also potentially harder to keep it valid over a longer term - your code needs to either migrate data, or know about many versions
So to some degree it's a "know the rules to know when and how to break them" thing


  • Code(rs) having more flexibility - and more responsibility.
You have to think hard about data modelling, you have to think hard about schema changes, and making sure all client behaviour cooperates.
This is true for any design, yes. Now it's just spread over more time and more parts of your app


  • no transactions (usually)
in part because transactions (and referential integrity, next) are damn hard to handle when you care about scaling so much (but not impossible!)
  • no referential integrity (for similar reasons)
  • doing more things without join (often by design)
assumes you do not typically want to resolve all references
often faster when you don't
often slower when you actually did (so if your data inherently fits the relational model better than all others, then an RDBMS is still the best choice)
  • often stores a denormalized form
This is often faster to fetch (nice)
also easily means conflicting data, and duplication


  • (often hidden) CAP-style decisions
a big argument in itself -- e.g. the point that various software steps away from guarantees much too easily
Often eventual consistency instead of immediate/strict transactional consistency, particularly when they do replication/sharding.


Some of this argues that a lot of NoSQL is often better as a distributed cache, but sometimes worse as your primary store.


See also:


On database types

Relational

The old standby.

Highly structured, schema'd, with things like optional referential integrity.

These are features that matter when you want things highly controlled and highly verified, but which also fundamentally hold back the ability to scale.


Relational databases are still best at consistency management, better than most NoSQL. NoSQL typically scales better, though many still have hairy bits (even flaws) in their consistency management.


Key-value stores

You ask for a value for a given key. Typically no structure to the data other than your interpretation after fetching it

...so they lie very close to object stores and blob stores (the last basically file stores without filesystem semantics).


When this fits your data and use, these are often low-bother ways to store a whole bunch of information, often with some corruption recovery.


If you have something in a RDBMS where you are actually mostly retrieving by primary key, and doing few or no joins (which may include things like simple data logging), you could use one of these instead.


Disk-backed or not

Various file-based stores (see e.g. File database notes) are effectively disk-based key-value stores.

Since they are often indexed, and cached in some way, you may easily think of them as hashmaps that happen to be large and stored on disk.

Used more than you may think; Berkeley DB is used in many embedded setups.

There has been some revival in this area. For example, Tokyo Cabinet is basically a modern, faster reimplementation of the dbm. (with some extensions, e.g. Tokyo Tyrant to have it be networked, Tokyo Dystopia to add full-text search).
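
As a small illustration of how thin this interface is: Python's standard-library dbm module wraps exactly this family of libraries, and gives you a disk-backed key-value store in a few lines. A minimal sketch (file and key names made up):

import dbm

# Open a disk-backed key-value store, creating the file if needed ("c").
with dbm.open("example", "c") as db:
    db["user:42"] = "mike"     # keys and values are stored as bytes
    print(db["user:42"])       # -> b'mike'
    print(b"user:99" in db)    # -> False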


When not disk-backed, they are effectively in-memory caches (e.g. memcached), and sometimes also useful as message broker (e.g. redis).

Document store

Key-value storage where the value is structured data, often not following a strict schema.

It is also frequently possible to index on fields within these documents.


Often presented as JSON (or XML, though XML databases can be considered a specific type of document store).

In contrast with e.g. relational data, documents are often altered individually, delivered as-is, and not heavily linked.


https://en.wikipedia.org/wiki/Document-oriented_database

Column store

Wide column store

Timeseries

Graph

Search engine

While a search engine is generally thought of as an index built on a primary data store kept elsewhere, that searchable index is also the thing you actually use.

Plus there are projects that do both.

Storagey stuff - kv, document, and bigtable style

riak notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Key-value store with a focus on concurrency and fault tolerance.

Pluggable backends, e.g. allowing use as just a memcache, or give it persistence.

Eventually consistent (with some strong-consistency experiments?(verify))

masterless

Ideally you fix the cluster size ahead of time. When you add nodes, contents are redistributed (verify)


Backends include

  • bitcask
all keys in hashtable in RAM (fast, but limiting the amount of items via available RAM)
file copy = hot backup (verify)
  • leveldb
keys stored on-disk
secondary indexes, so limited relational-style querying at decent performance
data compression
no hot backup
  • innostore
  • memory
objects in ram

It aims to distribute data perfectly evenly (and supports some other features by assuming that), which implies you have to fix your cluster size ahead of time


Pluggable backends mean you can have it persist (default) or effectively be a distributed memcache


etcd

Distributed key-value store, related to CoreOS and Kubernetes (which uses it to store cluster state).

Cassandra

MongoDB notes

tl;dr

  • Weakly typed, document-oriented store
retrieved documents are maps
values can be lists
values can be embedded documents (maps)
  • searchable
on its fields, dynamically
e.g.
db.collname.find({author:"mike"}).sort({date:-1}).limit(10)
- which together specifies a single query operation (e.g. it always sorts before limiting [1]); a pymongo version is sketched after this list
supportable with indices [2]
field indexes - basic index
compound indexes - indexes a combination, e.g. first looking for a userid, then something per-userid
multikey indexes - allows matching by one of the values for a field
2d geospatial - 'within radius', basically
text search
indexes can be:
hash index - equality only, rather than the default sorted index (note: doesn't work on multi-key)
partial index - only index documents matching a filter
sparse index - only index documents that have the field


  • sharding, replication, and combination
replication is like master/slave w/failover, plus when the primary leaves a new primary gets elected. If it comes back it becomes a secondary to the new primary.
  • attach binary blobs
exact handling depends on your driver[3]
note: for storage of files that may be over 16MB, consider GridFS


  • Protocol/format is binary (BSON[4]) (as is the actual storage(verify))
sort of like JSON, but binary, and has some extra things (like a date type)
  • Not the fastest NoSQL variant in a bare-metal sense, but often a good functionality/scalability tradeoff
e.g. for various nontrivial queries
  • no transactions, but there are e.g. atomic update modifiers ("update this bunch of things at once")
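
Since querying and indexing are the selling point, a short pymongo sketch of the example above (collection and field names made up; assumes a mongod on localhost):

from pymongo import MongoClient, DESCENDING

coll = MongoClient("mongodb://localhost:27017").test.posts

# Documents are maps; values can be lists or embedded documents.
coll.insert_one({"author": "mike", "date": 20200522, "tags": ["db", "notes"]})

# A compound index supporting the query below (author first, then date).
coll.create_index([("author", DESCENDING), ("date", DESCENDING)])

# find + sort + limit - together one query operation, as in the shell example.
for doc in coll.find({"author": "mike"}).sort("date", DESCENDING).limit(10):
    print(doc)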




CouchDB notes

(not to be confused with couchbase)


Document store with a REST-like interface.

Meant to be compatible with memcachedb, but with persistence.(verify)


  • structured documents (schemaless)
can attach binary blobs to documents
  • RESTful HTTP/JSON API (to write, query) - see the sketch after this list
so you could do with little or no middle-end (you'll need some client-side rendering)
  • shards its data
  • eventually consistent
  • ACIDity per document operation (not larger - so not great for inherently relational data)
no foreign keys, no transactions
  • MapReduce
  • Views
best fit for mapreduce tasks
  • Replication
because it's distributed, it's an eventually consistent thing - you have no guarantee of delivery, update order, or timeliness
which is nice for merging updates made remotely/offline (e.g. useful for mobile things)
and don't use it as a message queue, or other things where you want these guarantees
  • revisions
for acidity and conflict resolution, not in a store-forever way.
An update will conflict if someone did an update based on the same version -- as it should.
  • Couchapps
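
Because the interface is plain HTTP/JSON, a requests sketch is enough to show create-database, create-document, and the _rev dance on update (names made up; assumes a local CouchDB that accepts unauthenticated writes):

import requests

base = "http://localhost:5984"

requests.put(base + "/notesdb")                      # create a database
requests.put(base + "/notesdb/doc1", json={"a": 1})  # create a document

doc = requests.get(base + "/notesdb/doc1").json()    # fetch; includes _rev
doc["a"] = 2
# An update must carry the current _rev, or CouchDB reports a conflict -
# exactly the "update based on the same version" case mentioned below.
print(requests.put(base + "/notesdb/doc1", json=doc).json())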


document ~= row


Notes:

  • view group = process
nice way to scale
  • sharding is a bit harder


Attachments

  • not in views
  • if large, consider CDNs, a simpler nosql key-val store, etc.

See also:


PouchDB

Javascript analogue to CouchDB.

Made in part to allow storage in the browser while offline, and push it to CouchDB later, with minimal translation.

Couchbase notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

(previously known as Membase) (not to be confused with CouchDB)


CouchDB-like document store, plus a memcached-compatible interface


Differences to CouchDB include:

  • typing
  • optional immediate consistency for individual operations
  • allows LDAP auth
  • declarative query language
  • stronger consistency design


MonetDB

Column store

https://www.monetdb.org/Documentation/Manuals/MonetDB/Architecture


hyperdex

RethinkDB

Document store with a push mechanism, to allow easier/better real-timeness than continuous polling/querying.


Hbase

An implementation imitating Google's Bigtable, part of the Hadoop family (and built on top of HDFS).


See also:


Hypertable

See also:

Accumulo

https://accumulo.apache.org/

Storagey stuff - graph style

This one is mostly about the way you model your data, and the operations you can do - and do with fair efficiency.

..in that you can use e.g. key-value stores in graph-like ways, and when you don't use the fancier features, the two may functionally be hard to tell apart.


ArangoDB

https://www.arangodb.com/


Apache Giraph

http://giraph.apache.org/


Neo4j

OrientDB

Cachey stuff

redis notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


tl;dr:

  • typed key-value store
that you may also want to do some queries/operations on
  • data structures: counter, list, set, sorted set (hash-based), hash, bitarray
    • operations on those
  • supports sharding [5]
  • allows transactions, which lets you do your own atomic updates when necessary
  • pub/sub


Best used for cases where you primarily need to access and update structured data at scale (also mixes with just lookups, like memcached).

Not really made for finding items by anything other than their id (look at things like mongo instead?) ...but for simple cases, creating an index you always update yourself can be manageable (and is typically fast).
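
A short redis-py sketch of the typed-structures angle (key names made up; assumes a local redis):

import redis

r = redis.Redis(host="localhost", port=6379)

r.incr("pageviews")                         # counter
r.lpush("recent", "item1", "item2")         # list
r.zadd("scores", {"alice": 31, "bob": 17})  # sorted set

# MULTI/EXEC-style transaction via a pipeline: both commands apply atomically,
# which is how you'd do your own atomic updates when necessary.
with r.pipeline() as pipe:
    pipe.incr("pageviews")
    pipe.zincrby("scores", 1, "alice")
    pipe.execute()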


memcached notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

memcached is a networked, in-memory, LRU-style key-value cache.

Its common use is probably caching data that was complex and/or high-latency to generate, and/or not very volatile.


Clients can talk to sets of servers (a client/protocol feature), which means that many clients can distribute values, and share a distributed cache (without platform-specific IPC stuff).

There is no access control; firewalls should be enough.


It

  • was made to have nothing that could respond slowly - all in memory and no going to disk, no locks, no complex queries (it's mainly a hashmap), no wildcard queries, no list-all.
  • was made to never block
  • was made to keep the most-used data:
it throws away data based on expiration timeouts and
when full, based on LRU (Least Recently Used) logic.


It is not:

  • storage. It's not backed by disk.
  • redundant. You are probably looking for a distributed filesystem if you are expecting that. (you can look at memcachedb and MogileFS, and there are many others)
  • a document store. Keys are limited to 250 characters and values to 1MB. (again: look at distributed filesystems, distributed data stores)
  • a transparent database proxy. You have to do the work of figuring what to cache, how to handle dependencies and invalidations


Originally developed for livejournal (by Danga Interactive) and released under a BSD-style license.


Daemon options

The main command line options:

-d            daemon
-m 2048       take up to 2048MB of memory
-l 10.0.0.40  bind to this IP 
-p 11211      ...and this port

default when unspecified is 64MB, which may be too conservative, so distros tend to set something larger already.

(On 32-bit machines, you cannot give a single process more than, usually, 3GB or 2GB of memory (see 4GB of memory on a 32-bit machine). You can run multiple daemons, though.)


Some client, server, and interaction details

A single server is mostly just a simple and fast hashmap.


When given multiple servers, clients can choose to distribute objects among them (based on the key's hash; you may also get control of the hash so that you can ensure related objects are stored on the same server - note that you'd want to do that consistently on all clients).

The client effectively adds another layer of hashing, in that it chooses the server to store an object on based on its hash and the available servers.

For this reason, for optimum cache hits, all clients should use the same client list, and use the same hashing method. It helps to have each be the same client implementation (also because some may have transparent serialization, that may not be compatible between everything you have).
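
As a toy sketch of what that client-side choice amounts to - naive modulo hashing, where real clients often use consistent hashing instead (server list made up):

import hashlib

servers = ["10.0.0.40:11211", "10.0.0.41:11211", "10.0.0.42:11211"]

def server_for(key):
    # Deterministic: every client with the same server list and the same
    # hash function picks the same server for a given key.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

print(server_for("user:42"))
# Note that changing the server list remaps most keys under modulo hashing,
# which is one reason consistent hashing is usually preferred.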



Client interfaces

There are APIs/clients for most major languages (see [6], [7], [8]), and you can implement your own by reading the protocol.


Exactly what the interface takes and returns varies. It may be dictionaries, persisted objects, bytestrings, etc.


Basic usage notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Retrieval:

  • get: note that you can ask for multiple entries in a single request
  • gets fetches more information, including the value needed for the cas operation (see below)

Storage commands are:

  • set: creates entry if necessary
  • replace: like set, but only stores if the key already existed
  • add: like set, but only stores if the key was not already present
  • append: append to current cached value (only if it already existed)
  • prepend: prepend to current cached value (only if it already existed)
  • incr and decr increment and decrement a 64-bit integer. Entry must already exist (so e.g. set(key,'0') first). Interprets a non-integer value as 0.


  • cas: 'check-and-set': store, but only if no one else has updated it

You should write your interaction to minimize the number of round trips (i.e. the number of commands) (verify)

get_multi: fetch for many keys at once. Avoids latency overhead from doing multiple requests

flush_all: Clears the cache. Useful when developing, since you don't get to list items to clear.


The expiration time is interpreted either as

  • if <2592000 (which is 30 days in seconds): a delta time from the current server time
  • if larger than that: Unix (seconds-since-epoch) time
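
A short sketch of these commands using the pymemcache client (an assumption - most memcached clients expose roughly the same set; server address made up):

from pymemcache.client.base import Client

c = Client(("127.0.0.1", 11211))

c.set("greeting", "hello", expire=60)  # under 30 days, so relative seconds
c.add("greeting", "other")             # no-op: the key already exists
print(c.get("greeting"))               # -> b'hello'

c.set("hits", "0")                     # incr needs an existing integer value
c.incr("hits", 1)

# cas: store only if nobody changed the value since our gets()
value, cas_token = c.gets("greeting")
c.cas("greeting", value + b" world", cas_token)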


Undocumented/unofficial/debug features

Designing access models; Tricks

Rules of thumb:

  • Tackle the most obvious cases first. Usage probably follows 90-10 patterns. You can leave this to the LRU-ness of the cache, but in some cases you can avoid a bulk of nonsense that has to be managed from entering the cache.
  • Aside from the obvious networking and management costs, also consider serialization (marshalling) costs.


Things to consider:

  • Your setup may count on touching cache elements, but badly designed setups may mean a lot of touches per page view (or other unit of work) that bottleneck your access to memcached (there are a few different ways to reduce touches)
  • It can help to layer your cache a little more. For example, fragments of pages may be constant, and could be cached. Some of this can also be cached in whatever end process you have, to lighten the load on memcached for things that have fairly simple/obvious/static use cases.


  • You may want to use cacti/munin/some other logging/graphing on certain stats while you are developing, to see long-term patterns - and you may spot some obvious mistakes in, say, relative amounts of gets/sets this way.
  • You can't really control treatment of subsets of elements. That is, you can't say that certain elements should always be removed first. When you are using memcached for small-scale app caching, and not for application scaling, it may be useful to set up multiple daemons, to set up separate treatment per cache. (this does waste memory, but also note that you can easily set limits on the amount of memory to be used for each namespace this way)



(Faking) bulk invalidation

A situation where you don't know the exact set of keys you want to invalidate, but do have a pattern to remove, e.g. a prefix.


This doesn't exist directly, because this means a potentially slow wildcard query, and memcached was designed to only have queries that are always fast.


One way of working around this is to put a version in your key (this gets called namespacing, sounds fancier)

In other words, instead of removing, you're shifting to a new set of keys:

  • the old ones are left in there, and will be pushed out by the LRU logic soon enough
  • keep in mind that this doesn't alter the store, it's just a different view from the/each client. Other clients will follow only if you either
make the same decision in all clients (can be annoying), or
put a "currently applicable version number" in the memcache too (and e.g. have clients fetch it every second or so), so that you can tell other clients to move on to new keys.

Technical notes

See also

Searchy stuff

Message brokers / queues

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

See message broker notes


Time seriesy

Time series databases are often used to show near-realtime graphs of things that happen, while also being archives.

They are aimed at being efficient at range queries over time, and often have functionality that helps with that - things like retention, downsampling, and aggregation over time windows.


InfluxDB

Is now part of a larger stack:

Telegraf[9] - agent used to ease collecting metrics. Some pluggable input / aggregation / processing things
InfluxDB[10] - time series database
Chronograf[11] - dashboard.
also some interface to Kapacitor, e.g. for alerts
Often compared to Grafana. Initially simpler than that, but more similar now
Kapacitor[12] - streams/batch processing on the server side

Flux[13] refers to a query language used in some places.


InfluxDB is distributed, and uses distributed consensus to stay synced.

Open-source, though note some features (like distribution) are Enterprise-only.


InfluxDB notes

Seen from a classical relational view:

  • a database is a logical container for
users
retention policies
continuous queries
time series data

  • a retention policy (RP) defines
replication factor - copies kept in the cluster (autogen's default is 1)
retention - how long to keep the data (min 1 hour, autogen's default is infinite)
shard group duration - how much data is stored per shard group (min 1 hour, autogen's default is 7 days)
More practically: the RPs that are defined are part of the database they are created in, you get a default RP called autogen (defaults mentioned above), and each measurement has a retention policy (verify).
RPs are mentioned here because the model makes a measurement part of one RP, which you'll quickly notice in addressing (testdb.autogen.measurementname).

  • measurements are like tables, containing tags, fields and points
there is always a primary index on time within a measurement

  • series
basically refers to a (measurement, tag set) combination you'd likely use in querying - see below

  • tags are key-value pairs (column-like) you can add per datapoint that are
part of its uniqueness
indexed
limited to 64K (and you probably don't want to get close to that without good reason)

  • fields are key-value pairs (column-like) you can add per datapoint that are
not part of its uniqueness
not indexed
typed: float, integer, boolean, timestamp, or string
strings are possibly not limited to 64K (I've seen conflicting information), but you probably don't want to use InfluxDB as a blob store if you want it to stay efficient

  • (data) points are the rows: a timestamp plus field values for one tag set


Typical use of measurements, series, tags

Say you have CPU data (various tutorials use this example) and are collecting it for various datacenters. You might have

  • a measurement called cpu
  • tags like hostname=node4,datacenter=berlin,country=de
  • fields like cpu0=88,cpu2=32

You'll notice this is a pile of everything CPU-related, structured with common uses in mind - hostnames would be unique within a datacenter and can be addressed individually (and efficiently, because tags are indexed), but you can also e.g. average per country.

Series are basically the idea that each unique combination of (measurement, all tags) represents a series. That said, they are arguably more of a querying concept, and a storage one only insofar as the indexing helps a lot. (verify)


On point uniqueness

A point is unique per (measurement name, tag set, timestamp), so if you write when a record with that tuple already exists, the field sets are merged (overwriting existing field values).

The example in the InfluxDB FAQ [https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points] is CPU metrics: put host, region, and geography into tags, and measured values into fields, because that's how you'd typically query it.

Relevant to this is that while timestamps are nanosecond resolution by default(verify), this can be reduced to microsecond, millisecond or second, per database(verify). Aside from merges you may like to happen, lower time precision also helps compression.


On shards and shard groups

Under the hood, data is stored in shards, shards are grouped in shard groups, and shard groups are part of retention policies.

This is under-the-hood stuff, but it can be useful to consider in that shard groups relate to

  • the efficiency of the typical query, e.g. when most queries deal with just the last shard group and the rest is rarely touched, essentially archived
  • the granularity with which data is removed (it's dropped in units of shard groups)


The line protocol

The line protocol [https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_reference/#line-protocol-syntax] is a text representation of a point that looks like

measurement,tag_set field_set timestamp

where

  • tag_set and field_set are comma-separated key=val pairs
  • the timestamp is nanosecond-precision Unix time, and optional - it defaults to the server's local timestamp (UTC), but be aware of clock drift (so use NTP) and timezones (so maybe don't omit it at all)
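
For concreteness, a minimal sketch of writing one such line over the 1.x HTTP write API, using Python's requests - the database name, tags, and values are made up, and this assumes an unauthenticated InfluxDB on localhost:8086:

import time
import requests

# One line-protocol point: measurement,tag_set field_set timestamp.
# precision=s tells InfluxDB the timestamp is in seconds, not nanoseconds.
point = "cpu,hostname=node4,datacenter=berlin,country=de cpu0=88,cpu2=32 %d" % int(time.time())

resp = requests.post(
    "http://localhost:8086/write",
    params={"db": "testdb", "precision": "s"},
    data=point,
)
resp.raise_for_status()  # a successful write answers 204 No Content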

Security

  • Authentication [https://docs.influxdata.com/influxdb/v1.8/administration/authentication_and_authorization/#authentication]
By default there is no auth, so you may wish to restrict things to localhost.
If you want auth, you need to enable it and create users.
Auth happens at HTTP request scope, e.g. for the API and CLI; certain service endpoints are not authenticated.

  • HTTPS [https://docs.influxdata.com/influxdb/v1.8/administration/https_setup/]
You can use server certs and client certs - self-signed ones if you wish.
Note that in a microservice style setup, you may wish to do this on the edge instead.


Query languages

InfluxQL - an SQL-like language

Flux - a more featured language
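
A matching sketch of querying this back with InfluxQL over the same HTTP API (again assuming the testdb database and cpu measurement from the write example):

import requests

# InfluxQL is passed in the q parameter; results come back as JSON.
resp = requests.get(
    "http://localhost:8086/query",
    params={
        "db": "testdb",
        "q": 'SELECT MEAN("cpu0") FROM "cpu" WHERE time > now() - 1h GROUP BY "country"',
    },
)
resp.raise_for_status()

for series in resp.json()["results"][0].get("series", []):
    print(series["tags"], series["values"])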

Chronograf notes

TimescaleDB

OpenTSDB

Graphite

See Data_logging_and_graphing#Graphite_notes


Unsorted

Openstack projects related to storage:

  • SWIFT - Object Store. Distributed, eventually consistent
  • CINDER - Block Storage
  • MANILA - Shared Filesystems
  • KARBOR - Application Data Protection as a Service
  • FREEZER - Backup, Restore, and Disaster Recovery

See also