Container and docker notes

Notes related to (mostly whole-computer) virtualization, emulation and simulation.

Virtualization, emulation, simulation · Docker notes · Qemu notes


Containers intro

Containers are processes that run directly in the host linux kernel, that just happen to be isolated in all ways that matter.

They are not emulation, they are not VMs, not even a hypervisor style VM.

Containers have worked since roughly linux kernel 3.13, because they lean heavily on kernel features that were recent at the time, like namespaces. cgroups are relevant as well.


This is more lightweight than running a classical VM, mostly because a classical VM virtualizes the hardware (while, roughly, docker virtualizes only the OS)

meaning VMs have to boot an OS. Docker does not, it's immediately ready to start the processes it should run.
meaning VMs require you to exclusively reserve things like RAM and disk space beforehand. In docker you're being allocated RAM and disk from the host OS. You probably still shouldn't overcommit, though.


There are comparable things, e.g. SmartOS had been doing this before Linux, though in that case for reasons of operational efficiency and security, whereas docker seems more focused on deploying microservice style things (one or a few processes).


What does docker add to the concept of linux containers?

Containers are the things that run.


Containers are a more general concept, and docker is software that builds on top of containers - and in the end just one of them.

There are even entirely different takes on the supporting kernel features - like LXC, running what inside looks like more of an OS.

For some more background and comparison, see linux containers.


Docker is just one way of making containers.

For a long time, it was the main one, and its stack defined the way things were typically done.

Today, however, things are... more flexible and more confusing.



What are containers useful for?

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Depends on who you ask. Angles include:


Portability

The isolation docker presents means each container app will run the same, regardless of your linux variant, OS's libraries, hardware, and such.

Like a VM does, but more lightweight than a VM.

Like self-contained applications, like portable software or app images or other names like that, but in a perhaps cleaner way.

Basically, "it will run identically in any place that it runs at all". Which can be useful, for multiple reasons.



Stable development environment, optionally automated deployment

As an extension of the above, you can code in a way that reduces the "well, it builds/works on my host, why not for you/on the servers?" issues.

Yes, there are other ways to isolate a build environment, or describe one fully enough.

No, docker won't do that by itself - and in fact doing this properly is entirely on you and good habits - but it makes it a bunch simpler.


Also consider that when you do building, testing, and deployment, it can make sense to do all three the same way.

It's not too hard to set things up to automatically trigger testing, building an image, and starting it on a dev server. This is often based on a push to a code versioning system, because that's a convenient way of doing it, but also because docker is not a software build system, e.g. changes in code do not invalidate the build cache.

Easy and fast container startup is a nice detail for automated testing of a new version, both of the unit/regression sort and of some amount of larger deployment. But it is also valid on just a single host.



Large-scale deployment

In clusters and swarms it's very handy to have exactly reproducible software environments.

With classical clusters you end up with relatively fine management of each node, and care to keep them consistent.
With containers you basically don't have to think about that at all (beyond the communication, but that's always going to be there).
And because we've separated system environment from app environment, admins can now update the host OS without having to worry, every time, whether that might break some or all of the apps due to some random library dependency.


Separating apps to relieve dependency hell

A long time ago, executables were often monolithic things that controlled everything - but were too close to the hardware. Modern systems have applications live in an ecosystem of tens or hundreds of libraries, which keep working by merit of that environment staying sane.

The larger said ecosystem, the more friction and fragility there is between versions for different apps, and the harder it is to administer, e.g. more chance for "oh god I tried to update X and now everything is in a sorta-broken state that I don't understand and cannot easily revert either.".


Arguably, mixing the ecosystem for the basic system is a good idea in the long run, because that means people are forced to make a system that keeps that clean.

Doing so is a reason why linux is generally quite good at this, with how sofiles work and allow multiple installed versions and such.

But these days there are multiple layers on top of that which, despite being newer, have forgotten how to do that. So now, the more things you have installed, the more likely it is there is simply not a solution for your dependency problem, and the only solution is to isolate them not only from the OS, but also from each other.

And docker (or VMs) is one option to keep all the other idiots away from your software environment. Docker is arguably overkill, and can be more bothersome (with the details of breaking out X, audio, the GPU, and such) - you may wish to look at snap instead (although I'm still not sure what to think about it)


Sane admin on multitenant servers

If you have servers with multiple users with conflicting needs in terms of installed packages, you as an admin will have a better time if you just give them each their own docker container.

Because any local access is something you will have configured.

When is it less useful?


When there is no reason to use it.

because adding complexity without reason is never a good idea
and distributed software is, by nature, harder to design and a lot harder to debug than a monolithic setup.
also, it takes some relearning on how to do a lot of security things (e.g. least-privilege access) and admin things (e.g. restorable backups, monitoring). If there's no other benefit, this is just lost time.


If you think it's the hip scalability band-aid

Actual efficiency, and scalability, are properties of your design
Any container/VM/management makes the deployment step potentially easier, none guarantee it


If security is your main interest

sure you get isolation by default, and that's great, but security was never the primary goal, and there are a handful of footnotes you must know
VMs are arguably better there, at least until security becomes a central focus for containers and docker
whether security updates are easier or harder to do around docker depends entirely on whether you thought about it before you started


When one instance is good enough

Splitting things into more parts means more networking, more config, more thought about connectivity watchdogs, reconnection, or risk introducing failure modes that basically weren't there on a single host.


When you stop at docker

Some potential upsides, like automated setup and provisioning, aren't part of docker but part of what you wrap around it
and only really happen when you think about how to do them for your case, and it has to make sense for your use case.



Arguables and criticism

Another layer of package management

People use dockerfiles as a flexible way to install software. As some people put it, "docker is to apt what apt is to tar."


This is a double-edged thing.

  • This is sometimes a nicely brief way of doing it, in that you can describe a well controlled and immediately buildable environment in one shortish dockerfile.


  • the fact that docker's build system is very basic (simpler than the fifty-year-old Make system) lets people be particularly bad at guaranteeing reproducible builds
This is mostly due to people not thinking. Which isn't docker's fault at all, but it does mean its ecosystem is a hot mess, and has no reason to change.
but e.g. having most of your dependencies be external that you know change over time (apt repositories), and never using versions of packages, is exceedingly common
(The area I work in may be the worst for this and it may be better elsewhere, but I found most dockerfiles will not build, most commonly because a few months to a few years later there is no solution for the same package tree, or the solution is actually incompatible with the actual software, so even if they build fine they break at execution time. I've also found plenty of binary builds that fail to do their main thing, apparently because it's a lot easier to do automatic builds than to test them properly.)



The cynical view is that

not only have we collectively given up on proper dependency management and now build 2GB app images of everything to wrap one 100KB executable,
we are not great at making those builds stable over time
and relying on binary images makes it even harder to do security updates than just installing the thing directly.
and while these base images could overlap in theory, they generally don't in practice. This isn't the thing that makes you go broke in server fees, but still.

Because we never solved the problem we had, we just pushed it around.


And don't get me started on the unholy mess that is trying to get GPU stuff to work, with dockerfiles often breaking on software's version fragility (or broken URLs, because there was no package), or breaking because people managed to make them binary builds that fail on other GPUs - plus we have little control over the proprietary nvidia stuff bolted onto docker so if five years later it won't run, it may just never run again.


So in multiple ways we are recreating dependency hell while actively pretending to solve it. Sigh.

Single-purpose

Good and bad ideas

What's this about runtimes?



In this context, runtimes are things you can run containers in.


An OCI (Open Container Initiative) compliant container runtime is one that can run an OCI compliant image.

OCI compliant runtimes include

runC / containerd
  • runC is the low-level part that sets up and starts the container, and is mostly where seccomp, selinux, or apparmor are applied (for syscall filtering and such)
  • containerd
    • wraps runC with a network API (containerd does a fork-exec of runC)
    • manages container lifecycle (image transfer/pull/push, supervision, networking, etc)
    • and is what makes this setup OCI compliant (verify)
docker daemon
  • dockerd
cri-o (as in Kubernetes, so google)
  • runs OCI compliant images (the -O stands for OCI)
  • CRI ('container runtime interface') in general was a move within kubernetes to allow mixing runtimes - it previously supported more than one but only one of them in any installation. CRI(-O?(verify)) means you can mix.
  • defaults to runc runtime (verify)
  • People made a kerfuffle when "Kubernetes dropped docker support" in 2020 but it just meant "we no longer use dockerd as a runtime, because running the same images via CRI-O instead makes things simpler" (it also means they can drop dockershim, the wrapper they had to keep around the ever-changing docker details)


rkt

CoreOS
on top of runC
not OCI?


gVisor - runC replacement

google
designed for more security - basically kernel API implemented in userspace?
https://www.usenix.org/system/files/hotcloud19-paper-young.pdf


nabla containers

IBM
doesn't run linux containers?
similar security approach to gVisor


kata containers

intel
contain kernel, so more of a VM, and slower startup
upsides are easier use in certain virtualised environments? (verify)
OCI compliant



podman, buildx, kaniko


https://medium.com/@alenkacz/whats-the-difference-between-runc-containerd-docker-3fc8f79d4d6e

https://www.ianlewis.org/en/container-runtimes-part-1-introduction-container-r

https://gist.github.com/miguelmota/8082507590d55c400c5dc520a43e14a1

Docker, more specifically

Some concepts

An image file is a complete environment that can be instantiated (run).

It amounts to a snapshot of a filesystem.
images are often layered on top of other images
which makes it easier to build (you can cache lower layers), and makes it easier to version-control each layer
and since layers are references (to things you also have) rather than copies, it makes heavily layered images smaller
For example's sake you can rely on the default fetching of image files from Docker Hub.
(you can create image files yourself, and will eventually want to)
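To make the point about shared layers concrete, here is a minimal Python sketch. The layer names and sizes are made up for illustration (real layers are identified by sha256 digests), and the bookkeeping is a simplification of what the storage driver actually does.

```python
# Hypothetical layers and their sizes in MB.
layer_sizes = {"ubuntu-base": 70, "apache": 130, "my-app": 5, "other-app": 8}

# Each image is an ordered list of references to layers, lowest layer first.
images = {
    "my_ubuntu_apache": ["ubuntu-base", "apache"],
    "my_app": ["ubuntu-base", "apache", "my-app"],
    "other_app": ["ubuntu-base", "other-app"],
}

def apparent_size(image):
    """Size an image seems to have when you look at it on its own."""
    return sum(layer_sizes[layer] for layer in images[image])

def disk_usage(image_names):
    """Size actually needed on disk: every distinct layer is stored once."""
    distinct_layers = {layer for name in image_names for layer in images[name]}
    return sum(layer_sizes[layer] for layer in distinct_layers)

total_apparent = sum(apparent_size(name) for name in images)
total_on_disk = disk_usage(images)
print(total_apparent, total_on_disk)
```

The three images look like 483 MB when added up individually, but need 213 MB on disk in this model, because the base and apache layers are referenced rather than copied.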


A container is an instance of an image file - either running, or stopped and not cleaned up yet.

Note that the changes to files from the image are not written to that image. They are copy-on-write to container-specific state.
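As a rough mental model of that copy-on-write behaviour, the following Python sketch treats each layer as a dictionary from path to content. It is only an analogy under that assumption (the file names and contents are made up), not how the storage driver is implemented.

```python
from collections import ChainMap

# Read-only image layers, modelled as path -> content mappings (lowest first).
base_layer = {"/etc/hostname": "base", "/bin/sh": "shell-binary"}
app_layer = {"/opt/app/run.sh": "echo hello"}

def start_container(*image_layers):
    """Return a filesystem view: one fresh writable layer on top of the image layers."""
    writable = {}
    # ChainMap reads from the first mapping that has the key,
    # and sends all writes to the first mapping only.
    return ChainMap(writable, *reversed(image_layers))

container_a = start_container(base_layer, app_layer)
container_b = start_container(base_layer, app_layer)

# A change inside container a lands in a's own writable layer...
container_a["/etc/hostname"] = "changed-in-a"

print(container_a["/etc/hostname"])   # a sees its own copy
print(container_b["/etc/hostname"])   # b still sees the image's version
print(base_layer["/etc/hostname"])    # ...and the image layer itself is untouched
```

Removing container a in this model just means throwing away its writable dictionary, which is why state you care about should live elsewhere.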



Images have IDs, containers have IDs.

They show up in most tools as large hexadecimal strings (actually a smallish part of a larger sha256 hash, see --no-trunc for the full thing).
Note that you only need to type as many characters to make it unique within what you have (which may frequently be one or two characters)
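The prefix rule can be sketched in a few lines of Python. This is an illustration of the idea only; resolve_prefix is a hypothetical helper, not docker's code, and two of the three ids are made up.

```python
def resolve_prefix(prefix, known_ids):
    """Return the single id in known_ids that starts with prefix."""
    matches = [full_id for full_id in known_ids if full_id.startswith(prefix)]
    if not matches:
        raise KeyError(f"no id starts with {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"{prefix!r} is ambiguous: {len(matches)} ids match")
    return matches[0]

# 3b66e09b0fa2 appears in these notes' examples; the other two are made up.
container_ids = ["3b66e09b0fa2", "3bd14c07a9e1", "9f02c6d5b1aa"]

print(resolve_prefix("9", container_ids))     # one character is enough here
print(resolve_prefix("3b6", container_ids))   # "3b" alone would match two ids
```

How short a prefix can be depends entirely on what else is present on that host, so short prefixes are fine for typing but not something to put in scripts.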


IDs aren't ideal for larger-scale management, so

  • for images you often want aliases to images, e.g. those used by repositories
  • For containers there are names. The automatically generated ones (which look like unruffled_curran) are meant to make it easier for humans to communicate about them (than hexadecimal numbers are).
You can give them your own meaningful names -- but note they must be unique (so at scale you need some scheme)


See More on tags below, which matters when you make your own repository.



A registry is a particular site that hosts images - defaults to docker hub, and is the place where names resolve, letting you do:

docker pull bitnami/rabbitmq



Introduction by example

Install docker

Doing so via package management often does 90% of the setup work.


You may want to give specific users extra rights, so that you don't need to do things as root (or via sudo). ...but for a quick first playing around the latter's fine, and I'm doing it in the examples below.


Instantiating things

As a test:

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# ps faux
root         1  0.0  0.0  18164  1984 ?        Ss   12:09   0:00 bash
root        15  0.0  0.0  15560  1104 ?        R+   12:11   0:00 ps faux

What happened:

  • it found an image called ubuntu (downloaded it from docker hub if not present yet)
  • instantiated the image to a container, using its /bin/bash as entry point (main process), and that 3b66e09b0fa2 is the hostname
  • we manually ran ps within it
...to demonstrate that the only processes inside right then are that shell and that command


Notes:

  • The entry point is the main process, and also determines the container lifetime: once that quits, the container stops
    • You would generally use an independent, long-running command. Exactly what is up to you.
    • It is also valid to see a container as half of an OS, with a stack of software, plus perhaps a ssh daemon, and sometimes that makes sense. Yet in many cases it is cleaner to go as minimal as you can.
    • In the microservice philosophy, the main process is often the actual service
    • If there are multiple things that must run inside, you would often have it be a process monitor
    • note that the main process's stdout will go to the logs
    • note: docker kill kills the main process, and thereby the container.
    • In this example the main process is bash, purely as a "prove it's a separate thing" example, which means the container only lives until you log out.
    • also note that -i -t, for 'interactive' and 'allocate a tty', are only necessary because we want an interactive shell, which is not typical
  • If you didn't have the ubuntu image, then ubuntu:latest would have been downloaded from docker hub.
  • by default, the container id also becomes its hostname
  • the full container id is long. Most places where docker prints one or wants one need only a few bytes (docker usually shows six bytes, twelve hex characters), because that's almost always unique


Status and cleanup

Containers

See running container instances on the host:

  • docker ps to list running containers
  • docker ps -a to list running and old containers
CONTAINER ID     IMAGE            COMMAND       CREATED          STATUS           PORTS      NAMES
3b66e09b0fa2     ubuntu:latest    "bash"        9 seconds ago    Up 8 seconds                drunk_mestorf

Once you're done with a container

In testing, you often run containers in a way that means their state sticks around when they stop, in case there is something you want to figure out (in production, you will often be logging important stuff elsewhere, and can often use --rm, meaning all their state is removed when they stop).

So you would want to clean up stopped containers, either specifically

docker rm containerid

or all stopped containers:

docker container prune


Further shell-fu:

  • Stop all containers
docker stop $(docker ps -a -q) 



Images

See existing images on the host:

  • docker images lists present images
  • docker images -a includes non-tagged ones
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
<none>                <none>              eda5290e904b        2 months ago        70.5MB
ubuntu                bionic-20190515     7698f282e524        3 months ago        69.9MB


If you're done with images:

  • remove image by id (or tag), e.g. based on an entry from docker images:
docker rmi ids
It'll refuse when something currently depends on it


Building images builds a lot of intermediates that will be cached. Once you start building images, you can clean those up by cleaning all dangling images (meaning any that you have not given tags)

docker image prune


Further shell-fu:

  • bulk remove images via tag wildcard: consider things like
docker rmi $(docker images --filter=reference="user/spider:22*" -q)
note that not thinking may mean you do docker rmi $(docker images -a -q), i.e. remove all (unused) images


Wider

You may like docker system prune, which is roughly equivalent to all of:

docker container prune
docker image prune (with optional --all, see above)

and also

docker network prune
docker volume prune (if you add --volumes)


Image building example


Dockerfile example

A Dockerfile is a sequence of actions, that each creates a new layer


There is obviously a lot more to Dockerfiles and building, but to give an idea of how unassuming builds could be, consider:


FROM phusion/baseimage:master-386
# Which is ubuntu-based, see https://github.com/phusion/baseimage-docker

# update and install in one step, so that a cached (stale) package index is never reused for a later install
RUN apt-get update && apt-get install -y apache2 libapache2-mod-php

ADD my_software.tgz /opt
RUN echo 'export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/my_software/lib' >> /etc/profile
RUN echo 'export PATH=${PATH}:/opt/my_software/bin' >> /etc/profile

# The point of this example is software intended to be interactive via SSH -- WHICH IS ATYPICAL
# Enable SSH (specific to phusion/baseimage)
RUN rm -f /etc/service/sshd/down
# generate SSH host keys now (so they change every build; leaving this out would generate them at each boot)
RUN /etc/my_init.d/00_regen_ssh_host_keys.sh

RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# baseimage's init
CMD ["/sbin/my_init"]

Given

  • you have a subdirectory called example_dir/
  • the above content is in a file called example_dir/Dockerfile
  • then you can build like docker build example_dir -t example_imagename
if successful it's runnable e.g. like docker run -t -i example_imagename:latest /bin/bash

Compose

How does (x) work?

On container communication

Networking types


Notes:

  • the --add-host argument to docker run can be useful to augment the /etc/hosts inside.




bridge

When you do not specify --network, a container is connected to the default bridge

This looks something like:

docker0 on the host side
with a subnet like 172.17.0.0/16 (by default, can be changed)
each container gets
a host-side interface (like veth8d404d9)
bridged onto that docker0
a container-side interface (like eth0)
an IP on its subnet (via docker's internal DHCP(verify))


This means all containers can communicate with each other, but only by IP. They can also reach the host-side interface and, assuming IP forwarding is on, things outside the host, because 172.17.0.1 is the default gateway inside.
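The addressing on the default bridge can be checked with Python's ipaddress module. This is a small sketch of the arithmetic only; that containers get the next addresses in order is an assumption made for illustration, not something these notes verify.

```python
import ipaddress
from itertools import islice

# The default bridge subnet mentioned above.
bridge = ipaddress.ip_network("172.17.0.0/16")

# Usable host addresses, in ascending order.
hosts = iter(bridge.hosts())

# The first usable address is the one docker0 has on the host side,
# which containers see as their default gateway.
gateway = next(hosts)

# Assumption for illustration: containers are handed the following addresses in order.
first_containers = list(islice(hosts, 3))

print(gateway)
print([str(address) for address in first_containers])
print(ipaddress.ip_address("172.17.0.5") in bridge)   # is this address on the bridge subnet?
```

The same membership test is a quick way to tell whether an address you see in logs belongs to a container on the default bridge or to something else.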


While it's nice to have a default that just works, the default bridge is now considered legacy and is not recommended, for mostly pragmatic reasons.

User-defined bridge networks are mostly the same idea, but referring to a specific network (by name). This lets you create one for each set of related containers.

(Containers can be members of several)

It also adds a built-in DNS server (at 127.0.0.11), letting containers resolve each other by name, or alias, relieving you of a bunch of manual configuration / hardcoding burden.

docker network create, docker network rm, docker network disconnect, docker network connect ...though


host

container gets the host's stack, and interfaces
port mapping (-p, -P) does not apply
hostname is set from the host
has its own UTS namespace so you can change its hostname without affecting the host
not very secure, so generally not what you want for isolation
but better performance than bridge
so can be useful for local apps
not supported on osx or windows

overlay

distributed network among docker-daemon hosts - swarms
containers can be members of several


macvlan

https://docs.docker.com/network/macvlan/


container

none

No networking beyond loopback

https://docs.docker.com/network/none/


Exposing to host

On name resolution

Resource management

How does storage work?

Container state

Bind mounts, data volumes

On permissions

VOLUME in Dockerfiles

Databases

Limiting CPU, memory

Varied advice

On image building

There are two basic ways to build an image:

  • manually: start with something close to what you want, make the changes you want
saving this container = saving all filesystem changes within it
good for a one-time quick fix, less ideal for some habits you'll need at larger scale
  • automate: write a dockerfile
docker build creates an image from a dockerfile - basically from a series of commands
faster to transfer, since images are cached
https://docs.docker.com/engine/reference/builder/


The hard way: manually

Say we want to build on the ubuntu image to make a web server image

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# apt-get install apache2
# ...and whatever else

To verify the difference to the original image, we can do (note the hash id will be different for you):

root@host# docker diff 3b66e

Which'll print ~1200 filenames. Fine, let's roll that into a new image:

root@host# docker commit 3b66e my_ubuntu_apache
cbbb61030ba24dda25f2cb27af41cc7a96a5ad9d23ef2bb500e9eaa2e16aa44d

which now exists as an image:

root@host# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
my_ubuntu_apache    latest              cbbb61030ba2        About a minute ago   202.7 MB
ubuntu              latest              b7cf8f0d9e82        2 weeks ago          188.3 MB

Notes:

  • commits are to your local registry. (Don't worry, you won't accidentally write on docker hub (or elsewhere) until you are registered, you are linked, and do a docker push)
  • a commit is actually more like a snapshot / layer
  • if you want things reusable and configurable, there are various good practices about how configuration and environment is best handled. You'll read up when you need it.


You can do it this way, but for complex and/or reproducible things it's nicer to automate this:


The usual way: Dockerfiles


Dockerfiles are a recipe-like sequence of operations, with caching of what it knows it's already done (but no dependencies, so not so much like makefiles).
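A simplified model of that caching, as a Python sketch: each step's cache key covers its own text plus the key of the step before it, so changing one step also invalidates everything after it. This is an assumption-laden illustration, not docker's implementation; in particular the real builder also looks at file contents for ADD and COPY, which this model ignores.

```python
import hashlib

def cache_keys(steps):
    """Chain one key per step: each key covers the step text and the previous key."""
    keys, parent = [], ""
    for step in steps:
        parent = hashlib.sha256((parent + "\n" + step).encode()).hexdigest()
        keys.append(parent)
    return keys

def steps_to_rebuild(old_steps, new_steps):
    """Return the steps in new_steps that miss the cache left by building old_steps."""
    cached = set(cache_keys(old_steps))
    return [step for step, key in zip(new_steps, cache_keys(new_steps)) if key not in cached]

old = [
    "FROM ubuntu",
    "RUN apt-get update",
    "RUN apt-get install -y apache2",
    "ADD my_software.tgz /opt",
]
new = [
    "FROM ubuntu",
    "RUN apt-get update",
    "RUN apt-get install -y apache2 libapache2-mod-php",
    "ADD my_software.tgz /opt",
]

for step in steps_to_rebuild(old, new):
    print("rebuild:", step)
```

Note what the model does not know: the text of RUN apt-get update is unchanged, so it counts as cached, even though the package repositories it talks to may have changed since. That is the "no dependencies" part.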

Dockerfiles have a context -- basically a complete set of files you refer to. (The point of contexts is basically that docker is networked, meaning the daemon need not be running on the same host as the client, so this set of files needs to be well defined so that it can be transferred somehow.)


You build an image either like

docker build directory

OR

docker build URL

When you give it a directory, the context is the Dockerfile and other files in that directory.

In the URL case it can be

  • a plain text file as Dockerfile (there is no further context)
  • a pre-packaged tarball context (fetched, uncompressed, then fed as a directory), so reducing to the directory case
  • a Git repository - basically fetched and then fed in as a directory, the Dockerfile being part of that


The only required part is the Dockerfile, further contents just make builds easier (and self-contained things, rather than something you'd have to curl/wget from a place that needs to be available).

When using files from the context, use relative paths.



multi-stage builds

You can refer to earlier stages by number

FROM fork:v2

# compiley stuff here

FROM spoon
COPY --from=0 /foo/bar /foo


It may be clearer to name them, though:

FROM fork:v2 AS fork

# compiley stuff here

FROM spoon
COPY --from=fork /foo/bar /foo


It's an alternative to another approach people have asked for, namely to allow dockerfiles to include other dockerfiles.

This is conceptually simpler in terms of building related images, though would make it a more complex build system that people would probably make a mess of.



https://docs.docker.com/develop/develop-images/multistage-build/


Practical notes

general habits and tips

Some common bits

DEBIAN_FRONTEND=noninteractive

...On debian / ubuntu. The options to this are readline, dialog/whiptail, or noninteractive, and the last is interesting for automatic installs in that it will never pause on a question.

As-is it will often choose a hopefully-sensible default where possible (e.g. UTC timezone).


If you want non-defaults, or things that are site-specific, or need to answer the occasional thing that really is a free-form field, you would use the above and store these values in the debconf database. Look around for guides with commands involving debconf-set-selections


see man debconf and man debconf-set-selections for more details


apt --no-install-recommends

tl;dr: Recommended packages are the things you probably want installed too. By saying "no, I'll do that explicitly", you can often remove a few things and slim down the image somewhat.


Recommended packages are those that you often probably want too. Weaker than Depends, stronger than Suggests. It's a gray area, but recommendations are the things most people probably actually want, for a thing to live up to expectations.

By default, recommended packages are installed. Disabling that effectively treats them as suggestions.


Examples:

gcc recommends libc-dev, and suggests things like autoconf, automake, flex, bison,
a media player might recommend the ability to play some less-common media sources, and suggest some rarer ones,
TeX may recommend a bunch of modern packages that most people use now,
a -dev package may well recommend the according -doc package,
things in general may recommend or suggest some contrib stuff.

To get more of an idea, try things like:

apt-cache depends gcc

Minimal images

Build systems?

Docker is something of a build system itself, in that you can generate consistent build environments, and a pretty succinct one at that.


At the same time, dockerfiles usually rely on other build systems, and don't think about versioning, so a lot of dockerfiles will not build years after.

Dammit, people, you just re-invented DLL hell while pretending to fix it.


Sooo. Mixed.

Complex with and without: Versions and config

The microservice idea

"avoid sshd"


There is nothing inherently considered-evil about sshd.

Yet there are still good reasons to avoid it. It comes down to cost/benefit.


Points against adding sshd:

  • It introduces some security and management considerations
How do you deal with sshd vulnerabilities?
Who patches them? How? How quickly?
Have you considered that without considering a security model, it may be a vector into your swarm?
How do you manage ssh's config?
How do you keep the keys/password safe?
Is that key unique to an instance, or to other things you've made? In theory, or in practice?
Are those considerations worth it?, particularly given...
  • common "but I like it for X" arguments tend to not really need it. Consider:
    • getting a shell inside? (for image/config changes) You can do that like docker exec -it containerid bash (seems to be based on the earlier nsenter), and for some purposes docker attach
    • restarting services?
      • you can often use signals
      • in one-process microservices, restarting the container is the same
      • in multi-process microservices you often want a monitor tool inside anyway
    • for backup? - Your data should be on a docker volume or mounted, meaning you can usually back that up (only rarely do you need to run a dump tool from inside, and again, that can probably be done via docker exec)
    • to read logs? - If they are important, you should put them on a volume (see anonymous volumes for one solution), or use network logging (which in a swarm you may want/have anyway)


Points for adding sshd

  • during some development
because sometimes the exact shell environment given by a ssh login better reflects what the main process would see (than a docker-exec'd bash would)
nsenter and docker exec require login and the relevant permissions on the docker host, ssh in the container does not
  • dockerizing a CLI (or even GUI) program for "I use this as a VM for isolating dependencies, and people's storage" reasons
take a look at LXC, though
  • you can give people access to something isolated like that without having to be given what amounts to docker-admin-ish permissions on the host
  • You don't have access to the docker host OS (to run docker exec and such)
(same thing from the user side, really)



More on tags

Going public

Build systems

Practical / semi-sorted

docker security

GUI apps in docker

X

VNC, RDP

sound in dockerized apps

CUDA in dockerized apps



Operating docker

run

automation

compose

fancier than compose

SELinux and docker

Semi-sorted

Where does the host store things?

Better known as "argh how do I make it not fill up my system drive?"


Linux:

  • Most things are stored in /var/lib/docker
    • most of the size will be in
      • the image filesystem layers (aufs, or overlay2 on newer installs)
      • data volumes
    • If you move the whole thing and then symlink to it, stop the daemon while you do that, or it'll get confused
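
To see which part is actually eating the space before moving anything, a minimal sketch like the following sums file sizes per top-level subdirectory. It assumes you run it as root (the directory is usually not readable otherwise), and the numbers are only rough: it counts apparent file sizes, and mounted layers of running containers can end up counted more than once.

```python
import os


def subdir_sizes(root):
    """Map each immediate subdirectory of root to the total bytes of files below it."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        # os.walk does not follow symlinked directories by default
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        sizes[entry.name] = total
    return sizes


# Hypothetical usage, largest first:
#   for name, size in sorted(subdir_sizes("/var/lib/docker").items(),
#                            key=lambda kv: kv[1], reverse=True):
#       print(f"{size / 1e9:8.2f} GB  {name}")
```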

"Cannot connect to the Docker daemon. Is the docker daemon running on this host?" or "Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock"

Typically means you don't have permissions to /var/run/docker.sock (the first message can also mean the daemon simply isn't running).

It is often owned by root:docker, so while sudo works, the cleaner solution is to add the user that should do this management to the docker group.
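
To check which of these cases you are in, a small sketch along these lines looks at the socket itself. The function name and the messages are just illustrative, and it only inspects the default local socket path, so it says nothing about remote daemons or a DOCKER_HOST override.

```python
import grp
import os


def diagnose_socket(path="/var/run/docker.sock"):
    """Return a short explanation of whether the current user can use the socket."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return "missing: is the daemon running?"
    except OSError as exc:
        return f"unreadable: {exc}"
    # Talking to the daemon needs both read and write access to the socket
    if os.access(path, os.R_OK | os.W_OK):
        return "ok"
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # gid without a name on this system
    return f"denied: socket belongs to group {group!r}; join that group or use sudo"


if __name__ == "__main__":
    print(diagnose_socket())
```

Note that a newly added group membership only applies to sessions started after the change, so you typically need to log out and back in before this reports ok.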


"not found: manifest unknown: manifest unknown"

Usually means the tag you asked for does not exist, often because you gave no tag and the repository has no :latest.


There are decent arguments that once you have multiple targets, you should probably avoid a :latest so that it doesn't refer to an arbitrary choice.

Say, the Dockerfile example above used to have

FROM phusion/baseimage

which has a :latest implied (that's how tagging works)

...but once it covered more ground, it made a lot more sense to have e.g.

phusion/baseimage:master-386
phusion/baseimage:master-arm
phusion/baseimage:master-arm64
phusion/baseimage:master-ppc64le
phusion/baseimage:master-ppc64be

and not use :latest to refer to just an arbitrary one of them.
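
The implied :latest can be illustrated with a small sketch of the defaulting rule. This is a simplification written for these notes, not docker's own parser: it only decides whether a tag or digest is already present, and it deliberately looks at the last path component so a registry port is not mistaken for a tag.

```python
def with_default_tag(reference, default="latest"):
    """Return the reference with ':latest' appended when no tag or digest is given."""
    if "@" in reference:
        # Pinned by digest (name@sha256:...), so no tag is implied
        return reference
    last_component = reference.rsplit("/", 1)[-1]
    if ":" in last_component:
        # An explicit tag is already there
        return reference
    return f"{reference}:{default}"


for ref in ("phusion/baseimage",
            "phusion/baseimage:master-arm64",
            "localhost:5000/app"):  # hypothetical private registry
    print(ref, "->", with_default_tag(ref))
```

So a bare FROM phusion/baseimage asks for phusion/baseimage:latest, and fails with the error above if nobody published that tag.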



The aufs storage-driver is no longer supported

Docker for windows

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

As far as I can tell (and yes, this gets confusing)...


Originally, 'docker on windows' meant "Sure you can run linux containers on windows:

  • buy windows Pro or Server because you'll need Hyper-V
  • install VirtualBox on windows
  • run a recent linux in that
  • and install docker in that"

After some time, tooling was introduced so that the last three steps would be done for you.


Since the VM overhead appears only once, instances within it work as you'd expect, and you don't need two servers to run windows+linux stuff. If you build systems that are part linux, part windows, and tightly integrated, you might actually get some upsides (say, lower latency) over doing that on two hosts.

...so it's quite a valid solution to a number of use cases.


The confusion only started when MS later said "Now we can run docker containers natively" and 'does not require VirtualBox'.

What they meant was "linux had a good idea, we have added similar in-kernel isolation in windows, so now we can also do windows-only containers".

They weren't really clear about that, but the result is useful, and valid for the same 'run things on the same metal' reason as before. All fine.


The confusing part is that they called it docker.

Because after typing docker run, there is zero overlap in what happens, in what makes docker linux and docker windows actually run.


To be a little more complete, MS effectively extended the situation from one or two to roughly six distinct concepts:

  • running linux containers on linux (= what docker meant initially)
  • running linux containers on windows within a VM (= typical if you want the combination)
  • running linux containers on windows natively (MS have said explicitly this won't happen)
    • No, WSL cannot do that - it's a translation layer (basically reverse wine), specifically without any linux kernel code. While intended to be thinner than wine, e.g. the filesystem performance is noticeably worse.
    • WSL2 is closer to linux, but not to containers, because it's actually a VM running a modified linux kernel. So at best it's a (fairly well provisioned) VM instead of containers, and the integration and provisioning differences matter. It seems a less obvious choice for servers / hosting, though neat for desktop and dev use.
  • running windows containers on windows (= "docker windows")
  • running windows containers on linux via a windows VM (you could, but I don't suspect many people do this)
  • running windows containers on linux natively (won't happen)


So in the end, the only real similarities between docker linux and docker windows are

  • building from Dockerfiles (the ability to do so, anyway - the syntax and details will differ)
  • the docker command (tooling was updated to deal with the mix)


It's confusing (see tech forums full of questions) to the point it seems like bad marketing.

And that's not like MS, so people wondered what their long term plan is here.

It's not that MS doesn't understand the concept or benefits of a lightweight secure VM - they implemented something similar.

It's not that MS are bandwagoning for some open-source street cred (again).

It's probably that they wanted to be some sort of player in a market that might otherwise exclude them.


Various people have argued that it's not only not bad marketing, it's actually cleverly surreptitious marketing: making people think they are not committing to a specific tech, while actually leading them to do so:

That is, docker linux was a "one thing that runs everywhere you can install docker".

Docker windows+linux is not that in general, but kinda-maybe on windows.

If you get people to consider docker as a mixed-OS thing ("look at the tooling managing both!"), then you can squint and say windows server does it better (who cares if that's because they exclude the other option).

If you get people to see it as a windows-based application packager, then the next server your business buys may be a licensed windows server for "you never know, it will run all these hip dockery things" reasons.

Unsorted