Container and docker notes


Notes related to (mostly whole-computer) virtualization, emulation and simulation.


Containers intro

Containers as opposed to things like it

Containers are not emulation, they are not VMs, they are not even a hypervisor-style VM.


A single container is an (often small) collection of processes that runs on the host's linux kernel just like any other processes, yet which happen to be isolated from the rest of the host and from other containers in all the ways that matter to them being independent and not trampling on other things (and, in theory, for security, but there are some footnotes).


Compared to stuff like it

Containers are more lightweight than things like VMs, mostly because

a classical VM must virtualize the hardware (while, roughly speaking, docker virtualizes only the OS)
(so) a classical VM requires you to exclusively reserve resources like RAM and disk space before starting, and this typically cannot be changed while running.
In docker you're being allocated RAM and disk from the host OS as you go -- with configurable limits
which makes it a little harder to limit the resource use for a user/program - an upside and a downside.
a classical VM has to boot its OS. Docker does not; it's immediately ready to start the processes it should run.


For some more background and comparison, see linux containers.


Containers work since roughly linux kernel 3.13 (not before, because they lean heavily on then-recent kernel features like namespaces; cgroups are also relevant).


There are comparable things, e.g. SmartOS had been doing this before Linux, though in that case for reasons of operational efficiency and security, whereas docker seems more focused on deploying microservice-style things (one or a few processes).


How to see containers

What does docker add to the concept of linux containers?

You can see containers as just the thing inside, the thing that should run.

...which barely addresses

how that container came to exist,
how that container relates to the outside - storage, networking, etc.


There are a number of answers to both of those.

  • Docker is one of them.
  • Kubernetes is another, which adds more layers of abstraction that make more management sense in larger setups (but is also more complex).
  • LXC is another, which has an entirely different take, trying to have the inside look mostly like a full OS, something that most people using docker specifically avoid. Not because it's bad, but because it's a different use and purpose.


For a long time, docker was the main way to make containers, and if you had no grander plans, it was just another command or two from also running them.

So its stack defined the way things were typically done.

Today, things are... more flexible and more confusing.

What are containers useful for?

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Depends on who you ask. Angles include:


Portability

The isolation from the host means the container will run the same everywhere, regardless of your linux variant, OS's libraries, hardware, and such.
It will run identically in any place that it will run at all. Which can be useful, for multiple reasons.
(Like a VM does, but more lightweight than a VM)
(Like self-contained applications, like portable software or app images or other names like that, but in a perhaps cleaner way)


Stable development environment, optionally automated deployment

As an extension of the above, you can code in a way that reduces the "well, it builds/works on my host, why not for you/on the servers?" issues.
Yes, there are other ways to isolate a build environment, or describe one fully enough.
No, docker will not do that by itself - and in fact doing this properly is entirely on you and good habits - but it makes it a bunch simpler.
If you set up triggers: testing, building an image, and deploying it on a dev server, this is easier to automate if it's all done the same way, and less likely to trip over random stuff
(often based on a push to a code versioning system, because that's a convenient way of doing it. But also because docker is not a software build system, e.g. changes in code do not invalidate the build cache).
Easy and fast container startup is a nice detail for automated testing of a new version, both of the unit/regression sort and of some amount of larger deployment. But also valid on just a single host.
can make debugging "but it works-for-me" issues easier



Large-scale deployment

In clusters and swarms it's very handy to have exactly reproducible software environments.
With classical clusters you end up with relatively fine management of each node, and care to keep them consistent.
With containers you basically don't have to think about that at all (beyond the communication, but that's always going to be there).
and because we've separated system environment from app environment, admins can now update the host OS without having to worry, every time, whether that might break some or all of the apps due to some random library dependency.


Sane admin on multitenant servers

If you have servers with multiple users with conflicting needs in terms of installed packages, you as an admin will have a better time if you just give them each their own docker container.


Separating apps to relieve dependency hell

A long time ago, executables were often monolithic things that controlled everything - but were too close to the hardware. Modern systems have applications live in an ecosystem of tens or hundreds of libraries. The larger said ecosystem, and the more applications a single ecosystem supports, the more friction and fragility there is between versions for different apps, and the harder it is to administer, e.g. more chance for "I tried to update X and now everything is broken in a way I can neither understand nor revert".
this is an ongoing argument -- e.g. one side says that trying to do that is the only way you can ever hope to do it well. Say, linux's .so files were historically better than Windows's DLLs in this respect, and were better at allowing multiple installed versions. Yet we're collectively forgetting how to do versioning well
so the easy fix is to reinvent app images: isolated dependencies not only from the OS, but also from other apps.
And docker (or VMs) is one option to keep all the other idiots away from your software environment. Docker is arguably overkill, and can be more bothersome (with the details of breaking out X, audio, the GPU, and such) - you may wish to look at snap instead (although I'm still not sure what to think about it)

When is it less useful?



When there is no reason to use it.

in that adding complexity or dependencies without reason is never a great idea
in that using it to distribute a single system is, by nature,
harder to design well
harder to debug,
harder to secure well (how to do it takes some re-learning, changes in thinking, e.g. least-privilege access),
can be harder to admin (e.g. restorable backups, monitoring)
if there are clear benefits, great. If there are no other clear benefits, this is only a time cost


If you think it's the hip scalability band-aid

Actual efficiency and scalability are properties of your design, not of the tool you use to implement that design
Any container/VM/management makes the deployment step potentially easier, none guarantee it


If you think it fixes dependency management

it fixes the "this combination of installations is impossible to satisfy issue" (by being app images), but addresses none of the other issues


If security is your main interest

sure, you get isolation by default, and that's several good steps just for free - but security was never the primary goal, and there are a handful of footnotes you must know
VMs are arguably better there, at least until security becomes a central focus for containers and docker
whether security updates are easier or harder to do around docker depends entirely on whether you thought about it before you started
whether lots of containers that must trust each other is better or worse depends entirely on your design and implementation


When you stop at docker

Some potential upsides, like automated setup and provisioning, aren't part of docker, they are part of what you wrap around it


Arguables and criticism

Dockerfiles move package management around

People use dockerfiles as a flexible way to install software.
As some people put it, "docker is to apt what apt is to tar." Which I think is supposed to highlight an improvement, but this is a double-edged thing.
nice: you can describe a well controlled and immediately buildable environment in one shortish dockerfile
but: the fact that docker's build system is very basic (even simpler than the fifty-year-old Make system) lets people be particularly bad at guaranteeing reproducible builds. Not docker's fault, but it makes it more likely the ecosystem becomes a hot mess
but: most dependencies, and dependency problems, are entirely external to docker
say, if you really use apt inside, then either you pin versions and know those will disappear at most a few years later, or you don't (exceedingly common) and you hope there is some way that will resolve. Spoilers: that will break too, in an only slightly longer timespan.
(The area I happen to work in tends to be worse than average for this -- but I found that most dockerfiles will not build, most commonly because a few years later there just is no solution for the same package tree, or the solution is actually incompatible with the actual software, so even if they build fine they break at execution time. I've also found plenty of binary builds that fail, seemingly because it's a lot easier to do automatic builds than to actually test them properly.)



The cynical view

Docker is arguably part of collectively totally giving up on the idea that a single package system can ever work - and reinvents app images instead (and not even well), so now we have 500MB~2GB app images of everything to wrap one 100KB executable (well, that describes snap better, but okay)
we are not great at making those builds stable over time
say, none of the package stuff is part of docker or related to it at all, so you have no guarantees and no control. Repositories change? Your docker build just stops working
dockerfile must be maintained, which happens only for some of the most used images. Most everything else just breaks after a few years.
and relying on binary images makes it even harder to do security updates than just installing the thing directly.
and while these base images could overlap in theory, they generally don't in practice. This isn't the thing that makes you go broke in server fees, but still.
And don't get me started on the unholy mess that is trying to get GPU stuff to work, with dockerfiles often breaking on software's version fragility (or broken URLs, because there was no package), or breaking because people managed to make them binary builds that fail on other GPUs - plus we have little control over the proprietary nvidia stuff bolted onto docker so if five years later it won't run, it may just never run again.


...all because we never solved half the problems we had, we just stirred them a bit.

So in multiple ways we are recreating dependency hell while actively pretending to solve it. Sigh.

Single-purpose

Good and bad ideas

What's this about runtimes?



In this context, runtimes are things you can run containers in.


An OCI (Open Container Initiative) compliant container runtime is one that can run an OCI compliant image.

OCI compliant runtimes include

runC / containerd
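runC is the low-level part that actually creates and runs a container: it sets up the namespaces and cgroups, and applies seccomp, selinux, or apparmor restrictions (e.g. for syscall filtering)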
containerd
wraps runC with a network API (containerd does a fork-exec of runC)
manages container lifecycle (image transfer/pull/push, supervision, networking, etc)
and is what makes this setup OCI compliant (verify)
docker daemon
dockerd
cri-o (as in Kubernetes, so google)
runs OCI compliant images (the -O stands for OCI)
CRI ('container runtime interface') in general was a move within kubernetes to allow mixing runtimes - it previously supported more than one but only one of them in any installation. CRI(-O?(verify)) means you can mix.
defaults to runc runtime(verify)
People made a kerfuffle when "Kubernetes dropped docker support" in 2020 but it just meant "we no longer use dockerd as a runtime, because running the same images via CRI-O instead makes things simpler" (it also means they can drop dockershim, the wrapper they had to keep around the ever-changing docker details)


rkt

CoreOS
on top of runC
not OCI?


gVisor - runC replacement

google
designed for more security - basically kernel API implemented in userspace?
https://www.usenix.org/system/files/hotcloud19-paper-young.pdf


nabla containers

IBM
doesn't run linux containers?
similar security approach to gVisor


kata containers

intel
contain kernel, so more of a VM, and slower startup
upsides are easier use in certain virtualised environments? (verify)
OCI compliant



podman, buildx, kaniko


https://medium.com/@alenkacz/whats-the-difference-between-runc-containerd-docker-3fc8f79d4d6e

https://www.ianlewis.org/en/container-runtimes-part-1-introduction-container-r

https://gist.github.com/miguelmota/8082507590d55c400c5dc520a43e14a1


Some container concepts

An image file is a complete environment that can be instantiated (run).

It amounts to a snapshot of a filesystem.
images are often layered on top of other images
which makes it easier to build (you can cache lower layers), and makes it easier to version-control each layer
and since layers are references (to things you also have) rather than copies, it makes heavily layered images smaller
For example's sake you can rely on the default fetching of image files from Docker Hub.
(you can create image files yourself, and will eventually want to)


A container is an instance of an image file - either running, or stopped and not cleaned up yet.

Note that the changes to files from the image are not written to that image. They are copy-on-write to container-specific state.
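
As a rough mental model of the layering and copy-on-write described above, here is a minimal Python sketch. It is only an analogy (docker does this at the filesystem level, and deletions are not modelled here); the layer contents and the new_container helper are made up for illustration.

```python
from collections import ChainMap

# Two read-only image layers, modelled as path -> content dicts (made-up contents).
base_layer = {"/etc/os-release": "ubuntu", "/bin/sh": "shell v1"}
app_layer = {"/opt/app/run": "app v1", "/bin/sh": "shell v2"}  # shadows /bin/sh


def new_container(*image_layers):
    """Return a container view: an empty writable layer on top of shared image layers.

    ChainMap looks keys up from the first mapping to the last, and sends all
    writes to the first mapping only, which is the behaviour being illustrated.
    """
    writable = {}
    # Upper layers are listed first so that they shadow lower ones.
    return ChainMap(writable, *reversed(image_layers))


c1 = new_container(base_layer, app_layer)
c2 = new_container(base_layer, app_layer)

c1["/opt/app/run"] = "patched in container 1"  # lands in c1's writable layer only

print(c1["/opt/app/run"])         # container 1 sees its own change
print(c2["/opt/app/run"])         # container 2 still sees the image's version
print(app_layer["/opt/app/run"])  # the image layer itself is untouched
print(c1.maps[1] is c2.maps[1])   # both containers reference the same layer object
```

The last line is the "references rather than copies" point: ten containers from one image cost one copy of the image plus ten (initially empty) writable layers.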

What docker adds to containers

Some docker concepts

Images have IDs, containers have IDs.

They show up in most tools as large hexadecimal strings (actually a smallish part of a larger sha256 hash, see --no-trunc for the full thing).
Note that you only need to type as many characters as it takes to make it unique within what you have (which may frequently be one or two characters)
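
The "only as many characters as make it unique" rule amounts to prefix matching. A small sketch of that idea, not docker's actual code; resolve_id and the IDs below are made up for illustration:

```python
def resolve_id(prefix, known_ids):
    """Return the single full ID starting with prefix, or raise ValueError."""
    prefix = prefix.lower()
    matches = [full for full in known_ids if full.startswith(prefix)]
    if not matches:
        raise ValueError(f"no such ID: {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"ambiguous prefix {prefix!r}: {len(matches)} matches")
    return matches[0]


# Made-up full IDs (64 hex characters each); tools usually show the first 12.
ids = [
    "3b66e09b0fa2" + "0" * 52,
    "3bd1c4a7e9f0" + "1" * 52,
    "cbbb61030ba2" + "2" * 52,
]

print(resolve_id("c", ids)[:12])    # one character is enough here
print(resolve_id("3b6", ids)[:12])  # "3b" alone would match two IDs
```

With only a handful of containers around, one or two characters usually suffice; in scripts you would rather use names or full IDs.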


IDs aren't ideal for larger-scale management, so

  • for images you often want aliases to images, e.g. those used by repositories
  • For containers there are names. The automatically generated ones (look like unruffled_curran) are meant to make it easier for humans to communicate about them (than hexadecimal numbers are).
You can give them your own meaningful names -- but note they must be unique (so at scale you need some scheme)


See "More on tags" below, which matters when you make your own repository.



A registry is a particular site that hosts images - defaults to docker hub, and is the place where names resolve, letting you do:

docker pull bitnami/rabbitmq
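
To make "the place where names resolve" a little more concrete: a short name is expanded with a default registry, a default library/ namespace for one-word names, and a default latest tag. The following is a simplified sketch of those rules as I understand them, not docker's own parser (it ignores digests and does no validation); the normalize function and the localhost:5000 registry are made up for illustration.

```python
def normalize(ref):
    """Expand a short image name to (registry, repository, tag).

    Simplified: ignores digests (name@sha256:...) and does no validation.
    """
    first, slash, rest = ref.partition("/")
    # The part before the first slash counts as a registry host only if it
    # looks like one: contains a dot or a port, or is literally "localhost".
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref
    # A tag is whatever follows a colon in the last path component.
    head, sep, last = remainder.rpartition("/")
    name, colon, tag = last.partition(":")
    repository = head + sep + name
    if not colon:
        tag = "latest"
    # One-word names on docker hub live under library/.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return registry, repository, tag


for ref in ["ubuntu", "bitnami/rabbitmq", "ubuntu:bionic-20190515",
            "localhost:5000/user/spider:22.1"]:
    print(ref, "->", normalize(ref))
```

So docker pull bitnami/rabbitmq really asks docker hub for bitnami/rabbitmq with tag latest.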



Introduction by example

Install docker

Doing so via package management often does 90% of the setup work.


You may want to give specific users extra rights, so that you don't need to do things as root (or via sudo). ...but for a quick first playing around the latter's fine, and I'm doing it in the examples below.


Instantiating things

The following is a one-liner test of it functioning at all -- and not how you would really create things in practice. More on that later.

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# ps faux
root         1  0.0  0.0  18164  1984 ?        Ss   12:09   0:00 bash
root        15  0.0  0.0  15560  1104 ?        R+   12:11   0:00 ps faux

What happened:

  • it found an image called ubuntu (ubuntu:latest, actually -- more on that in tags section)
if this was your first run, it downloaded that from docker hub first
  • instantiated the image to a new container
  • ran /bin/bash as its entry point (main process)
and that 3b66e09b0fa2 is the hostname (don't worry about that for now)
  • we manually ran ps within it
...to demonstrate that the only processes inside right then are that shell and that command


Notes so far:

  • The entry point is the main process, and also determines the container lifetime: once that quits, the container stops
99% of the time, the entry point is not bash, but an independent, long-running command -- e.g.
something that is the parent of everything else in the container, and watches that those processes are fine
or just the one process we wanted to run in there
exactly what you want to put in there, and why, is a discussion of its own
in this case we ran bash primarily because you can look around inside and convince yourself it's entirely separate from the host you're running it on
also note that -i -t, for 'interactive' and 'allocate a tty', are only necessary because we want an interactive shell, which is not typical. Running the image without them would also work -- and quit immediately, because it's only meant as something for you to extend.
  • the full container id is long, but it's rarely even visible -- most places where docker prints one or wants one needs only a few bytes (docker usually shows six bytes, in twelve hex characters), because that's almost always unique
  • by default, the container id also becomes its hostname

getting a shell inside

Assuming your image has a /bin/bash, you can

docker exec -it containerid_or_name /bin/bash


Status and cleanup

Containers

See running container instances on the host:

  • docker ps to list running containers
  • docker ps -a to list running and old containers

Assuming that last example is still running in a terminal somewhere, docker ps would show something like:

CONTAINER ID     IMAGE            COMMAND       CREATED          STATUS           PORTS      NAMES
3b66e09b0fa2     ubuntu:latest    "bash"        9 seconds ago    Up 8 seconds                drunk_mestorf


Once you're done with a container

In testing, you often run containers in a way that means their state sticks around when they stop, in case there is something you want to figure out (in production, you may often want them to be stateless: use --rm (remove state after stopping), and log important stuff elsewhere).

So you would want to clean up stopped containers, either specifically

docker rm containerid

or all stopped containers:

docker container prune


Further shell-fu:

  • Stop all containers
docker stop $(docker ps -a -q) 


Images

See existing images on the host:

  • docker images lists present images
  • docker images -a includes non-tagged ones, e.g.
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
<none>                <none>              eda5290e904b        2 months ago        70.5MB
ubuntu                bionic-20190515     7698f282e524        3 months ago        69.9MB


If you're done with images:

  • remove image by id (or tag), e.g. based on an entry from docker images.
docker rmi ids
It'll refuse when something currently depends on it


Building images builds a lot of intermediates that will be cached, so once you start building images, you probably care that you can clean all images that have not been given tags ("dangling images")

docker image prune


Further shell-fu:

  • bulk remove images via tag wildcard: consider things like
docker rmi $(docker images --filter=reference="user/spider:22*" -q)
note that not thinking may mean you do docker rmi $(docker images -a -q), i.e. remove all (unused) images

Wider

You may like docker system prune, which is roughly equivalent to all of:

docker container prune
docker image prune (with optional --all, see above)

and also

docker network prune
docker volume prune (if you add --volumes)

Image building example


Dockerfile example

A Dockerfile is a sequence of actions. Each action creates a new layer.

There is obviously a lot more to Dockerfiles and building, but to give an idea of how unassuming builds could be, consider:


FROM phusion/baseimage:master-386
# Which is ubuntu-based, see https://github.com/phusion/baseimage-docker

# update, install and clean up in a single RUN, so that a cached 'update' layer
# cannot go stale relative to the install, and the apt lists never end up in a layer
RUN apt-get update \
 && apt-get install -y apache2 libapache2-mod-php \
 && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

ADD my_software.tgz /opt
RUN echo 'export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/opt/my_software/lib' >> /etc/profile
RUN echo 'export PATH=${PATH}:/opt/my_software/bin' >> /etc/profile


# The point of this example is software intended to be interactive via SSH -- WHICH IS ATYPICAL
# Enable SSH  (specific to phusion/baseimage)
RUN rm -f /etc/service/sshd/down
# generate SSH host keys at build time (so they change every build; leaving this out would do it at each boot)
RUN /etc/my_init.d/00_regen_ssh_host_keys.sh

# baseimage's init
CMD ["/sbin/my_init"]

Given

  • you have a subdirectory called example_dir/
  • the above content is in a file called example_dir/Dockerfile
  • then you can build like docker build example_dir -t example_imagename

Then if successful it's runnable e.g. like docker run -t -i example_imagename:latest /bin/bash

automatic restarting

Compose

On container communication

Networking types


Notes:

  • the --add-host argument to docker run can be useful to augment the /etc/hosts inside.




bridge

When you do not specify --network, a container is connected to the default bridge

This looks something like:

docker0 on the host side
with a subnet like 172.17.0.0/16 (by default, can be changed)
each container gets
a host-side interface (like veth8d404d9)
bridged onto that docker0
a container-side interface (like eth0)
an IP on its subnet (assigned by docker's own address management (IPAM) rather than by DHCP)


This means all containers on it can communicate with each other, but only by IP. They can also reach the host-side interface and, assuming IP forwarding is on, things outside the host, because docker0's address (172.17.0.1 by default) is the default gateway inside.
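
The addressing on the default bridge can be worked through with Python's ipaddress module. This is only arithmetic on the default subnet mentioned above; that container addresses are handed out in order after the gateway is an assumption made for illustration.

```python
import ipaddress
from itertools import islice

bridge = ipaddress.ip_network("172.17.0.0/16")  # the default subnet mentioned above
hosts = iter(bridge.hosts())                    # usable addresses, lowest first

gateway = next(hosts)                # docker0 on the host side
containers = list(islice(hosts, 3))  # ASSUMPTION: handed out sequentially

print("gateway:   ", gateway)
print("containers:", [str(ip) for ip in containers])
print("usable addresses:", bridge.num_addresses - 2)  # minus network and broadcast
```

A /16 is far more addresses than one host will ever need, which is part of why changing the default subnet (e.g. when it collides with an existing LAN range) is rarely a capacity problem.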


While it's nice to have a default that just works, the default bridge is now considered legacy and is not recommended, for mostly pragmatic reasons.

User-defined bridge networks are mostly the same idea, but referring to a specific network (by name). This lets you create one for each set of related containers.

(Containers can be members of several)

It also adds a built-in DNS server (at 127.0.0.11), letting containers resolve each other by name, or alias, relieving you of a bunch of manual configuration / hardcoding burden.

The relevant commands are docker network create, docker network rm, docker network connect, and docker network disconnect.


host
container gets the host's network stack and interfaces
port mapping (-p, -P) does not apply
hostname is set from the host
has its own UTS namespace so you can change its hostname without affecting the host
not very secure, so generally not what you want for isolation
but better performance than bridge
so can be useful for local apps
not supported on osx or windows
overlay
distributed network among docker-daemon hosts - swarms
containers can be members of several


macvlan

https://docs.docker.com/network/macvlan/


container
none

No networking beyond loopback

https://docs.docker.com/network/none/


Exposing to host

On name resolution

Resource management

How does storage work?

Container state

Bind mounts, data volumes

On permissions

VOLUME in Dockerfiles

Databases

Limiting CPU, memory

On image building

There are two basic ways to build an image:

  • manually: start with something close to what you want, make the changes you want
saving this container = saving all filesystem changes within it
good for a one-time quick fix, less ideal for some habits you'll need at larger scale
  • automate: write a dockerfile
docker build creates an image from a dockerfile - basically from a series of commands
faster to transfer, since images are cached
https://docs.docker.com/engine/reference/builder/


The hard way: manually

Say we want to build on the ubuntu image to make a web server image

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# apt-get update && apt-get install apache2
# ...and whatever else

To verify the difference to the original image, we can do (note the hash id will be different for you):

root@host# docker diff 3b66e

Which'll print ~1200 filenames. Fine, let's roll that into a new image:

root@host# docker commit 3b66e my_ubuntu_apache
cbbb61030ba24dda25f2cb27af41cc7a96a5ad9d23ef2bb500e9eaa2e16aa44d

which now exists as an image:

root@host# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
my_ubuntu_apache    latest              cbbb61030ba2        About a minute ago   202.7 MB
ubuntu              latest              b7cf8f0d9e82        2 weeks ago          188.3 MB

Notes:

  • commits go to your local image store. (Don't worry, you won't accidentally write to docker hub (or elsewhere) until you are registered, you are logged in, and do a docker push)
  • a commit is actually more like a snapshot / layer
  • if you want things reusable and configurable, there are various good practices about how configuration and environment is best handled. You'll read up when you need it.


You can do it this way, but for complex and/or reproducible things it's nicer to automate this:


The usual way: Dockerfiles


Dockerfiles are a recipe-like sequence of operations, with caching of what it knows it's already done (but no dependencies, so not so much like makefiles).
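
A toy model of that caching may help: each step's cache key covers its own text plus everything before it, so the first changed step invalidates all later ones. This is a sketch of the idea and not docker's implementation (as far as I know, the real thing also checks the contents of files used by ADD/COPY, and still cannot know that something fetched by a RUN changed); layer_ids and cached_steps are made-up names.

```python
import hashlib


def layer_ids(instructions):
    """Return one cache key per instruction; each key also covers all earlier steps."""
    keys = []
    parent = ""
    for line in instructions:
        parent = hashlib.sha256((parent + "\n" + line).encode()).hexdigest()
        keys.append(parent)
    return keys


def cached_steps(old, new):
    """Count how many leading steps of `new` can be reused from a build of `old`."""
    reused = 0
    for a, b in zip(layer_ids(old), layer_ids(new)):
        if a != b:
            break
        reused += 1
    return reused


v1 = ["FROM ubuntu",
      "RUN apt-get update",
      "RUN apt-get install -y apache2",
      "ADD my_software.tgz /opt"]
v2 = ["FROM ubuntu",
      "RUN apt-get update",
      "RUN apt-get install -y apache2 libapache2-mod-php",
      "ADD my_software.tgz /opt"]

print(cached_steps(v1, v2))  # steps before the first changed instruction
```

Note that the last step is textually identical in both versions yet is not reused, because its parent differs. This is also why slow, rarely changing steps are usually put early in a Dockerfile.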

Dockerfiles have a context -- basically a complete set of files you refer to. (The point of contexts is basically that docker is networked, meaning the daemon need not be running on the same host as the client, so this set of files needs to be well defined so that it can be transferred.)


You build an image either like

docker build directory

OR

docker build URL

When you give it a directory, the context is the Dockerfile and other files in that directory.

In the URL case it can be

  • a plain text file as Dockerfile (there is no further context)
  • a pre-packaged tarball context (fetched, uncompressed, then fed as a directory), so reducing to the directory case
  • a Git repository - basically fetched and then fed in as a directory, the Dockerfile being part of that


The only required part is the Dockerfile; further contents just make builds easier (and more self-contained, rather than something you'd have to curl/wget from a place that needs to be available).

When using files from the context, use relative paths.



multi-stage builds

You can refer to earlier stages by number

FROM fork:v2

# compiley stuff here

FROM spoon
COPY --from=0 /foo/bar /foo


It may be clearer to name them, though:

FROM fork:v2 AS fork

# compiley stuff here

FROM spoon
COPY --from=fork /foo/bar /foo


It's an alternative to another approach people have asked for, namely to allow dockerfiles to include other dockerfiles.

This is conceptually simpler in terms of building related images, though would make it a more complex build system that people would probably make a mess of.



https://docs.docker.com/develop/develop-images/multistage-build/


Practical notes

general habits and tips

Some common bits

DEBIAN_FRONTEND=noninteractive

...On debian / ubuntu. The options to this are readline, dialog/whiptail, or noninteractive, and the last is interesting for automatic installs in that it will never pause on a question.

As-is it will often choose a hopefully-sensible default where possible (e.g. UTC timezone).


If you want non-defaults, or things that are site-specific, or to answer the occasional thing that really is a freer-form field, you would use the above and store these values in the debconf database. Look around for guides with commands involving debconf-set-selections


see man debconf and man debconf-set-selections for more details


apt --no-install-recommends

tl;dr: Recommended packages are the things you probably want installed too. By saying "no, I'll do that explicitly", you can often remove a few things and slim down the image somewhat.


Recommended packages are those that you often probably want too. Weaker than Depends, stronger than Suggests. It's a gray area, but recommendations are the things most people probably actually want, for a thing to live up to expectations.

By default, recommended packages are installed. Disabling that effectively treats them as suggestions.


Examples:

gcc recommends libc-dev, and suggests things like autoconf, automake, flex, bison,
a media player might recommend the ability to play some less-common media sources, and suggest some rarer ones,
TeX may recommend a bunch of modern packages that most people use now,
a -dev package may well recommend the according -doc package,
things in general may recommend or suggest some contrib stuff.

To get more of an idea, try things like:

apt-cache depends gcc
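
To illustrate why --no-install-recommends slims things down, here is a sketch that walks a tiny, made-up dependency table. The package relations below are hypothetical and heavily trimmed (not real apt output), and install_set is an illustrative name; the point is only that Recommends are followed transitively by default, and Suggests never are.

```python
# Hypothetical, heavily trimmed package metadata (not real apt output).
PACKAGES = {
    "gcc":          {"depends": ["cpp", "binutils"], "recommends": ["libc-dev"],
                     "suggests": ["autoconf", "automake"]},
    "cpp":          {"depends": [], "recommends": [], "suggests": []},
    "binutils":     {"depends": [], "recommends": [], "suggests": []},
    "libc-dev":     {"depends": ["libc"], "recommends": ["manpages-dev"], "suggests": []},
    "libc":         {"depends": [], "recommends": [], "suggests": []},
    "manpages-dev": {"depends": [], "recommends": [], "suggests": []},
}


def install_set(name, install_recommends=True):
    """Return every package that installing `name` would pull in."""
    wanted, todo = set(), [name]
    while todo:
        pkg = todo.pop()
        if pkg in wanted:
            continue
        wanted.add(pkg)
        meta = PACKAGES[pkg]
        todo.extend(meta["depends"])         # always followed
        if install_recommends:
            todo.extend(meta["recommends"])  # followed by default only
        # "suggests" are never followed automatically
    return wanted


print(sorted(install_set("gcc")))
print(sorted(install_set("gcc", install_recommends=False)))
```

In this made-up table a single Recommends drags in three extra packages, because the recommended package has dependencies and recommendations of its own.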

Minimal images

Build systems?

Docker is something of a build system itself, in that you can generate consistent build environments, and a pretty succinct one at that.


At the same time, dockerfiles usually rely on other build systems, and don't think about versioning, so a lot of dockerfiles will not build years after.

Dammit, people, you just re-invented DLL hell while pretending to fix it.


Sooo. Mixed.

Complex with and without: Versions and config

The microservice idea

"avoid sshd"


There is nothing inherently considered-evil about sshd.

Yet there are still good reasons to avoid it. It comes down to cost/benefit.


Points against adding sshd:

  • It introduces some security and management considerations
How do you deal with sshd vulnerabilities?
Who patches them? How? How quickly?
Have you considered that without considering a security model, it may be a vector into your swarm?
How do you manage ssh's config?
How do you keep the keys/password safe?
Is that key unique to an instance, or to other things you've made? In theory, or in practice?
Are those considerations worth it?, particularly given...
  • common "but I like it for X" tend to not really need it. Consider:
    • getting a shell inside? (for image/config changes) You can usually do that by docker exec-ing a bash (seems to be based on the earlier nsenter), and for some purposes docker attach
    • restarting services?
      • you can often use signals
      • in one-process microservices, restarting the container is the same
      • in multi-process microservices you often want a monitor tool inside anyway
    • for backup? - Your data should be on a docker volume or mounted, meaning you can usually back that up (only rarely do you need to run a dump tool from inside, and again, that can probably be done via docker exec)
    • to read logs? - If they are important, you should put them on a volume (see anonymous volumes for one solution), or use network logging (which in a swarm you may want/have anyway)
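
The alternatives above can be sketched as a few shell helpers. This is only a sketch: the function names are made up for illustration, busybox as the throwaway backup image is an assumption (anything with tar would do), and whether HUP actually means "reload" depends on the service you are signalling.

```shell
# Hypothetical helper names; each one wraps a standard docker CLI call.

# Get a shell inside a running container (use /bin/bash if the image has it).
shell_into() {
    docker exec -i -t "$1" /bin/sh
}

# Send a signal to the container's main process (HUP unless told otherwise).
signal_container() {
    docker kill --signal="${2:-HUP}" "$1"
}

# For a one-process container, restarting the container restarts the service.
restart_container() {
    docker restart "$1"
}

# Archive a named volume ($1) into an absolute host directory ($2),
# using a throwaway container. busybox is an assumption.
backup_volume() {
    docker run --rm -v "$1":/data:ro -v "$2":/backup busybox \
        tar cf "/backup/$1.tar" -C /data .
}

# Show the last lines the main process wrote to stdout/stderr (default 100).
recent_logs() {
    docker logs --tail "${2:-100}" "$1"
}
```

For example, signal_container web USR1 sends USR1 to the container named web, and backup_volume dbdata /srv/backup would leave /srv/backup/dbdata.tar on the host (both names are placeholders).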


Points for adding sshd

  • during some development
because sometimes the exact shell environment given by an ssh login better reflects what the main process would see (than a docker-exec'd bash would)
nsenter and docker exec require login and the relevant permissions on the docker host, ssh in the container does not
  • dockerizing a CLI (or even GUI) program for "I use this as a VM for isolating dependencies, and people's storage" reasons
take a look at LXC, though
  • you can give people access to something isolated like that without giving them what amounts to docker-admin-ish permissions on the host
  • You don't have access to the docker host OS (to run docker exec and such)
(same thing from the user side, really)



More on tags

Going public

Build systems

Practical / semi-sorted

docker security

GUI apps in docker

X

VNC, RDP

sound in dockerized apps

CUDA in dockerized apps

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Operating docker

run

automation

compose

fancier than compose

SELinux and docker

Semi-sorted

Where the "single process" thing does have a point

Where does the host store things?

Better known as "argh how do I make it not fill up my system drive?"


Linux:

  • Most things are stored in /var/lib/docker
most of the size will be in
the image filesystem layers (overlay2 these days, previously aufs)
data volumes
If you move the whole thing and then symlink to it, stop the daemon while you do that, or it will get confused
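
To check where the daemon actually keeps its data, and what is eating the space, something like the following may help. A sketch only: docker_root and docker_space are made-up helper names, and the fallback path is just the usual default mentioned above.

```shell
# Print the daemon's data directory. Falls back to the usual default
# when the daemon gave no answer (e.g. it is not running).
docker_root() {
    root=$(docker info --format '{{.DockerRootDir}}' 2>/dev/null) || root=""
    if [ -n "$root" ]; then
        echo "$root"
    else
        echo /var/lib/docker
    fi
}

# Summarize the space used by images, containers, volumes and build cache.
docker_space() {
    docker system df
}
```

As an alternative to symlinking, the daemon can be pointed at another location with the "data-root" key in /etc/docker/daemon.json. The same caveat applies: stop the daemon while you move things.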

"Cannot connect to the Docker daemon. Is the docker daemon running on this host?" or "Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock"

Typically means you don't have permissions to /var/run/docker.sock.

Which is often owned by root:docker, so while sudo works, the cleaner solution is to add the user (that you want to do this management) to the docker group.
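
A quick way to check this from a shell, as a sketch: the function names are made up, and the socket path is the one from the error message above.

```shell
# Succeeds if the current user is a member of the named group (default: docker).
in_group() {
    id -nG | tr ' ' '\n' | grep -qxF "${1:-docker}"
}

# Show who owns the socket, then say whether we are in the docker group.
check_docker_access() {
    ls -l /var/run/docker.sock 2>/dev/null || echo "no socket at /var/run/docker.sock"
    if in_group docker; then
        echo "in docker group"
    else
        echo "not in docker group; try: sudo usermod -aG docker $(id -un)"
    fi
}
```

Group changes only apply to new login sessions, so log out and back in (or use newgrp docker) after adding the user.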


executable file not found in $PATH

You probably tried a docker exec with a shell.


If it says exec "-i": executable file not found in $PATH: unknown

The arguments to exec are a little pickier about order than those of many other subcommands. It's

docker exec [OPTIONS] CONTAINER COMMAND [ARG...]

so e.g.

docker exec 0123456789ab -i -t /bin/bash

is wrong and should be:

docker exec -i -t 0123456789ab /bin/bash

(basically, you told it that the command to execute inside is -i, with arguments -t /bin/bash)


If it says exec: "/bin/bash": stat /bin/bash: no such file or directory: unknown

You specified an absolute path that does not exist inside the container. It seems that bash isn't installed; try /bin/sh


If it says exec: "bash": executable file not found in $PATH:

You specified a name rather than an absolute path, so it's trying to resolve it via the PATH.

In the case of bash: since PATH probably includes /bin, this would usually mean that bash isn't installed.

Try sh instead. (And maybe specify /bin/sh, to eliminate the possibility of PATH not including /bin)
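
The "try bash, then sh" routine can be sketched as a small helper. find_shell is a made-up name, and the sketch assumes the container is running and has at least one of the two shells (minimal images may have neither).

```shell
# Print the first shell that can actually be executed inside container $1.
# It simply tries to run each candidate with a no-op command.
find_shell() {
    for candidate in /bin/bash /bin/sh; do
        if docker exec "$1" "$candidate" -c 'exit 0' >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

Usage would be something like: docker exec -i -t mycontainer "$(find_shell mycontainer)", where mycontainer is a placeholder.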

"not found: manifest unknown: manifest unknown"

Usually means the tag you asked for does not exist in that repository. If you didn't give a tag, that means there is no :latest tag.


There are decent arguments that once you have multiple targets, you should probably avoid a :latest so that it doesn't refer to an arbitrary choice.

Say, the Dockerfile example above used to have

FROM phusion/baseimage

which has a :latest implied (that's how tagging works)

...but once it covered more ground, it made a lot more sense to have e.g.

phusion/baseimage:master-386
phusion/baseimage:master-arm
phusion/baseimage:master-arm64
phusion/baseimage:master-ppc64le
phusion/baseimage:master-ppc64be

and not use :latest to refer to just an arbitrary one of them.
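
The "implied :latest" rule can be illustrated with a few lines of shell. This is a sketch of the rule as described above, not docker's own parsing code; with_default_tag is a made-up name.

```shell
# Print the reference with :latest appended when the last path component
# carries neither a tag (:) nor a digest (@).
# Only the last component is checked, so a registry port such as
# localhost:5000 is not mistaken for a tag.
with_default_tag() {
    last=${1##*/}
    case "$last" in
        *:*|*@*) echo "$1" ;;
        *)       echo "$1:latest" ;;
    esac
}
```

So with_default_tag phusion/baseimage prints phusion/baseimage:latest, while a reference that already names a tag is printed unchanged.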



The aufs storage-driver is no longer supported

Docker for windows

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

As far as I can tell (and yes, this gets confusing)...


Originally, 'docker on windows' meant "Sure you can run linux containers on windows:

  • buy windows Pro or Server because you'll need Hyper-V
  • install VirtualBox on windows
  • run a recent linux in that
  • and install docker in that"

After some time, tooling was introduced so that the last three steps would be done for you, and you could run docker commands from windows's prompt.


The VM overhead appears only once, instances within it work as you'd expect, and you don't need two servers to run windows+linux stuff. ...so if you build systems that need both linux and windows components, that are tightly integrated, you might actually get some upsides (say, lower latency) over doing that on two hosts, so it's a valid solution to a (probably small) number of use cases.


The confusion only started when MS later said "Now we can run docker containers natively" and 'does not require VirtualBox'.

What they meant was "linux had a good idea, we have added similar in-kernel isolation in windows, so now we can also do windows-only containers".

They were not really clear about that, but the result is useful, and valid for the same 'run things on the same metal' reason as before.


The confusing part is that they called it docker.

Because after you type docker run, there is zero overlap in what happens: what actually makes docker linux and docker windows run is entirely different.


To be a little more complete, MS effectively extended the situation from one or two to roughly six distinct concepts:

  • running linux containers on linux (= what docker meant initially)
  • running linux containers on windows within a VM (= typical if you want the combination)
  • running linux containers on windows natively (MS have said explicitly this won't happen)
No, WSL cannot do that - it's a translation layer (basically reverse wine) specifically without any linux kernel code. While intended to be thinner than wine, e.g. the filesystem performance is noticeably worse.
WSL2 is closer to linux, but not to containers, because it's actually a VM running a modified linux kernel (so at best it's a (fairly well provisioned) VM instead of containers, but the integration and provisioning differences matter. It seems a less obvious choice for servers / hosting, though neat for desktop and dev use.)
  • running windows containers on windows (= "docker windows")
  • running windows containers on linux via a windows VM (you could, but I don't suspect many people do this)
  • running windows containers on linux natively (won't happen)


So in the end, the only real similarities between docker linux and docker windows are

building from Dockerfiles (the ability to do so, anyway - the syntax and details will differ)
the docker command (tooling was updated to deal with the mix)
other things that relate to the docker product rather than to container tech
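
Since the same docker command fronts both, it can be useful to ask the daemon which kind of containers it runs. A sketch, with daemon_os as a made-up helper name:

```shell
# Print the OS type of the daemon you are talking to: "linux" or "windows".
daemon_os() {
    docker info --format '{{.OSType}}'
}
```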


It's confusing (see tech forums full of questions) to the point it seems like bad marketing.

And that's not like MS, so people wondered what their long term plan is here.

It's not that MS doesn't understand the concept, or benefits, of a lightweight secure VM. Clearly - they implemented something similar.

It's not that MS are bandwagoning for some open-source street cred again.

It's probably that they wanted to be some sort of player in a market that might otherwise exclude them.


Various people have suggested it's not only not bad marketing, it's actually cleverly surreptitious marketing: Making people think they are not committing to a specific tech -- while actually leading them to do so:

That is, docker linux was "one thing that runs everywhere that you can install docker".

Docker windows+linux means that's no longer true in general, but kinda-maybe on windows.

If you get people to consider docker as a mixed-OS thing ("look at the tooling managing both!"), then you can squint and say windows server does it better (who cares if that's because they exclude the other option).

If you get people to see it as a windows-based application packager, then the next server your business buys may be a licensed windows server for "you never know, it will run all these hip dockery things" reasons.

"Docker Desktop"

Unsorted