Docker notes

Notes related to (mostly whole-computer) virtualization, emulation and simulation.

Some overview · Docker notes · Qemu notes

Intro

What?

Not a separate machine, but processes within a host linux kernel (since approx 3.13) -- that happen to be isolated in all the ways that matter.

They are more lightweight than running a classical VM

  • in part because they virtualize the OS (interface), not the hardware
  • in part because of practical details, e.g. how the images are created/used
  • in part because persistent storage is intentionally separated


Actually, the isolation is mostly recent kernel features; Docker is one specific product/toolkit on top that makes it practical to actually use.

Docker has its own take on it. For some more background and comparison, see Virtualization,_emulation,_simulation#Linux_containers.

There are comparable things, e.g. SmartOS had been doing this before Linux, though more for reasons of operational efficiency and security, whereas docker focuses more on the microservice angle.

What's it useful for?

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Depends on who you ask.


Angles include:


Separating apps to relieve dependency hell

Where historically, executables were often more monolithic things that controlled everything, modern systems tend to have applications live in an ecosystem of libraries and more, which keep working by merit of that environment staying sane.

The larger said ecosystem, the more friction and fragility there is, and the harder it is to administer, e.g. more chance of "oh god I tried to update X and now everything is in a weird sorta-broken state that I don't understand".


Keeping things mixed works out as a good solution for the OS itself, largely because there are people putting time into keeping that well defined.

Apps on the other hand do what they want, and are sometimes best kept isolated from not only each other but also from the OS, basically whenever that ends up simplifying the management.


Note that docker's a little overkill for just this - there are other ways to do this, some simpler, some newer.


Portability

The just-mentioned separation also means each container'd app will run the same regardless of your hardware or OS. Same as a VM, really.


Useful layer of abstraction for software packages

The tech for OS containers had existed in a usable state for over a decade. Docker just made them a lot easier to actually use.

Or, as some other people put it, "docker is to apt what apt is to tar."

On top of that there are things like Docker Hub, repositories to share images people build, which makes it a software-service repository.

(Which confuses the above analogy in that it's like apt for docker. Also docker images are tarballs. Am I helping yet? :) )



Development and automating deployment

It's not too hard to set up so that a code push automatically triggers: testing, building an image, and starting it on a dev server.

Easy and fast container startup is pretty convenient for automated testing of a new version, both of the unit/regression sort and of some amount of larger deployment testing. It's also valid on just a single host.

docker diff can sometimes be rather useful in debugging works-for-me issues


Large-scale deployment

In clusters it's very handy to have exactly reproducible environments, without fine-grained management of each node and with minimal overhead.

When is it less useful?

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
  • When there is no real reason to use it,
because you'd be adding complexity without gaining anything,
and distributed software is, by nature, harder to design and a lot harder to debug than a monolithic setup.
  • if you think it's a scalability band-aid
Docker only makes the deployment step easier; efficiency and scalability are properties of design
  • it takes some relearning of how to do things like least-privilege access, restorable backups, and monitoring
  • if security is your main interest
you get isolation, yes, but it's not the primary goal, and there are footnotes you must know
VMs are arguably better there, at least until security becomes a central focus for docker
  • some potential upsides, like automated setup and provisioning, aren't automatic.
they only really happen when you think about how to do them, and it has to make sense for your use case.

Single-purpose

Security model

Good and bad ideas

Technical Intro

Some concepts

An image file is a completely specified environment that can be instantiated.

  • It is basically a snapshot of a filesystem.
  • Images are often layered on top of other images,
which makes them easier to build, and easier to version-control per layer,
and since layers are references (to things you also have) rather than copies, it makes heavily layered images much smaller
  • For example's sake you can rely on the default fetching of image files from Docker Hub.
(you can create image files yourself, and will eventually want to)


A container is an instance of an image file - either running, or stopped and not cleaned up yet.

Note that the changes to files from the image are not written to that image. They are copy-on-write to container-specific state.




Images and containers have IDs. They show up in most tools as large hexadecimal strings, which are actually a smallish part of a larger (sha256) hash (see --no-trunc for the full thing).

Note that you only need to type as many characters as make it unique within what you have (which may be as little as one character).


It's still not what you want for larger-scale management, so

  • for images you often want to deal with repository-based aliases to images (which are shorthands that point at an image ID(verify)).
  • For containers there are names. The automatically generated ones (which look like unruffled_curran) are meant to make it easier for humans to communicate about them (than hexadecimal numbers are).
You can give them your own meaningful names -- but note they must be unique (so at scale you need some scheme)


See More on tags below, which matters when you make your own repository.
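A quick sketch of those handles (the names here are placeholders, and imageid stands for an actual image ID you have):

docker run -d --name my_sleeper ubuntu sleep infinity    # a container name you chose yourself
docker tag imageid mylocalrepo/baseimage:1.0             # a repository:tag alias pointing at an image ID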



Introduction by example

Install docker.

Doing so via package management often does 90% of the setup work.
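For example, on Debian/Ubuntu-ish systems (a sketch; package names vary per distro, and docker's own repositories call their package docker-ce):

sudo apt-get install docker.io          # the distro-packaged docker
sudo systemctl enable --now docker      # start it now, and at boot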


You may want to give specific users extra rights, so that you don't need to do things as root (or via sudo). (But for a quick first play-around, root is fine, and that's what I'm doing in the examples below)


Instantiating things

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# ps faux
root         1  0.0  0.0  18164  1984 ?        Ss   12:09   0:00 bash
root        15  0.0  0.0  15560  1104 ?        R+   12:11   0:00 ps faux

What happened:

  • it found an image called ubuntu (downloaded it from docker hub if not present yet)
  • instantiated the image to a container, with bash as entry point / main process
  • you manually ran ps within it. (Yes, the only processes inside right then are that shell and that command)


Notes:

  • The entry point is the main process, and also determines the container lifetime.
You would generally use an independent, long-running command. Exactly what is up to you.
In the microservice philosophy this is often the actual service (sometimes a process monitor).
Note that the main process's stdout will go to the logs.
It is also valid to see a container as half of an OS, with a stack of software, plus perhaps an ssh daemon. It's often cleaner to avoid that if you can, but it can sometimes make a lot of sense.
Note that docker kill kills the main process, and thereby the container.
In this example the main process is bash, purely as a "prove it's a separate thing" example, and it means the container only lives until you log out.
Also note that -i -t, for 'interactive' and 'allocate a tty', are only necessary because we want an interactive shell, which is not typical. (A less interactive variant is sketched just after these notes.)
  • If you didn't have the ubuntu image, then ubuntu:latest would have been downloaded from docker hub.
  • by default, the container id also becomes its hostname
  • the full container id is long. Most places where docker prints one or wants one need only a few bytes (docker usually shows six bytes, twelve hex characters), because that's almost always unique
  • in many cases, you can also use the name, but note that these need to be unique
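A less interactive sketch of the same lifecycle (nginx is just an example of an image whose main process keeps running):

root@host# docker run -d --name web nginx     # -d: detached; the main process is nginx
root@host# docker logs web                    # stdout/stderr of that main process
root@host# docker kill web                    # kills the main process, and thereby the container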


Status and cleanup, of images and instances

Status

See existing images on the host:

  • docker images
    lists present images
  • docker images -a
    includes non-named ones, e.g.:
REPOSITORY            TAG                 IMAGE ID            CREATED             SIZE
<none>                <none>              eda5290e904b        2 months ago        70.5MB
ubuntu                bionic-20190515     7698f282e524        3 months ago        69.9MB


See running container instances on the host:

  • docker ps
    to list running containers
  • docker ps -a
    to list running and old containers
CONTAINER ID     IMAGE            COMMAND       CREATED          STATUS           PORTS      NAMES
3b66e09b0fa2     ubuntu:latest    "bash"        9 seconds ago    Up 8 seconds                drunk_mestorf



Cleanup

If you're done with images:

  • docker rmi ids
    to remove them, e.g. based on docker images.
It'll refuse when something currently depends on it.
  • docker image prune
    should clean up dangling images (or, with -a, unused ones)
dangling images are mostly build layers that are no longer referred to (no repository and tag)


In default config, container instance state sticks around after the container stops.

This can be very useful for debugging during development, but you do eventually want to clean them up:

  • docker rm contid


If you don't care about post-mortem debugging, you can start containers with --rm to have them immediately clean up after themselves.
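A few cleanup one-liners as a sketch (the filter idiom assumes you really do want to remove every exited container):

docker rm $(docker ps -aq --filter status=exited)    # remove all stopped containers
docker image prune                                    # remove dangling images
docker run --rm -it ubuntu bash                       # this one cleans up after itself on exit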


On image building

There are two basic ways to build an image:

  • manually: start with something close to what you want, make the changes you want
saving this container = saving all filesystem changes within it
good for a one-time quick fix, less ideal for some habits you'll need at larger scale
  • automate: write a dockerfile (a minimal example is sketched just after this list)
docker build creates an image from a dockerfile - basically from a series of commands
faster to transfer and rebuild, since layers are cached and shared
https://docs.docker.com/engine/reference/builder/
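As an illustration of that second route, a minimal Dockerfile sketch (it mirrors the apache example in the next section; the FOREGROUND flag is there so apache stays the main process):

FROM ubuntu:bionic
RUN apt-get update && apt-get install -y apache2
EXPOSE 80
CMD ["apachectl", "-D", "FOREGROUND"]

...built with something like:

  docker build -t my_ubuntu_apache .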


The hard way: manually

Say we want to build on the ubuntu image to make a web server image

root@host# docker run -i -t ubuntu /bin/bash
root@3b66e09b0fa2:/# apt-get update && apt-get install apache2
# ...and whatever else

To verify the difference to the original image, we can do:

root@host# docker diff 3b66e

Which'll print ~1200 filenames. Fine, let's roll that into a new image

root@host# docker commit 3b66e my_ubuntu_apache
cbbb61030ba24dda25f2cb27af41cc7a96a5ad9d23ef2bb500e9eaa2e16aa44d

and check that it's now known for us:

root@host# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              VIRTUAL SIZE
my_ubuntu_apache    latest              cbbb61030ba2        About a minute ago   202.7 MB
ubuntu              latest              b7cf8f0d9e82        2 weeks ago          188.3 MB

Notes:

  • commits are to your local repository. You won't accidentally write things to docker hub (or another linked registry) until you are registered, are linked, and do a docker push (sketched just after these notes)
actually a commit is more like a snapshot / layer
if you want things reusable and configurable, there are various good practices about how configuration and environment are best handled. You'll read up on that when you need it.
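That push step, for when you do want it, looks something like this sketch (the account/repository name is a placeholder):

docker tag my_ubuntu_apache myaccount/my_ubuntu_apache:1.0
docker login
docker push myaccount/my_ubuntu_apache:1.0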


Okay, say we want to check the new image works - we can quit the old one and continue with the new one

root@3b66e09b0fa2:/# exit
root@host# docker run -it my_ubuntu_apache bash
root@d9bc62e40440:/# apt-get install apache2
Reading package lists... Done
Building dependency tree
Reading state information... Done
apache2 is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Eventually, we'll want to docker run the new image.
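For example, a sketch assuming we want apache as the main, foregrounded process, with container port 80 exposed on host port 8080:

root@host# docker run -d -p 8080:80 --name web my_ubuntu_apache apachectl -D FOREGROUND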


The usual way: Dockerfiles

multi-stage builds

habits and tips

More on tags

The microservice philosophy

"avoid sshd"

Resource management

How does storage work?

How does networking work?

Limiting CPU, memory

Various commands

Starting, stopping

Practical sides to docker security

GUI apps in docker

You can host a VNC, RDP, or similar screen if you want it to be an entirely independent instance.


If you're doing it just for the don't-break-my-libraries reasons, then you may want to consider displaying directly to the host's X server. (Also, take a look at snap)

Since X is a networked protocol in the first place, that connection is not very complex - the most you have to worry about is X authentication, and potentially about X extensions.
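A common sketch of that direct-to-host-X route (this assumes you deal with X authentication somehow, e.g. the blunt xhost approach below or a shared Xauthority; some_gui_image is a placeholder):

xhost +local:                                   # blunt: allow local connections to your X server
docker run -it --rm \
   -e DISPLAY=$DISPLAY \
   -v /tmp/.X11-unix:/tmp/.X11-unix \
   some_gui_image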


Keep in mind that sound is a little extra work. As is using your GPU.


microservices in docker

SELinux and docker

Semi-sorted

"Cannot connect to the Docker daemon. Is the docker daemon running on this host?" or "Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock"

Typically means you don't have permissions to /var/run/docker.sock.

Which is often owned by root:docker, so while sudo works, the cleaner solution is to add the user (that you want to do this management) to the docker group.
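For example (this assumes the install created that docker group; you'll need to log out and back in, or use newgrp, before it takes effect):

sudo usermod -aG docker yourusername    # add the user to the docker group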



Docker for windows

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

So, this one's confusing. As far as I can tell, so far...


Originally, docker on windows meant "Sure you can run linux containers on windows: Install VirtualBox on windows, run linux in that, and docker in that". This was somewhat awkward, but worked, and potentially quite valid: the VM's overhead appears only once, the instances within it work as you'd expect, tooling was introduced to manage this from the windows side, meaning you could have linux docker basically as-is on windows (and of course run windows apps (or VMs) on the same metal and OS).

(Though you do need windows Pro or Server to do this, because you'll want Hyper-V which is missing in the basic windows versions)


The confusion came later, when MS said "Now we can run docker containers natively" and 'does not require VirtualBox'.

Because what it basically means is "we have made our own thing that runs only on windows". It primarily points at some in-kernel isolation (that imitates what linux did), which is useful.


The confusing part is that they called it docker. Sure you can build these images with dockerfiles, but that's where the similarity begins and ends.

The way these are run is completely unrelated to linux docker. There is zero technical overlap.


Again, the 'run things on the same metal' implication is useful, which now amounts to having docker for linux in a VM as described earlier, and adding native windows containers alongside.

And it's nice that the tooling was updated to manage both these systems.


But the containers are fundamentally and thoroughly incompatible, so it's misleading to use Docker as an umbrella term when this very introduction included a lot of conditionals about how to run it.

...but the tooling's the same. Well, on one system.


You'll notice we've extended the situation to half a dozen possible distinct concepts:

  • running linux containers on linux (= docker)
  • running linux containers on windows via a VM (= typical if you want the combination)
  • running linux containers on windows natively (MS have basically said this will never happen)
  • running windows containers on windows (= "docker windows")
  • running windows containers on linux natively (no, WSL cannot do that. And it can't really happen unless MS is fully on board with the idea, mainly because licenses)
  • running windows containers on linux via a windows VM (you can, but I don't suspect many people do this)


It's confusing to the point it seems like bad marketing -- or maybe really intentional marketing.


People have raised the question of what MS is working towards in the long term.

It's not that MS doesn't understand the concept of a lightweight secure VM.

It's not that MS are bandwagoning for some open-source street cred.

They may just want to make sure they're somehow a player in a type of market that might exclude them.


Many have suggested it's also a clever but surreptitious marketing strategy: you get people thinking they are not committing to a specific tech, while leading them to do so:

If you get people to see docker as a mixed-OS thing, mainly due to the tooling, then if you squint you can say they do it a little better.

Something you can't say if it's a case of "windows also does something similar", more so because that'd make it clearer just how little this is about interoperability.


Also, if you get people to see docker as a windows-based application packager, then the next server your business buys may be a licensed windows server for "you never know, it will run all these hip dockery things" reasons.

Unsorted