Difference between revisions of "Systemd notes"

From Helpful

Revision as of 13:37, 8 July 2020


Beyond its original goal of a better init (with a better, more flexible event/dependency system), systemd wants to handle most of the base system, including:

  • file system mount points [1]
  • log processing [2]
  • passwords
  • logins and terminals [3]
including process cleanup after logout (which makes screen / tmux more involved exception cases)
  • power management [4]
  • kernel DBus [5]
  • networking config [6]
  • local DNS[7]
  • date, time, time sync [8] [9]
  • virtualisation / containers [10]
  • sandboxing services [11]
  • sandboxing apps, wrapping apps into images
  • stateless systems, factory reset [12]


So, there are roughly two takes on this.

There is a camp of people complaining that going so far beyond init breaks with the "do one thing well, and make it easy to combine" philosophy. They argue that systemd is reinventing bugs that the systems it replaces solved years ago, and that it is somewhat tied to its origin distros (mainly Red Hat, though amusingly that's the place you most often find the older, less capable systemd versions, due to typical RH server update policies). They also argue that it doesn't always care to play well with other subsystems or with the Linux ecosystem as a whole (consider e.g. systemd dismissing bug reports from the kernel people), that it makes responsibilities vaguer, has a steeper learning curve (to do things right), sometimes makes the overview vaguer, sometimes has a less than ideal interface with commands that are longer and harder to remember, and has minimal, often somewhat unclear documentation.

So it is currently easy to point at its rough edges.

There is another camp that points out the goal was never just a better init, it's to unify the system layer (i.e. "all the stuff that supports all your programs") in a way that is coherent enough that you would actually want to talk to it.

That is because what we consider "the system" has grown in size and complexity anyway, has become more automatic and dynamic (udev, automount, etc.), and has thereby become harder to interact with. Having a bunch of separate systems like that invites duct-taped bodges (like "okay, just sleep for a minute and hope the network is up, then run and forget", rather than the thing you probably want, which is to say "hey, please do this thing as soon as that interface comes up, thanks"). This is basically because these parts were never really designed to be unified (failing the second part of the previously mentioned philosophy).

And sure, there's an argument that these things just needed better APIs -- or that systemd could have been a spec instead (much like POSIX or opendesktop). But this wasn't on the table in any real way.

So if you see a communicative system layer as a good idea, then systemd is at best a decent solution, and at worst still a good push.

Overall, we'll see.

Personally, I've gotten over much of my skepticism, but there is still plenty left, in part due to current bugs, and the inability to solve things, or even to find out how something is supposed to work or which version a particular example doesn't work in. It's easy to be "how is this better?"-grumpy when I know how to fix/bodge it on a decades-older system.

Unit files

Units are the varied system resources that can interdepend.

Units come in various types: devices, services, timers (in the triggering sense), mount paths, targets (often 'a set of dependencies useful to name'), and a handful more.

...though you will probably mostly use services. Also timers, if you want to try doing things with less/no cron. And maybe automounts.

Unit files are ini-style text config files that describe resources, and what they depend on; systemd itself figures out what the combination means and when to do something.

Note that not all units have explicit files - there's a bunch of automatic generation going on.

On unit names

Because it's sometimes useful to have arbitrary strings (anything except NUL) as part of unit names, in particular when they are paths and/or autogenerated, and because there is value in being able to reverse them, there is a reversible string escaping.

You can play with systemd-escape to get some sense of it.

As far as I can gather from man systemd.unit:
  • if it's a path:
    • duplicate / are removed
    • / by itself is a special case, returned as -, otherwise continue:
    • leading and trailing / are removed
  • every "/" character is replaced by "-"
  • all other characters that are not ASCII alphanumeric or "_" (note this includes -) are replaced by "\xhh"-style escapes.
  • if the string starts with . that is replaced with \x2e

This is reversible.

Note the inverse for paths is why it can only deal with absolute paths.
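
The escaping rules above can be sketched in Python. This is a rough imitation of what systemd-escape does, not its actual implementation: the function name systemd_escape is made up here, and leaving ":" and a non-leading "." unescaped follows my reading of systemd.unit(5) rather than the list above. It also does not resolve "." or ".." path components.

```python
def systemd_escape(text: str, is_path: bool = False) -> str:
    """Approximate systemd's unit-name escaping (assumption: see lead-in)."""
    data = text.encode("utf-8")
    if is_path:
        # Drop duplicate, leading and trailing slashes.
        parts = [part for part in data.split(b"/") if part]
        if not parts:
            return "-"  # the root directory is a special case
        data = b"/".join(parts)
    out = []
    for index, byte in enumerate(data):
        char = chr(byte)
        if char == "/":
            out.append("-")
        elif (byte < 128 and char.isalnum()) or char in ":_" or (char == "." and index > 0):
            out.append(char)
        else:
            # Everything else, including "-" itself, becomes a \xhh escape.
            out.append("\\x%02x" % byte)
    return "".join(out)


print(systemd_escape("/mnt/first", is_path=True))
print(systemd_escape("/data//my-stuff/", is_path=True))
```

Appending a unit type to the result gives the unit name, e.g. systemd_escape("/mnt/first", is_path=True) + ".mount".
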
Where unit files go

Varies somewhat, for reasons listed in the documentation.

Basically, where systemctl daemon-reload can find them.

For context: systemctl daemon-reload is the thing that figures out the dependencies.

It reruns the generators, reloads all unit files, and rebuilds the dependency tree. The *.wants/ and *.requires/ symlinks to the actual unit files serve as the stored state for this (a cache in the sense of 'most recent computed state'). Boot uses this too(verify).

Sooo that's not an answer. Where does systemctl daemon-reload read unit files from?

The directories mentioned below are hardcoded at compile time, so are constant for a system (and typically for a distro) though not always between systems.

In --system mode (mostly the subject here)

  • Units from installed packages
can be either /lib/systemd/system/ or /usr/lib/systemd/system/
  • Runtime units
/run/systemd/system/, takes precedence over the above(verify)
  • Local configuration (basically for any admin customizations)
/etc/systemd/system/. If a unit file here has the same name as a /lib unit file, it overrides that (verify)
takes precedence over the above(verify)

Note that if you actually use more than one of these, you want to know about systemctl cat, which shows the files backing a unit.

You can also get per-user systemd, allowing e.g. user services.

(it seems not all distros allow this, though(verify))

Launched via PAM. There is at most one per user - not per session. And only while they are logged in.

This adds a few directories

  • /usr/lib/systemd/user/ - from installed packages
  • /etc/systemd/user/ - user units from the admin
  • ~/.local/share/systemd/user/ - things you've installed to your homedir
  • ~/.config/systemd/user/ - your own


So what does systemctl enable do?

Creates a symlink in the appropriate target directory, because it represents the most recent dependency state.

This cache is typically in /etc/systemd/system/*.target.wants/ (and /etc/systemd/system/*.target.requires/?)

Note: Do a systemctl daemon-reload before e.g. an enable.




Create a file named something.service, with usually at least the sections [Unit], [Install], and [Service] (The first two are generic for unit files, the third is specific to services), e.g.

Description=Run a thing
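
Since the example above only shows the Description line, here is a sketch of what a fuller minimal service unit could contain, generated with Python's configparser. The /usr/local/bin/thing path, the choice of Type=simple, and the choice of multi-user.target are placeholders for illustration, not taken from a real setup.

```python
import configparser
import io


def minimal_service_unit(description: str, exec_start: str) -> str:
    """Return the text of a minimal .service unit file (illustrative values)."""
    unit = configparser.ConfigParser(interpolation=None)
    unit.optionxform = str  # keep key case as written, e.g. ExecStart
    unit["Unit"] = {"Description": description}
    unit["Service"] = {"Type": "simple", "ExecStart": exec_start}
    unit["Install"] = {"WantedBy": "multi-user.target"}
    buffer = io.StringIO()
    unit.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Hypothetical executable path; Exec* settings want an absolute path.
print(minimal_service_unit("Run a thing", "/usr/local/bin/thing --foreground"))
```

The printed text could be saved as something.service in one of the unit directories, followed by a systemctl daemon-reload.
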



On type

Type determines when a service is considered actually started rather than still starting, which matters to units that list this one as a dependency, and when inspecting status.

  • simple - started process is the process we care about (it won't quit and won't fork)
Any child processes are ignored.
The service is considered started as soon, and as long, as it runs.
  • idle - like simple, also tries to delay startup to the end of the current transaction, and at most 5 seconds later.
(nice for some debug/cosmetics, not meant as reliable ordering or serializing)

  • oneshot - like simple, but blocks until the first-started process stops. Then goes to inactive.
meant for cases that are more one-time commands than services - but are easy to hook in this way
discouraged in that you can't really tell whether it failed or not

  • forking
it is considered started once the command we run has forked off and exited (classical daemons often do this)
If the service can write a pidfile, you may wish to tell systemd about it (PIDFile=) so it can tell you something about the status

  • dbus - wait for the name set by BusName to appear on DBus
  • notify - signal via systemd itself, specifically sd_notify call

The default is

simple when ExecStart= is specified (and Type= and BusName= are not)
oneshot when ExecStart= isn't specified (and Type= is not)

On the various states
On dependencies and ordering
  • Wants= - units will be started when this is. If any fail, we ignore it.
recommended for most services, for robustness, in that a single failed service should usually not stop the rest of the system from loading.
  • Requires= - units will be started when this is. If any fail, we fail as well.
  • Requisite= - like Requires, but a dependency not already having started means we fail, instead of starting it
  • BindsTo= - like Requires, plus when a dependency is stopped, everything that depends on it is stopped as well
  • PartOf= - when the unit listed here is stopped or restarted, this unit is as well.
  • PropagatesReloadTo=
  • Conflicts= - lists units that are exclusive with us. Starting any unit within a conflict means the others are stopped.

You can define a want/requirement in both relevant units of a relationship:

Wants / WantedBy
Requires / RequiredBy
PartOf / ConsistsOf
Requisite / RequisiteOf
PropagatesReloadTo / PropagatesReloadFrom

This is mostly pragmatics, e.g. in what happens when you remove your unit. For example, you'd often use Wants for other services, whereas you'd use WantedBy to become part of the target you want to be part of.

It seems like systemd will parallelize all dependency startups. Often you need something to run before you do, which is why you'd regularly also add:

  • After=
  • Before=

Note that it often makes sense to use After when using Wants, and Before when using WantedBy.

It looks like you could easily use these without Wants/Requires, for logic like "if a syslogger service is enabled then load after it, but if it isn't just go on" (verify)
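
A toy model of the point above: After= only orders units that end up being started together, it does not pull anything in. The sketch uses Python's graphlib (3.9+). The unit names and the start_order helper are made up for illustration, and this is not how systemd itself computes things.

```python
from graphlib import TopologicalSorter


def start_order(transaction, after):
    """Order the units in `transaction`, honouring After= where both ends are present.

    transaction: set of unit names that will actually be started
    after: dict mapping a unit to the units listed in its After=
    """
    sorter = TopologicalSorter()
    for unit in sorted(transaction):
        # After= entries for units that are not being started are ignored.
        predecessors = [dep for dep in after.get(unit, ()) if dep in transaction]
        sorter.add(unit, *predecessors)
    return list(sorter.static_order())


after = {"myapp.service": ["rsyslog.service"]}

# Logger is being started too: it is ordered before myapp.
print(start_order({"myapp.service", "rsyslog.service"}, after))
# Logger is not part of what is being started: myapp just goes ahead.
print(start_order({"myapp.service"}, after))
```
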

Note also that certain cases make for automatic dependencies [13]

e.g. services with Type=dbus automatically get Requires=dbus.socket and After=dbus.socket

Execution and environment

The Exec* settings expect the executable to be an absolute path, mostly to avoid ambiguity.

In some cases you have to cheat, and can do so with

/bin/bash -c 'the thing you want'

  • ExecReload - command to run to make the service reload its configuration (if absent, the unit cannot be reloaded, only restarted)

Restarting after crashes

  • Restart [14]
    • no (default) - don't try
    • always
    • on-success - clean exit code
    • on-failure - unclean exit code, signal, timeout, or watchdog
    • on-abnormal - signal, timeout, or watchdog
    • on-abort - signal
    • on-watchdog - watchdog
  • RestartSec (default 100ms) - sleep before attempting restart
  • Also the more generic rate limiting from StartLimitIntervalSec and StartLimitBurst [15]
basically, if a unit attempts starting more than burst times within interval, the unit will no longer try to restart (note that a later manual restart resets this)
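
The Restart= values listed above amount to a small lookup table. A sketch, with made-up names for the exit categories (clean, unclean-exit, signal, timeout, watchdog). The systemd.service documentation has the authoritative table, including details such as which signals still count as a clean exit.

```python
# Which exit categories trigger a restart for each Restart= value,
# following the list in these notes. Category names are made up.
RESTART_ON = {
    "no": set(),
    "always": {"clean", "unclean-exit", "signal", "timeout", "watchdog"},
    "on-success": {"clean"},
    "on-failure": {"unclean-exit", "signal", "timeout", "watchdog"},
    "on-abnormal": {"signal", "timeout", "watchdog"},
    "on-abort": {"signal"},
    "on-watchdog": {"watchdog"},
}


def should_restart(policy: str, outcome: str) -> bool:
    """Return True if a service with Restart=policy is restarted after `outcome`."""
    return outcome in RESTART_ON[policy]


for policy, outcomes in RESTART_ON.items():
    triggers = ", ".join(sorted(outcomes)) or "(never)"
    print(f"Restart={policy}: {triggers}")
```
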

Periodic forced reload

E.g. when you have something with a bit of a memory leak you can easily restart at 4AM or whatnot.

Not a direct feature, but can be imitated if you can change Type=notify (and don't actually notify), and put the timeout to however often you want it to happen, e.g.:





Security and sandboxing

You will often want to set User= and Group= for the effective user and group (name or ID), because system services run as root by default.

There are a lot of things you can lock down, see: https://www.freedesktop.org/software/systemd/man/systemd.exec.html


If the service name contains an @, like vpn@username.service, then the part between the @ and the .service suffix (here username) becomes the instance name, which you can fetch using

 %i escaped form
 %I unescaped form

(note that the escaped instance name is filesystem-safe(verify), so can also e.g. be used for pidfile names)

You might e.g. use this to pass through a single parameter. [16]
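
To make the naming concrete, here is a sketch that splits an instance name into its parts and undoes the escaping, roughly as for the %I form. The helper names are made up, and the unescaping is a simplified reading of the escaping rules (for path-style instances you would also prepend a "/").

```python
import re


def split_instance(unit_name: str):
    """Split 'prefix@instance.type' into (prefix, instance, type)."""
    stem, _, unit_type = unit_name.rpartition(".")
    prefix, _, instance = stem.partition("@")
    return prefix, instance, unit_type


def unescape(escaped: str) -> str:
    """Undo the escaping: '-' becomes '/', hex escapes become bytes."""
    raw = bytearray()
    position = 0
    while position < len(escaped):
        match = re.match(r"\\x([0-9a-fA-F]{2})", escaped[position:])
        if match:
            raw.append(int(match.group(1), 16))
            position += 4
        elif escaped[position] == "-":
            raw.extend(b"/")
            position += 1
        else:
            raw.extend(escaped[position].encode("utf-8"))
            position += 1
    return raw.decode("utf-8", errors="replace")


prefix, instance, unit_type = split_instance("vpn@username.service")
print(prefix, instance, unit_type)
print(unescape(r"my\x2dvpn"))
```

One consequence worth knowing: a plain "-" in an instance name comes back as "/" in the unescaped form, so names with dashes need escaping first.
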

Multiple-process services
Conditions and assertions

See also:

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

These unit files are a relatively straight imitation of fstab entries.

E.g. where fstab says

/dev/sda /mnt/first ext4 defaults 0 0

a .mount file (relevant section) might say:

[Mount]
What=/dev/sda
Where=/mnt/first
Type=ext4
Options=defaults

While you could write these yourself, it may be easier to have systemd pick them up from fstab (via systemd-fstab-generator) so you can keep using fstab.

Note that

  • if you write them yourself, the unit name must be the (systemd-encoded) mount path.
  • If both exist, the unit file takes precedence.



You need

  • a .mount unit file
  • a .automount unit file
  • to name them using the systemd-escaped path. You probably want to use systemd-escape for this.

There is an alternative, to have this runtime-generated from fstab (done by systemd-fstab-generator, presumably at daemon-reload time).

The least you need for this is adding x-systemd.automount to options.

There are further options, like idle-disconnection (TODO: find an actual list, can't find it in documentation)

Explicit unit file

Alongside a (same-named(verify)) .mount

Automount means it can be mounted only once it's first accessed. (does it mean you should disable the mount and enable the automount?)

Supports parallelized or automatic mounting, when other units require it.

It also means that slow or missing (e.g. network) mounts don't hold up boot (unless required by something) but still get mounted eventually if they can.


Ordering information from fstab is discarded, so to do things like union mounts or bind mounts or such you need to use x-systemd.requires(verify), or write explicit unit files.




Debugging when it doesn't work

For me, items show up in

systemctl list-units --type=automount --all


loaded inactive dead

And specific status showed:

Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)

The logs showed nothing for the automount unit.

Logs did show:

systemd[1]: Dependency failed for Remote File Systems.
systemd[1]: Job remote-fs.target/start failed with result 'dependency'.

Which really only seems to mean "didn't work"

Changing to the directory shows:

Couldn't chdir to /data/mystuff: No such device
This seems to be because systemd injects its own mounty weirdness:
systemd-1 on /data/mystuff type autofs (rw,relatime,fd=34,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=568076)

The actual error, which only showed when I switched back to manual mounting, was that the password had expired.

Soooooo yeah looks like you're on your own here.


Used to create dependencies, mostly to represent the state of the whole system, and of subsystems that other units may depend on.

Often "a group of things you care to name".

A good number of targets already there are for checkpoints during boot (local-filesystem, remote-filesystems), or for specific subsystems (bluetooth, sound). To get an idea, see

systemctl list-units --type=target --all

Used whatever way you like, though - it's fairly easy to imitate a runlevel system with systemd.

Some targets symbolize specific actions (see e.g. the special-cased targets mentioned below).

Note that people who write higher-level services mostly care about:

  • multi-user.target
    (imitation runlevel 3(-ish))

and maybe

  • graphical.target
    (in imitation of runlevel 5)

Special system units:

Related to power state and/or boot state:

  • rescue.target - like emergency.target, but also pulls in basic boot, and filesystem mounts (so single user mode without most services). Can be used with kernel option 1, shorthand for systemd.unit=rescue.target
  • emergency.target - starts emergency shell on the console without anything else. Can be used with kernel option emergency, shorthand for systemd.unit=emergency.target
  • initrd-fs.target
  • initrd-root-device.target - reached when the root filesystem device is available, but before it has been mounted.
  • initrd-root-fs.target
  • default.target - the target systemd tries to go to, often a symlink to multi-user.target or graphical.target (can be overridden with systemd.unit= kernel option)
  • basic.target - basic boot-up.
  • cryptsetup.target - for encrypted block devices
  • remote-cryptsetup.target - like cryptsetup.target but for those from _netdev entries
  • local-fs.target - systemd-fstab-generator creates mount units that depend on this
  • swap.target - like local-fs.target, but for swap partitions/files. See also #swap
  • remote-fs.target - like local-fs, for remote mountpoints
  • network-online.target
note that remote mountpoints automatically pull this in

  • kbrequest.target - systemd starts this target whenever Alt+ArrowUp is pressed on the console. Note that any user with physical access to the machine will be able to do this, without authentication, so this should be used carefully.
  • ctrl-alt-del.target - used when C+A+D is seen on console. Often a symlink to reboot.target
  • poweroff.target
  • exit.target - shutdown. basically the same as poweroff.target
  • shutdown.target - apparently the 'terminate services' part. By default services are hooked into this (see DefaultDependencies=yes)
  • reboot.target
  • final.target
  • halt.target
  • kexec.target - shutdown / rebooting via kexec
  • sigpwr.target - usually for UPS signals
  • suspend.target - A special target unit for suspending the system.
  • hibernate.target -
  • hybrid-sleep.target -
  • suspend-then-hibernate.target -
  • sleep.target - pulled in by suspend.target, hibernate.target, hybrid-sleep.target to centralize shared logic
  • multi-user.target - multiuser system, but non-graphical. Usually a step towards graphical.target
  • graphical.target - graphical login screen. Pulls in multi-user.target
  • system-update.target
  • system-update-pre.target
  • system-update-cleanup.service
used for offline system updates. See also systemd-system-update-generator
  • runlevel0.target -> poweroff.target
  • runlevel1.target -> rescue.target
  • runlevel2.target -> multi-user.target
  • runlevel3.target -> multi-user.target
  • runlevel4.target -> multi-user.target
  • runlevel5.target -> graphical.target
  • runlevel6.target -> reboot.target

Boot ordering, otherwise passive:

  • getty-pre.target
  • getty.target - local TTY
  • cryptsetup-pre.target
  • local-fs-pre.target
  • network.target
  • network-pre.target
  • nss-lookup.target
  • nss-user-lookup.target
  • remote-fs-pre.target
  • rpcbind.target
  • time-sync.target

Device-related - started when a relevant device becomes available [17]

  • bluetooth.target
  • printer.target
  • smartcard.target
  • sound.target

Setup for other unit types:

  • paths.target
see #paths and https://www.freedesktop.org/software/systemd/man/systemd.path.html#
  • slices.target
See #slice and https://www.freedesktop.org/software/systemd/man/systemd.slice.html#
  • sockets.target
  • timers.target
See #timer and https://www.freedesktop.org/software/systemd/man/systemd.timer.html#


  • machines.target - for containers/VMs

systemd can monitor paths (using inotify), for path-based activation of services.



Often alongside a service

Special system units:

  • dbus.socket
note that units with Type=dbus automatically depend on this unit.
  • syslog.socket
userspace log messages will be made available on this socket
see also https://www.freedesktop.org/wiki/Software/systemd/syslog/


For devices (think udev, sysfs), specifically just the ones where ordering or mounting may be relevant so some hooks into systemd are necessary.



Describes system swap files/devices.



Can save the state of systemd units, to later be restored by activating this unit.


Basically a model around cgroups isolation.

Special system units:

  • -.slice
  • system.slice
  • user.slice
  • machine.slice



Special system units:

  • init.scope - (active as long as the system is running)

Enabling, disable; start, stop



Status of a service, including recent log:

systemctl status autossh.service -l

All units:

systemctl list-units

Failed units:

systemctl --failed

init.d wrapping


systemd can map /etc/init.d/* into unit files at runtime, through systemd-sysv-generator, which is run at daemon-reload time.

The precise behaviour (fetching useful details from the LSB header, and what the fallbacks are, and where that changed and isn't entirely in line with the documentation) takes some digging to find out, see e.g. https://unix.stackexchange.com/questions/233468/how-does-systemd-use-etc-init-d-scripts


Inspecting Logs

All logs:

journalctl

And you may like:

journalctl --no-pager
journalctl -f           # follow

Filtering examples:

Errors from this boot:

journalctl -b -p err

Time range:

journalctl --since 12:00 --until 12:30

Kernel messages:

journalctl -k

Logs for a service (/unit):

journalctl -u unitname
# which seems truncated so perhaps (1000 lines from this boot)
journalctl -u unitname -b -n1000 --no-pager

For the last it's possibly useful to see which units have logged stuff:

journalctl -F _SYSTEMD_UNIT

Some things you might expect to be special as they were in syslog may not be, and you may need to get at them like:

journalctl _COMM=cron



See total log size:

journalctl --disk-usage

Gives something like

Archived and active journals take up 704.0M in the file system.

Clean up archived logs, by backlog time:

journalctl --vacuum-time=2d

or total size

journalctl --vacuum-size=500M

It seems you cannot

  • inspect size per unit
  • limit size per unit
  • remove log entries by unit.

If one unit spammed, or you logged something secret, the only thing you can do is the above.

The only solution (to the last) I've seen so far is a third party script (https://github.com/Mortal/cournal) that

  • opens system.journal, copies out only the entries you want
  • ...but that seems a bad idea on one it's currently writing to?

Also, vacuuming works on archived journal files, not the active ones. If you want it to effectively work on everything, first do a:

journalctl --rotate



The basic config file is /etc/systemd/journald.conf

Though note that it is overridden by drop-in files such as /etc/systemd/journald.conf.d/*.conf

Disk or RAM

  • Storage=volatile
goes to /run/log/journal (created if necessary)
  • Storage=persistent
goes to /var/log/journal (created if necessary), falling back to the above in certain cases
  • Storage=auto
if the target location (/var/log/journal) exists, go for persistent, otherwise fall back to volatile

Note that journald keeps file size limited.

Disk space limit

  • SystemMaxUse (default is 10% of size, capped at 4GB)
  • RuntimeMaxUse
  • SystemKeepFree (default is 15% of size, capped at 4GB)
  • RuntimeKeepFree

...and more settings like it (some of which also effectively control rotation).

Note that, among the log settings,

  • those starting with System apply to persistent logs (/var/log/journal)
  • those starting with Runtime apply to volatile logs (/run/log/journal)

On how systemd does logging

Systemd centralizes many log sources, including:

  • libc syslog()
  • /dev/log
  • kernel messages (printk())
  • dmesg
  • its own services' stdout/stderr
  • native protocol

...in its journal daemon.


Which stores to disk or RAM (according to journald.conf)

Default behaviour varies with distro and has changed over time.

Recent journald itself seems to default to Storage=auto, meaning that

  • if /var/log/journal exists as a directory, it does persistent logging there
  • otherwise it'll go to RAM (specifically via /run/log/journal)


It'll store the text messages you're used to, plus a bunch of extra fields.[18]

Note that most fields are there only under certain conditions.

The more standard and user-controllable fields include:

  • MESSAGE - text as you know it.
  • MESSAGE_ID - 128-bit identifier, recommended to be UUID.
  • PRIORITY - as in syslog: integer between 0 ("emerg") and 7 ("debug"),
  • SYSLOG_FACILITY - facility number
  • SYSLOG_IDENTIFIER - tag text
  • SYSLOG_PID - client's Process ID
  • CODE_FILE, CODE_LINE, CODE_FUNC - code where the message originates, if known and relevant
  • ERRNO -

Fields starting with an underscore cannot be altered by client code, and are added by journald.

  • _PID, _UID, _GID - process, user, and group ID of the source process
  • _COMM, _EXE, _CMDLINE - name, executable path, and the command line of the source process
  • _BOOT_ID - boot ID the message came from
  • _MACHINE_ID - see machine-id(5)
  • _HOSTNAME - source host. Not so relevant unless you're aggregating
  • _TRANSPORT - how the message came here - syslog, journal (sysd's protocol), stdout (service output), kernel, driver (internal), audit
  • _STREAM_ID - (for stdout records), meant to make each service instantiation's stream of output distinctly identifiable
  • _LINE_BREAK - (for stdout records), whether the message ends with a \n
  • _SYSTEMD_UNIT - unit name
  • _SOURCE_REALTIME_TIMESTAMP - earliest trusted timestamp of the message. In microseconds since the epoch UTC

There are also a few more when the source is the kernel:


Fields starting with double underscores are related to internal addressing, useful for serialization, aggregating:

  • __CURSOR
  • __REALTIME_TIMESTAMP - wallclock time of initial reception
  • __MONOTONIC_TIMESTAMP - basically, read [19]



Notes on cooperating with other loggers:


Watching logs

e.g. "how do I get logwatch to work again?"

Options include:

  • poll journal contents via journalctl, e.g.
journalctl --since "1 day ago"
this can make sense for anything that reports per day-or-such, rather than live.

  • follow (stream) it with
journalctl -f
If you want it in a parseable form,
journalctl -f -o json

  • use a library interface like python-systemd.
(That particular one didn't understand rotations when I used it, so was not fit for streaming)
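
For the parseable form, each line that journalctl -o json prints is one JSON object whose keys are the journal fields. A sketch of consuming that: the sample lines are invented, and the parse_entries helper and its priority cutoff are only for illustration. It relies on field values arriving as strings (so PRIORITY needs an int()), which is how I understand the JSON output.

```python
import json


def parse_entries(lines, max_priority=3):
    """Yield (unit, message) for entries at least as severe as max_priority.

    Lower PRIORITY numbers are more severe (0 = emerg, 7 = debug).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Assumption: treat entries without PRIORITY as informational (6).
        priority = int(entry.get("PRIORITY", 6))
        if priority <= max_priority:
            yield entry.get("_SYSTEMD_UNIT", "?"), entry.get("MESSAGE", "")


# Invented sample input; in practice, feed it the output of
#   journalctl -f -o json
sample = [
    '{"PRIORITY": "3", "_SYSTEMD_UNIT": "autossh.service", "MESSAGE": "connection lost"}',
    '{"PRIORITY": "6", "_SYSTEMD_UNIT": "cron.service", "MESSAGE": "job started"}',
]
for unit, message in parse_entries(sample):
    print(unit, message)
```

Any iterable of lines works as input, so the text-mode stdout of a journalctl subprocess could be passed in directly. Messages that are not valid text may show up as something other than a string, which this sketch does not handle specially.
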


Some tools

Time taken for boot:

systemd-analyze

...per service:

systemd-analyze blame

...with some summary of which dependencies are holding others up

systemd-analyze critical-chain

Warnings and errors

Failed to execute operation: Too many levels of symbolic links

Apparently systemd used to refuse unit/service files that are symbolic links.

Update systemd.

If you can't, consider hardlinks. Or just copy files in.

service is not loaded properly: Exec format error