Init systems and service management

These are primarily notes, not complete in any sense; they exist to contain fragments of useful information.




tl;dr, commands

SysV start/stop

# historically:
/etc/init.d/apache2 start


The service command, when present, likely wraps at least sysv, so these days you can often also do:
service apache2 start

SysV enable/disable

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Note: SysV and its various imitations may vary a little in the details.

The basics are that there are symlinks

  • from directories under /etc/rc.d/ representing runlevels
  • to the scripts in /etc/init.d
  • names like S10service and K40service, where S and K mean start and kill, and the number is the priority within this runlevel


As this is a chore to maintain by hand, the contents of these directories are

  • sometimes tweaked via convenience tools such as
    • update-rc.d (CLI)
    • chkconfig (CLI)
    • rcconf (CLI curses)
    • sysv-rc-conf (CLI curses)
    • bum (GUI)
    • jobs-admin (GUI)
  • sometimes computed from something else
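For example, using the convenience tools mentioned above (apache2 is just a placeholder service name, and which tool you have depends on the distro):

# Debian-style:
update-rc.d apache2 defaults     # create the S/K symlinks for the default runlevels
update-rc.d apache2 disable      # flip the start links to kill links

# RHEL-style:
chkconfig apache2 on
chkconfig apache2 off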

upstart start/stop

initctl interacts with upstart's init daemon.

initctl start apache2
initctl stop apache2


There are specific helper scripts for less typing:

One-time start/stop:

start apache2
stop apache2
restart apache2
status apache2


(These helpers, when present, wrap upstart)

upstart enable/disable

Via basic configuration:

  • a service definition placed in /etc/init will be used
  • its startup requirements can be configured
    e.g. commenting out the 'start on' line will effectively disable it.


(It's arguably a little cleaner to disable existing services via overrides. That is, you can create an /etc/init/<service>.override and have it contain manual, because (man page:) "If an override file is present, the stanzas it contains take precedence over those equivalently named stanzas in the corresponding configuration file [...]". This effectively allows local exceptions without touching the service definitions.)
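For example (apache2 again just a placeholder name):

echo manual | sudo tee /etc/init/apache2.override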

systemd start/stop

systemctl talks to the systemd daemon.

There are some scripts around it.

systemctl start apache2
systemctl status apache2.service
# .service is optional


(The service command, when present, likely wraps systemd, as does RHEL/CentOS's chkconfig - see below.)

systemd enable/disable

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Enabling a service basically means "putting it in a target (unit)" (read up for details), so

systemctl enable apache2
systemctl disable apache2

systemctl typically alters /etc/systemd/system/multi-user.target.wants/, which is similar to what historically was runlevel 3 [1].


Overview of all services:

# active services
systemctl list-units --type=service
# _all_ services
systemctl list-units --type=service --all


RHEL/CentOS also has chkconfig, which seems to unify systemd (via systemctl), sysv, and xinetd.

It has historical origins, like service.
chkconfig --list             
 
# add to startup. This will also turn them on for, by default, runlevels 2 through 5
chkconfig --add iptables
# remove
chkconfig --del ip6tables
# turn on/off at said runlevels
chkconfig mariadb on
# turn on/off at a specific runlevel
chkconfig ip6tables --level 2 off
 
# check whether configured for startup: add only service name
chkconfig network && echo "Network service is configured for startup"
# check whether configured for startup in specific runlevel: name and --level
chkconfig network --level 2 && echo "Network service is configured for startup in level 2"


Start/stop in all (in theory)

The service command was initially meant to run sysv scripts in a predictable environment rather than your shell's (verify).


Then people used it to wrap other things, also during transitions between systems, so if present it's likely to control sysv, upstart (initctl commands), systemd (systemctl commands), and openrc, whichever are applicable.

While you can't count on it always being there, it's convenient when it is.

service apache2 start
service apache2 stop
service apache2 restart
service apache2 status


To enable/disable services you still need the system-specific commands.



init/service systems in more detail

SysV-style init

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Runlevels and init scripts in imitation of how SYSV did it (and of little else from SYSV)

  • scripts in /etc/init.d/ are used to start and control services, many controlling a single process.
  • These scripts can be run directly (e.g.
    /etc/init.d/apache start
    ).
  • ...but usually the basic idea is that sysv starts a particular service in a runlevel (mostly made to deal with order of services at boot time)


Most of these scripts stay short by using provisions from /sbin/runscript via a hashbang, which handles most of the boilerplate and means you only need to flesh out a few major functions. It's also not unusual to see init scripts themselves use start-stop-daemon to start their target executable, e.g. for its pidfile-handling convenience.


From the perspective of some later, fancier alternatives, things that sysvinit lacks include:

  • dependencies - e.g. starting networking before a networked daemon can only be handled by ranking services and starting them in series
  • starting things lazily (only when needed, also including dependencies)
  • starting unrelated things in parallel


SysV init scripts

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Example scripts and snippets
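A minimal sketch of a Debian-style LSB init script, assuming a hypothetical daemon at /usr/local/bin/mydaemon that stays in the foreground (so we let start-stop-daemon background it and write the pidfile):

#!/bin/sh
### BEGIN INIT INFO
# Provides:          mydaemon
# Required-Start:    $remote_fs $network
# Required-Stop:     $remote_fs $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Example daemon
### END INIT INFO

DAEMON=/usr/local/bin/mydaemon
PIDFILE=/var/run/mydaemon.pid

case "$1" in
  start)
    echo "Starting mydaemon"
    # the daemon stays in the foreground, so background it and write the pidfile for it
    start-stop-daemon --start --quiet --background --make-pidfile \
        --pidfile "$PIDFILE" --exec "$DAEMON"
    ;;
  stop)
    echo "Stopping mydaemon"
    start-stop-daemon --stop --quiet --pidfile "$PIDFILE"
    rm -f "$PIDFILE"
    ;;
  restart)
    "$0" stop
    "$0" start
    ;;
  status)
    if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
      echo "mydaemon is running"
    else
      echo "mydaemon is not running"
      exit 3
    fi
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status}"
    exit 1
    ;;
esac

exit 0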

upstart

Upstart was developed for Ubuntu, as a more efficient and flexible alternative to (mostly) the above.

It adds things like on-demand starting, and restarting if the process dies.


Short story:

  • /etc/init/name.conf will get picked up
    • used automatically as its 'start on' stanzas dictate, and/or started manually by you
    • output will go to a log in /var/log/upstart/name.log
      (errors may go to syslog, though(verify))


See also:



introduction by example

Two existing examples:

description     "deferred execution scheduler"

start on runlevel [2345]
stop on runlevel [!2345]

expect fork
respawn

exec atd


description "SMB/CIFS File Server"

start on (local-filesystems and net-device-up)
stop on runlevel [!2345]

respawn

pre-start script
        RUN_MODE="daemons"
        [ -r /etc/default/samba ] && . /etc/default/samba
        [ "$RUN_MODE" = inetd ] && { stop; exit 0; }
        install -o root -g root -m 755 -d /var/run/samba
end script

exec smbd -F


Notes:

  • expect fork (and expect daemon) is about single-forking and double-forking processes
  • Instead of exec, you can specify a few lines of script
  • you can specify things to do before and afterwards (also often script-style)


I had at one point written:

description "Temporary autossh-like tunnel for remote access"
author "Me"

start on (local-filesystems and net-device-up IFACE=eth0)
stop on runlevel [016]

setuid worker
setgid worker

# respawn when it disconnects - indefinitely, but not fast, so it doesn't bother things too much
# (note: may still trigger denyhosts/fail2ban)
respawn
respawn limit 0 60

exec ssh -t -R 2222:localhost:22 -t -o "BatchMode=yes" -o "ExitOnForwardFailure=yes" tunneler@sshproxy.example.com

(Note: It sometimes failed to reconnect, I never checked out why)

More technical conf notes

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
What to run

Must have either

  • an exec line, or
  • a script section (run via /bin/sh)

May also have pre-start, post-start, pre-stop, post-stop (exec or script).



Telling when upstart should start and stop

Examples:

start on startup
start on runlevel [23]
start on runlevel [2345]
start on stopped rcS
start on started tty1
start on net-device-up IFACE=eth0
start on net-device-up IFACE!=lo
start on starting mountall
start on filesystem
start on local-filesystems
start on virtual-filesystems or static-network-up
start on (starting network-interface
          or starting network-manager
          or starting networking)


stop on runlevel [!2345]
stop on runlevel [!23]
stop on runlevel [06]
stop on (stopped network-interface JOB=$JOB INTERFACE=$INTERFACE
         or stopped network-manager JOB=$JOB
         or stopped networking JOB=$JOB)


Notes:

  • static-network-up
    • emitted by the /etc/network/if-up.d/upstart
    • when every interface configured as 'auto' in /etc/network/interfaces is up
(which in some cases is not what you want)
  • see also
    man upstart-events
State, return codes, and restarting
Unsorted

See also:



Where output goes
  • console log
    (default(verify)) - stdin to /dev/null, stdout and stderr to /var/log/upstart/name.log
  • console output
    - stdin, stdout, and stderr connected to a console
  • console owner
    - like console output, but the process becomes owner of the console
  • console none
    - everything to /dev/null


logging

loglevels are:

debug
info
message (default)
warn
error
fatal

So to see more you may want:

initctl log-priority info   # or maybe debug

These are basically the active filter on what gets stored in the logs(verify) so have no effect on earlier logging (also, to get this at boot, add --verbose or --debug to the kernel options[2])


See also: http://upstart.ubuntu.com/cookbook/#initctl-log-priority

debugging upstart jobs

See also:


init.d and upstart

systemd

Beyond its original goal of a better init (with a better, more flexible event/dependency system), systemd wants to handle most of the base system, including:

  • file system mount points [3]
  • log processing [4]
  • passwords
  • logins and terminals [5]
including process cleanup after logout (which makes screen / tmux more involved exception cases)
  • power management [6]
  • kernel DBus [7]
  • networking config [8]
  • local DNS[9]
  • date, time, time sync [10] [11]
  • virtualisation / containers [12]
  • sandboxing services [13]
  • sandboxing apps, wrapping apps into images
  • stateless systems, factory reset [14]


Criticism

So, there are roughly two takes on this.


There is a camp of people complaining that going so far beyond init breaks with the "do one thing well, and make it easy to combine" philosophy, that systemd is reinventing bugs that the systems it replaces solved years ago, doesn't always play well with other subsystems, is somewhat tied to its origin distros (mainly Redhat (though amusingly that's the easiest place to find old, less-capable systemd versions because of typical server update policies)), doesn't always care about the linux ecosystem as a whole (consider e.g. systemd dismissing bug reports from the kernel people), has a steep learning curve (to do things right), in part because the documentation isn't always clear.

It is currently easy to point at the rough edges.


There is another camp that points out the goal was never just a better init, it's to unify the system layer (i.e. "all the stuff that supports all your programs") sane enough that you would actually want to talk to it.

Because what we consider "the system" has grown in size and complexity anyway, and become harder to interact with. And while many parts became more automatic and dynamic (udev, automount, etc), each had their own learning curve and invited ignore-the-bordercase-bodginess (like "okay just sleep for a minute to hope the network is up, then run and forget" -- rather than the thing you probably want, to say like "hey please do this thing as soon as that interface comes up, thanks."). Basically because these parts were never really designed to be unified.

Yes, there is an argument that these things just needed better APIs -- or that systemd could have been a spec instead (much like POSIX and opendesktop). But this wasn't on the table in any real way.

So if you see a communicative system layer as a good idea, then systemd is at best a good solution, and at worst still a good push.


Overall, we'll see.

Personally, I've gotten over a bunch of my skepticism, but there is still plenty left (in part due to some current bugs, and inability to find various things in the docs, that on decade-older systems I can just do).



Unit files

Service management is more on-demand event-based (more so than e.g. upstart) which also means booting can be faster.

Being built around dependencies and events, systemd configuration mostly consists of

  • unit files specifying what the parts are, and what they need
  • making them part of the active setup or not

systemd itself figures out what the combination means.


Units are ini-style text config files that represent various system resources.

To most people, the interesting ones are service units.


There are a whole load of predefined units of most of the types, see e.g. https://www.freedesktop.org/software/systemd/man/systemd.special.html



Because it's sometimes useful to have arbitrary strings (anything except NUL), and paths (note: absolute only!) as part of unit names, particularly when autogenerated, there is a (reversible) string escaping from path to filename.

As far as I can gather from systemd.unit:

  • if it's a path:
    • duplicate / are removed
    • / itself is a special case, returned as - ; otherwise continue:
    • leading and trailing / are removed
  • every "/" character is replaced by "-"
  • all other characters (which are not ASCII alphanumeric or "_", and note this includes -) are replaced by "\xhh"-style escapes
  • if the string starts with . that is replaced with \x2e

This is reversible. Note the inverse for paths is why it can only deal with absolute paths.

You can play with systemd-escape.
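For example (made-up path):

systemd-escape -p --suffix=mount "/data/my stuff"
# prints: data-my\x20stuff.mount

(-u / --unescape reverses the escaping)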






Where unit files go

Varies somewhat, for reasons listed in the documentation.

Basically: wherever systemctl daemon-reload can find them.


For context: systemctl daemon-reload is the thing that figures out the dependencies.

For all targets listed in config files, it creates a cache, made of symlinks to the actual unit files (cache in the sense of 'most recent computed state'. Boot uses this too(verify)).


Okay, so, where does systemctl daemon-reload read unit files from?

The directories mentioned below are hardcoded at compile time, so are constant for a system (and typically for a distro) though not always between systems.

In --system mode (mostly the subject here)

  • Units from installed packages
    can be either /lib/systemd/system or /usr/lib/systemd/system
  • Runtime units
    /run/systemd/system - takes precedence over the above(verify)
  • Local configuration (basically for any admin customizations)
    /etc/systemd/system - if a unit here has the same name as a /lib unit file, it overrides that(verify); takes precedence over the above(verify)


Note that if you actually use more than one of these, then you want to know about systemctl cat.
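For example (apache2 as placeholder), this prints every file that contributes to the unit, each preceded by a comment line giving its path:

systemctl cat apache2.service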


You can also get per-user systemd, allowing e.g. user services.

(it seems not all distros allow this, though(verify))

Launched via PAM. There is at most one per user - not per session. And only while they are logged in.

This adds a few directories

  • /usr/lib/systemd/user/ - from installed packages
  • /etc/systemd/user/ - user units from the admin
  • ~/.local/share/systemd/user/ - things you've installed to your homedir
  • ~/.config/systemd/user/ - your own

https://www.brendanlong.com/systemd-user-services-are-amazing.html




So what does systemctl enable do?

Creates a symlink in the appropriate target directory, because it represents the most recent dependency state.

This cache is typically in /etc/systemd/system/*.target.wants/ (and /etc/systemd/system/*.target.requires/?)


Note: do a systemctl daemon-reload before e.g. an enable.


https://www.freedesktop.org/software/systemd/man/systemd.unit.html

https://www.freedesktop.org/software/systemd/man/systemd.special.html#Special%20System%20Units

service

Special system units:

  • dbus.service
  • display-manager.service

The display manager service. Usually, this should be aliased (symlinked) to gdm.service or a similar display manager service.



Create a file named something.service, with usually at least the sections [Unit], [Install], and [Service] (The first two are generic for unit files, the third is specific to services).

[Unit]
Description=Run a thing

[Install]
WantedBy=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/bin/thing
On type

Type is largely about when to consider it not merely starting but actually started, which matters to inspecting status, and to units that list this one as a dependency.

  • simple - the started process is the one we care about (it won't quit and won't fork). Any child processes are ignored.
    The service is considered started as soon, and as long, as it runs.
  • idle - like simple, but also tries to delay startup to the end of the current transaction, at most 5 seconds later. (Not meant as reliable ordering or serializing, still nice for some debug/cosmetics)
  • oneshot - like simple, but blocks until the first-started process stops, then goes to inactive
    • meant for services that aren't really services but some one-time tweak
    • discouraged in that you can't really tell whether it failed or not
  • forking - run, wait until it has forked off (typical for daemons) and exited before it is considered started. If the service can write a pidfile, you may wish to tell systemd about it (PIDFile=)
  • dbus - wait for the name set by BusName= to appear on DBus
  • notify - the service signals readiness via systemd itself, specifically the sd_notify call

The default is
  • simple when ExecStart= is specified (and Type= and BusName= are not)
  • oneshot when ExecStart= isn't specified (and Type= is not)



On the various states
On dependencies and ordering
  • Wants= - units listed will be started when this one is. If any fail, we ignore that.
    recommended for most services, for robustness - in that a single failed service shouldn't stop the rest of the system from coming up.
  • Requires= - units listed will be started when this one is. If any fail, we fail as well.
  • Requisite= - like Requires, but a dependency not already running means we fail, instead of starting it
  • BindsTo= - like Requires, plus when a dependency is stopped, everything that depends on it is stopped as well
  • PartOf= - when the unit listed here is stopped or restarted, this unit is as well.
  • PropagatesReloadTo=
  • Conflicts= - lists units that are exclusive with us. Starting any unit within a conflict means the others are stopped.


You can define a want/requirement in both relevant units of a relationship:

Wants / WantedBy
Requires / RequiredBy
BindsTo / BoundBy
PartOf / ConsistsOf
Requisite / RequisiteOf
PropagatesReloadTo / ReloadPropagatedFrom

This is mostly pragmatics, e.g. in what happens when you remove your unit. For example, you'd often use Wants for other services, whereas you'd use WantedBy to become part of the target you want to be part of.



It seems like systemd will parallelize all dependency startups. Often you need something to run before you do, which is why you'd regularly also add:

  • After=
  • Before=

Note that it often makes sense to use After when using Wants, Before when using WantedBy

It looks like you could easily use these without Wants/Requires, for logic like "if a syslogger service is enabled then load after it, but if it isn't, just go on" (verify)





Note also that certain cases make for automatic dependencies [15]

e.g. services with Type=dbus automatically get Requires=dbus.socket and After=dbus.socket
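As a quick sketch of how these combine (the unit names here are made up):

[Unit]
Description=Hypothetical web API
# start the database alongside us, but don't fail if it can't start
Wants=postgresql.service network-online.target
# ...and make sure those are ordered before us, not merely started in parallel
After=postgresql.service network-online.target

[Service]
ExecStart=/usr/local/bin/my-api

[Install]
# 'systemctl enable' symlinks us into multi-user.target.wants/
WantedBy=multi-user.target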


Execution and environment

The Exec* directives expect the executable to be an absolute path, mostly to avoid ambiguity.

In some cases you have to cheat, and can do so with

/bin/bash -c 'the thing you want'


  • ExecReload= - command to run when asked to reload (if absent, the unit doesn't support reload)


Restarting after crashes

  • Restart [16]
    • no (default) - don't try
    • always
    • on-success - clean exit code
    • on-failure - unclean exit code, signal, timeout, or watchdog
    • on-abnormal - signal, timeout, or watchdog
    • on-abort - signal
    • on-watchdog - watchdog
  • RestartSec (default 100ms) - sleep before attempting restart
  • Also the more generic rate limiting from StartLimitIntervalSec and StartLimitBurst [17]
basically, if a unit attempts starting more than burst times within interval, the unit will no longer try to restart (note that a later manual restart resets this)
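A sketch of what that can look like in a unit file (note: in newer systemd the StartLimit* settings belong in [Unit]; older versions took them in [Service]):

[Unit]
# give up if we end up restarting more than 5 times within 10 minutes
StartLimitIntervalSec=600
StartLimitBurst=5

[Service]
ExecStart=/usr/local/bin/mydaemon
Restart=on-failure
RestartSec=5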



Periodic forced reload

E.g. when you have something with a bit of a memory leak you can easily restart at 4AM or whatnot.

Not a direct feature, but can be imitated if you can change Type=notify (and don't actually notify), and put the timeout to however often you want it to happen, e.g.:

Type=notify
WatchdogSec=10
Restart=always

https://www.freedesktop.org/software/systemd/man/systemd.service.html#WatchdogSec=


input/output/logging

https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Logging%20and%20Standard%20Input/Output


Security and sandboxing

You probably often want to use User= and Group= for the effective user and group (name or ID), because these default to root.


There are a lot of things you can lock down, see: https://www.freedesktop.org/software/systemd/man/systemd.exec.html
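A few commonly used directives, as a sketch (the user and path are placeholders; see that page for what each directive actually implies):

[Service]
User=www-data
Group=www-data
NoNewPrivileges=true
PrivateTmp=true
# mount most of the filesystem read-only for this service...
ProtectSystem=strict
ProtectHome=true
# ...except what it actually needs to write
ReadWritePaths=/var/lib/mything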


Parametrizing

If the service name contains an @, like vpn@username.service, then that username becomes the instance name, which you can fetch using %i (escaped) or %I (unescaped)

(note that the escaped instance name is filesystem-safe(verify), so can also e.g. be used for pidfile names)

You might e.g. use this to pass through a single parameter. [18]
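For example, a template unit mirroring the upstart tunnel example above could look something like the following sketch, saved as /etc/systemd/system/tunnel@.service (names and details made up). Starting it as tunnel@tunneler.service makes %i expand to tunneler:

[Unit]
Description=SSH tunnel for %I
Wants=network-online.target
After=network-online.target

[Service]
# %i is the (escaped) part after the @ in the instance name
ExecStart=/usr/bin/ssh -N -o "BatchMode=yes" -o "ExitOnForwardFailure=yes" -R 2222:localhost:22 %i@sshproxy.example.com
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target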


Multiple-process services
Conditions and assertions

See also:

timer
mount
automount
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


You need

  • a .mount unit file
  • a .automount unit file
  • to name them using the systemd-escaped path. You probably want to use systemd-escape for this.


There is an alternative, to have this runtime-generated from fstab (done by systemd-fstab-generator, presumably at daemon-reload time).

The least you need for this is adding x-systemd.automount to its options.

There are further options, like idle-disconnection (TODO: find an actual list, can't find it in documentation)
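One such option seems to be x-systemd.idle-timeout= (unmount again after being idle that long). For example, an fstab line for a hypothetical NFS share that is mounted on first access:

nas:/export/data  /data/nas  nfs  noauto,x-systemd.automount,x-systemd.idle-timeout=600  0  0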



Explicit unit file

Alongside a (same-named(verify)) .mount

Automount means it gets mounted only when first accessed. (does it mean you should disable the mount and enable the automount?)

Supports parallelized or automatic mounting, when other units require it.


It also means that slow or missing (e.g. network) mounts don't hold up boot (unless required by something) but still get mounted eventually if they can.



Notes

Ordering information from fstab is discarded, so to do things like union mounts or bind mounts or such you need to use x-systemd.requires(verify), or write explicit unit files.


https://www.freedesktop.org/software/systemd/man/systemd.mount.html

https://www.freedesktop.org/software/systemd/man/systemd.automount.html

https://codingbee.net/tutorials/rhcsa/rhcsa-automounting-using-systemd-and-autofs


Debugging when it doesn't work

For me, items show up in

systemctl list-units --type=automount --all

as

loaded inactive dead

And specific status showed:

Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)


The logs showed nothing for the automount unit.

Logs did show:

systemd[1]: Dependency failed for Remote File Systems.
systemd[1]: Job remote-fs.target/start failed with result 'dependency'.

Which really only seems to mean "didn't work"


Changing to the directory shows:

Couldn't chdir to /data/mystuff: No such device
This seems to be because systemd injects its own mounty weirdness; mount shows
systemd-1 on /data/mystuff type autofs (rw,relatime,fd=34,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=568076)

The actual error, which only showed when I switched back to manual mounting, was that the password had expired.


Soooooo yeah looks like you're on your own here.

target

Used to create dependencies, mostly to represent states of the whole system, and of subsystems that other units may depend on.

You, and services you write, often care mostly about:

  • multi-user.target
    (imitation of good ol' runlevel 3-ish)
  • graphical.target
    (in imitation of runlevel 5)


Special system units: Many.


Most targets are for checkpoints during boot (local-filesystem, remote-filesystems), or for specific subsystems (bluetooth, sound). To get an idea, see

systemctl list-units --type=target --all

Used whatever way you like, though - it's fairly easy to imitate a runlevel system with systemd.


Special system units:

Related to power state and/or boot state:

  • rescue.target - like emergency.target, but also pulls in basic boot, and filesystem mounts (so single user mode without most services). Can be used with kernel option 1, shorthand for systemd.unit=rescue.target
  • emergency.target - starts an emergency shell on the console without anything else. Can be used with kernel option emergency, shorthand for systemd.unit=emergency.target
  • initrd-fs.target
  • initrd-root-device.target - reached when the root filesystem device is available, but before it has been mounted.
  • initrd-root-fs.target
  • default.target - the target systemd tries to go to, often a symlink to multi-user.target or graphical.target (can be overridden with systemd.unit= kernel option)
  • basic.target - basic boot-up.
  • cryptsetup.target - for encrypted block devices
  • remote-cryptsetup.target - like cryptsetup.target but for those from _netdev entries
  • local-fs.target - systemd-fstab-generator creates mount units that this target pulls in
  • swap.target - like local-fs.target, but for swap partitions/files. See also #swap
  • remote-fs.target - like local-fs, for remote mountpoints
  • network-online.target
note that remote mountpoints automatically pull this in


  • kbrequest.target - systemd starts this target whenever Alt+ArrowUp is pressed on the console. Note that any user with physical access to the machine will be able to do this, without authentication, so this should be used carefully.
  • ctrl-alt-del.target - used when C+A+D is seen on console. Often a symlink to reboot.target
  • poweroff.target
  • exit.target - shutdown. basically the same as poweroff.target
  • shutdown.target - apparently the 'terminate services' part. By default services are hooked into this (see DefaultDependencies=yes)
  • reboot.target
  • final.target
  • halt.target
  • kexec.target - shutdown / rebooting via kexec
  • sigpwr.target - usually for UPS signals
  • suspend.target - A special target unit for suspending the system.
  • hibernate.target -
  • hybrid-sleep.target -
  • suspend-then-hibernate.target -
  • sleep.target - pulled in by suspend.target, hibernate.target, hybrid-sleep.target to centralize shared logic
  • multi-user.target - multiuser system, but non-graphical. Usually a step towards graphical.target
  • graphical.target - graphical login screen. Pulls in multi-user.target
  • system-update.target
  • system-update-pre.target
  • system-update-cleanup.service
used for offline system updates. See also systemd-system-update-generator
  • runlevel0.target -> poweroff.target
  • runlevel1.target -> rescue.target
  • runlevel2.target -> multi-user.target
  • runlevel3.target -> multi-user.target
  • runlevel4.target -> multi-user.target
  • runlevel5.target -> graphical.target
  • runlevel6.target -> reboot.target


Boot ordering, otherwise passive:

  • getty-pre.target
  • getty.target - local TTY
  • cryptsetup-pre.target
  • local-fs-pre.target
  • network.target
  • network-pre.target
  • nss-lookup.target
  • nss-user-lookup.target
  • remote-fs-pre.target
  • rpcbind.target
  • time-sync.target


Device-related - started when a relevant device becomes available [19]

  • bluetooth.target
  • printer.target
  • smartcard.target
  • sound.target


Setup for other unit types:

  • paths.target
see #paths and https://www.freedesktop.org/software/systemd/man/systemd.path.html#
  • slices.target
See #slice and https://www.freedesktop.org/software/systemd/man/systemd.slice.html#
  • sockets.target
  • timers.target
See #timer and https://www.freedesktop.org/software/systemd/man/systemd.timer.html#


Other

  • machines.target - for containers/VMs
path

systemd can monitor paths (using inotify), for path-based activation of services.

https://www.freedesktop.org/software/systemd/man/systemd.path.html
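A minimal sketch (names made up): a .path unit that starts the same-named service when a file appears.

# /etc/systemd/system/process-upload.path
[Unit]
Description=Watch for new uploads

[Path]
# activates process-upload.service (same name) when this path exists
PathExists=/var/spool/uploads/trigger

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/process-upload.service
[Service]
Type=oneshot
ExecStart=/usr/local/bin/process-uploads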

socket

Often alongside a service


Special system units:

  • dbus.socket
note that units with Type=dbus automatically depend on this unit.
  • syslog.socket
userspace log messages will be made available on this socket
see also https://www.freedesktop.org/wiki/Software/systemd/syslog/


device

For devices (think udev, sysfs), specifically just the ones where ordering or mounting may be relevant so some hooks into systemd are necessary.

https://www.freedesktop.org/software/systemd/man/systemd.device.html#

swap

Describes system swap files/devices.

https://www.freedesktop.org/software/systemd/man/systemd.swap.html#

snapshot

Can save the state of systemd units, to later be restored by activating this unit.


slice

Basically a model around cgroups isolation.


Special system units:

  • -.slice
  • system.slice
  • user.slice
  • machine.slice


https://www.freedesktop.org/software/systemd/man/systemd.slice.html#

scope

Special system units:

  • init.scope - (active as long as the system is running)

Enable, disable; start, stop

Status

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Status of a service, including recent log:

systemctl status autossh.service -l

All units:

systemctl list-units

Failed units:

systemctl --failed


init.d wrapping

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


systemd can map /etc/init.d/* into unit files at runtime, through systemd-sysv-generator, which is run at daemon-reload time.



The precise behaviour (fetching useful details from the LSB header, what the fallbacks are, and where that changed and isn't entirely in line with the documentation) takes some digging to find out, see e.g. https://unix.stackexchange.com/questions/233468/how-does-systemd-use-etc-init-d-scripts

journald

Inspecting Logs

All logs:

journalctl

And you may like:

journalctl --no-pager    
journalctl -f           # follow


Filtering examples:

Errors from this boot:

journalctl -b -p err

Time section

journalctl --since 12:00 --until 12:30

Kernel messages:

journalctl -k


Logs for a service (/unit):

journalctl -u unitname
# whose output seems truncated, so perhaps (1000 lines from this boot):
journalctl -u unitname -b -n1000 --no-pager

For the last it's possibly useful to see which units have logged stuff:

journalctl -F _SYSTEMD_UNIT


Some things you might expect to be special as they were in syslog may not be, and you may need to get at them like:

journalctl _COMM=cron


Fancier

http://0pointer.de/blog/projects/journalctl.html
Admin
Config
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

The basic config file is

/etc/systemd/journald.conf

Though note that it is overridden by

/etc/systemd/journald.conf.d/*.conf
/run/systemd/journald.conf.d/*.conf
/usr/lib/systemd/journald.conf.d/*.conf


Disk or ram

  • Storage=volatile
goes to /run/log/journal (created if necessary)
  • Storage=persistent
goes to /var/log/journal (created if necessary), falling back to the above in certain cases
  • Storage=auto
if the target location (/var/log/journal) exist, go for persistent, otherwise fall back to volatile


Note that journald keeps file size limited.

Disk space limit

  • SystemMaxUse (default is 10% of size, capped at 4GB)
  • RuntimeMaxUse
  • SystemKeepFree (default is 15% of size, capped at 4GB)
  • RuntimeKeepFree

...and more settings like it (some of which also effectively control rotation).

Note that log settings starting with System apply to persistent logs (/var/log/journal), and those starting with Runtime apply to /run/log/journal.
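For example, a drop-in (in one of the directories listed above) that forces persistent logging and caps its size - a sketch:

# /etc/systemd/journald.conf.d/size.conf
[Journal]
Storage=persistent
SystemMaxUse=500M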
On how systemd does logging

Systemd centralizes all logs, including:

  • libc syslog()
  • /dev/log
  • native protocol
  • services' stdout/stderr
  • kernel messages (printk())
  • dmesg

...in its journal daemon.


Where

Which stores to disk or RAM (according to journald.conf). Default behaviour has changed over time, and varies with distro.

Recent journald seems to default to Storage=auto, meaning that

  • if /var/log/journal exists as a directory, it does persistent logging there,
  • otherwise it'll go to RAM, specifically under /run/log/journal


Fields

It'll store the text messages you're used to, but also a whole bunch of fields.[20]

Note that most fields are there only under certain conditions.


Standard and user-controllable fields are:

  • MESSAGE - text as you know it.
  • MESSAGE_ID - 128-bit identifier, recommended to be UUID.
  • PRIORITY - as in syslog: integer between 0 ("emerg") and 7 ("debug"),
  • SYSLOG_FACILITY - facility number
  • SYSLOG_IDENTIFIER - tag text
  • SYSLOG_PID - client's Process ID
  • CODE_FILE, CODE_LINE, CODE_FUNC - code where the message originates, if known and relevant
  • ERRNO -


Fields starting with an underscore cannot be altered by client code, and are added by the journal.

  • _PID, _UID, _GID - process, user, and group ID of the source process
  • _COMM=, _EXE=, _CMDLINE - name, executable path, and the command line of the source process
  • _BOOT_ID - boot ID the message came from
  • _MACHINE_ID - see machine-id(5)
  • _HOSTNAME - source host. Not so relevant unless you're aggregating
  • _TRANSPORT - how the message came here - syslog, journal (sysd's protocol), stdout (service output), kernel, driver (internal), audit
  • _STREAM_ID - (for stdout records), meant to make each distinct service instantiation's stream of output identifiable
  • _LINE_BREAK - (for stdout records), whether the message ends with a \n
  • _SYSTEMD_UNIT - unit name
  • _SYSTEMD_CGROUP, _SYSTEMD_SLICE, _SYSTEMD_USER_UNIT, _SYSTEMD_SESSION, _SYSTEMD_OWNER_UID
  • _CAP_EFFECTIVE
  • _AUDIT_SESSION, _AUDIT_LOGINUID
  • _SELINUX_CONTEXT
  • _SOURCE_REALTIME_TIMESTAMP - earliest trusted timestamp of the message. In microseconds since the epoch UTC
  • _SYSTEMD_INVOCATION_ID

There are also a few more when the source is the kernel:

  • _KERNEL_DEVICE
  • _KERNEL_SUBSYSTEM
  • _UDEV_SYSNAME
  • _UDEV_DEVNODE
  • _UDEV_DEVLINK


Fields starting with double underscores are related to internal addressing, useful for serialization, aggregating:

  • __CURSOR
  • __REALTIME_TIMESTAMP - wallclock time of initial reception
  • __MONOTONIC_TIMESTAMP - basically, read [21]



https://wiki.archlinux.org/index.php/Systemd#Journal

https://www.freedesktop.org/software/systemd/python-systemd/journal.html


Notes on cooperating with other loggers:

https://www.freedesktop.org/wiki/Software/systemd/syslog/


Watching logs

e.g. "how do I get logwatch to work again?"


Options include:

  • poll journal contents via journalctl, e.g.
journalctl --since "1 day ago"
this can make sense for anything that reports per day-or-such, rather than live.


  • follow (stream) it with
journalctl -f
If you want it in a parseable form,
journalctl -f -o json


  • interface like python-systemd.
(That particular one didn't understand rotations when I used it, so was not fit for streaming)


https://tim.siosm.fr/blog/2014/02/24/journald-log-scanner-python/

Some tools

Time taken for boot:

systemd-analyze

...per service:

systemd-analyze blame

...with some summary of which dependencies are holding others up

systemd-analyze critical-chain


Warnings and errors

Failed to execute operation: Too many levels of symbolic links

Apparently systemd used to refuse unit/service files that are symbolic links.

Update systemd.

If you can't, consider hardlinks. Or just copy files in.


service is not loaded properly: Exec format error

OpenRC

Like SysV, but cleaner and easier to handle. (verify)

https://wiki.gentoo.org/wiki/Project:OpenRC

https://en.wikipedia.org/wiki/OpenRC


Epoch

http://universe2.us/epoch.html

https://github.com/Subsentient/epoch


finit

http://troglobit.com/projects/finit/

SMF (solaris)

http://www.oracle.com/technetwork/articles/servers-storage-admin/intro-smf-basics-s11-1729181.html

launchd (osx)

https://en.wikipedia.org/wiki/Launchd

Some supporting concepts and utilities

runlevels

Runlevels came from sysv but were adopted more widely.

There are usually six levels, and they usually represent specific states of the system.

Over time, they have most usually meant something like:

  • 0 shutdown/halt
  • S/s/1 single-user, possibly all the same, possibly all mildly different, possibly only 1 exists
(s isn't a runlevel internally; it tells init "go to 1 but also do this specific thing")
used as a "everyone keep out while I fix low-level stuff" mode
  • 2 multi-user mode, no networking (or same as 3)
  • 3 multi-user mode
  • 4 not used (or same as 3)
  • 5 multi-user mode graphical interface (or same as 3)
  • 6 reboot


More recently it's usually simpler, something like

  • 0 shutdown
  • 1 single-user
  • 3 graphical multi-user mode
  • 6 reboot
and everything not named aliased to 3


And since that's only two actual states, one of which is rare, modern init systems have moved away from runlevels as such (also because the sysv way of administering them is a bit involved)


...but may still imitate them. For example, in systemd you may find:

runlevel0.target is a link to poweroff.target
runlevel1.target is a link to rescue.target
runlevel2.target is a link to multi-user.target
runlevel3.target is a link to multi-user.target
runlevel4.target is a link to multi-user.target
runlevel5.target is a link to graphical.target
runlevel6.target is a link to reboot.target

and e.g. booting to one via kernel parameter would be

systemd.unit=multi-user.target

where classically it was sticking the number of the runlevel on the end


See also:

pidfiles

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Pidfiles are used as an on-filesystem reminder of which process (by its PID) is which instance of that program, and are most typically used in service management.

It's simply a text file that stores the PID (sometimes followed by a newline, sometimes not), not unusually stored in /var/run, and not unusually with the .pid extension.

In effect, pidfiles are (filesystem-based) advisory locking, though they allow more than locking - you can fairly easily:

  • resolve a service to the process that currently represents it
  • check whether that service is (still) running
  • send signals to the right process (reload, kill, etc.)
  • allow multiple instantiations of a program/service (without ambiguity)


If a program writes its own pidfile, you often want to be able to tell it where to write it. Aside from avoiding hardcoded assumptions, it also means that you can move all the responsibility for this to whatever does the service management. It's also basically necessary if you want to allow multiple instantiations.



Of course, this all relies on the assumption that the process a pidfile points at is actually the one that we started.

The most significant possible problem in working with pidfiles is that of a stale pidfile: if a pidfile was not removed when the service process quit (or crashed, as there's often nothing watching for that case, or whether the system didn't shut down cleanly), then the pidfile may not refer to a present process, or it may point to a completely unrelated process (and there usually isn't a simple way for process management to check the latter).

Stale pidfiles caused by crashes can be avoided by adding a guarding process, such as start-stop-daemon. Regardless of the way the service quit (cleanly and removed the pidfile, or not), the wrapping program can remove the pidfile as necessary.

It's generally handiest to have the service process itself create the pidfile when it starts and remove it when it quits. A guardian program is primarily useful as a fallback, for removal only. You probably don't want the guarding process to create the pidfile: while it knows the PID of the process it started, that PID isn't always the one you want - consider forking daemons, apps being started through helper/wrapper scripts, and such.


To use pidfiles fairly robustly, consider the following:

  • having the service process itself create the pidfile, for the reason just mentioned
  • Use a guardian process, and note that you'll need to tell both it and the underlying app the pidfile path
  • having the service process itself remove the pidfile
    • at program exit, including signaled kills (also makes management around it easier)
    • preferably have it do so via an at-program-exit hook, so that it may also happen in controlled crashes
  • if the pidfile a program should write to exists, either complain and quit, or check it as well as we can, and remove it when applicable (when it doesn't point to a process, or we are sure it cannot be another instance of ourselves) before starting. This avoids non-startable daemons caused by stale pidfiles.
    • you can do this either in a possible manager around the service, (and/)or in the program itself. A program may be better at telling whether another copy of itself is running. A manager around it can often only give a warning like "Process might still be running. Check manually, remove /var/run/my.pid if it has stopped, and try again."
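As a sketch of the "program writes and removes its own pidfile" part, in shell (paths hypothetical):

#!/bin/sh
PIDFILE=/var/run/mydaemon.pid

# refuse to start if an existing pidfile points at a live process
# (we can't tell here whether that is really another copy of ourselves)
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "Process might still be running; remove $PIDFILE if it has stopped, and try again." >&2
    exit 1
fi

echo $$ > "$PIDFILE"
# remove the pidfile on normal exit and on common signals
trap 'rm -f "$PIDFILE"' EXIT INT TERM

# ... the actual daemon work would go here ...
while true; do sleep 60; done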



start-stop-daemon

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
start-stop-daemon is a utility which can, among other things, keep track of a pidfile (created by what it starts, or created by itself), change effective user, daemonize, hand in environment variables, chroot, etc. It's a handy tool when starting services.

You can specify process(es) via --exec (full path), --pidfile, --user, and/or --name. The most specific/expressive is probably the pidfile. (Note that you can use start-stop-daemon to stop things we have not started ourselves)


Notes:

  • --start means that unless it already exists, the process is started (using --exec or --startas).
  • --stop means that if a process exists, a signal (see also --signal) will be sent to it.
  • return code depends on case and on use of --oknodo. Generally: 1 on error, 0 on success or when --oknodo is used.


  • if the executable you specify
    • fairly immediately terminates (forks off / starts something else in the background). (note that -m, telling start-stop-daemon to create a pidfile based on the process it runs, doesn't make sense in this case. Because of this, you usually want a --pidfile argument to the process.)
    • doesn't terminate, you can use -b to have start-stop-daemon itself be the thing that forks off, and it will be the parent of the real process. (You can use -m in this case).

Daemons that fork also mean that --exec won't match them, so some init features won't necessarily work, and unless you use one of the other features (e.g. security) you may forego start-stop-daemon.
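For example (mydaemon being hypothetical, and assumed not to fork by itself):

# daemon doesn't fork: let start-stop-daemon background it and write the pidfile itself
start-stop-daemon --start --background --make-pidfile \
    --pidfile /var/run/mydaemon.pid --exec /usr/local/bin/mydaemon

# stop it again via that pidfile
start-stop-daemon --stop --pidfile /var/run/mydaemon.pid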


See also:

tcpwrappers

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

'TCP wrappers' refers to a setup where a daemon (e.g. inetd, xinetd) acts as the network listener, and controls when to launch other network daemons. Basically, when it gets a connection, it runs a daemon.

One main reason was ease of administration: it lets you centralize logging, and the logic to accept/reject on a per-host/net basis (which is the tcpd part(verify)).

(Another is that if accesses to services are rare, their resource use is essentially zero while they're not running)


That said, some services were always better off running their own logic.

Also, logging is often easy enough anyway, and iptables often does fine for access control (and more centralized yet, really),

...so tcpwrappers is not very common anymore.


configuration files

hosts.allow, hosts.deny

unsorted

http://www.atnf.csiro.au/people/rgooch/linux/boot-scripts/ http://www.novell.com/documentation/suse91/suselinux-adminguide/html/ch13s04.html http://www.novell.com/coolsolutions/feature/15380.html



linux

http://www.atnf.csiro.au/people/rgooch/linux/boot-scripts/

gentoo

Uses a customized system.

/etc/runlevels, which is usually updated via rc-update, mentions init scripts, with symlinks that point to items in /etc/init.d/. (note the runlevels are defined in /etc/inittab)

http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=2&chap=4