Logging



This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Logging serves various goals.

It can be handy while debugging, to see state at various stages, particularly in situations where attaching a debugger is impractical, and in cases where even a simple print won't end up in an obvious place, or anywhere at all (e.g. web services, background daemons).

You can argue whether logging statements help or hurt code readability, mostly because they can be used both in clarifying and in rather opaque ways.


Apps that do neat logging are also liked by sysadmins watching their systems -- though most will use some way of reducing information overload anyway. Often this means configurable and redirectable filtering, summaries, and actions taken only on specific errors - usually by some additional log analysis application.


Log levels/severity

As a rough reference and collection.

The below is a union of the choices from various systems; most logging systems use a subset of these, usually half a dozen at most (using all of the below would probably be too complex to apply consistently or read well); a minimal code sketch using a few of them follows the list:

  • TRACE - often used to report small steps in possibly excruciating detail, more so than DEBUG. Usually enabled only conditionally, and often only used when a bug is suspected or known.
  • DEBUG - often used to see whether a particular state or point in code is reached, what value a variable had at some point, and such. Not likely to be emitted by a production version, and likely to be dropped by the syslogger by default.
  • INFO - pretty neutral, often used for "doesn't call for action, but may be interesting to someone reading the logs for details on how the app is doing its thing".
  • NOTICE - something that is probably an expectable part of day-to-day operations, but may merit some attention if you can spare it. If not present, INFO takes this role fairly well.
  • WARNING - "this may signal a problem, or something that may lead to a problem (and perhaps it is avoidable or recoverable)".
  • ERROR - often used for "a real problem happened", which may be temporary but probably meant a user didn't see what they expected, and quite possibly saw an error instead. Often used in a "shit happened, maybe you can avoid it next time if you see this information" way.
  • ALERT (e.g. in syslog) - something that strongly suggests attention/action, probably from the sysadmin.
  • CRITICAL/FATAL - often signals that a process or (part of or whole) application failed and stopped. Typically demands attention.
  • EMERGENCY - if present, it often signals that the system has become unstable, or soon will if the issue behind it is ignored.
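
As a minimal sketch of a few of these, using Python's standard logging module (which implements the subset DEBUG, INFO, WARNING, ERROR, CRITICAL); the logger name and messages are made up:

import logging

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s %(name)s: %(message)s')
log = logging.getLogger('myapp')

log.debug('cache warmed with %d entries', 250)   # dropped: below the configured INFO level
log.info('listening on port %d', 8080)
log.warning('config file missing, falling back to defaults')
log.error('could not reach the database')
log.critical('out of disk space, shutting down')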

syslogging

Sysloggers are daemons that you can expect to be installed and running on all *nix systems.


All use local sockets (e.g. on /dev/log), most (but not all) also allow you to listen on network sockets.

Network-listening sysloggers (UDP port 514) can be useful to log from embedded systems, or to centralize logging for many servers. The protocol is often the BSD syslog protocol (RFC 3164), or the newer IETF syslog protocol (RFC 5424).
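
A sketch of logging to such a central server from Python, using the standard library's SysLogHandler (the hostname 'loghost' and the logger name are placeholders):

import logging
import logging.handlers

# address can be a local socket path such as '/dev/log', or a (host, port)
# tuple, which by default sends BSD-syslog-style messages over UDP
handler = logging.handlers.SysLogHandler(address=('loghost', 514))

log = logging.getLogger('myapp')
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info('centralized logging test')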

Note that sockets make it easy to avoid race conditions when many threads or many processes want to log to the same target; logging libraries that write to files themselves may not be safe this way. Various logging systems are network-based for largely this reason.


Useful for quick and dirty debugging, when you need output to any old watchable place, and installing/learning a serious logging system would be more work.

At most you have to tweak its configuration to make it easier to watch what you want to watch.


Note that sysloggers usually buffer log messages to be nicer in terms of IO (and the resulting latency), which is usually what you want. It does mean that a system crashing fast enough may not manage to log why. Various sysloggers can be told to buffer less, or not at all, but only do that where needed - it is a poor tradeoff for general use.


Facility

Syslog defines a bunch of facilities, which you can think of as one of the properties a syslogger can use to route messages to separate files:

  • LOG_KERN - kernel messages
  • LOG_LPR, LOG_MAIL, LOG_NEWS, LOG_UUCP, LOG_CRON - used by the respective subsystems (print, mail, UUCP, cron)
  • LOG_DAEMON - system daemons
  • LOG_AUTH - authentication/authorization related
  • LOG_USER - applications or processes (often the default for syslog calls)
  • LOG_LOCAL0
  • LOG_LOCAL1
  • LOG_LOCAL2
  • LOG_LOCAL3
  • LOG_LOCAL4
  • LOG_LOCAL5
  • LOG_LOCAL6
  • LOG_LOCAL7

As LOG_USER is often the default for logging libraries, specifying a facility is usually optional.


Modern sysloggers often allow you to filter and split off messages according to facility - and many other things. Doing so based on facility will use slightly less CPU than splitting based on message content.

On servers where you host a few major apps, you could consider using one of LOCAL0..LOCAL7 for each of them. However, this only works when the facility is configurable, rather than hardcoded into the app.


logger is a command line interface to syslog, which can be handy to test such a configuration, for example:
logger -p local0.notice Command line test to LOCAL0

In applications, you often need to specify some constant to do so. For example, in Python:

import syslog
syslog.syslog(syslog.LOG_LOCAL0 | syslog.LOG_NOTICE, 'Python test to LOCAL0')
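
Alternatively, a sketch of setting the facility once via openlog, so that later calls only need a severity (the ident 'myapp' is just an example):

import syslog

syslog.openlog(ident='myapp', facility=syslog.LOG_LOCAL0)
syslog.syslog(syslog.LOG_NOTICE, 'Python test to LOCAL0')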


Software

Sysloggers

syslog-ng

Allows splitting into various files based on facility, program, regexp (faster substring too?(verify)), and more.


http://wiki.egee-see.org/index.php/SyslogNG
http://www.softpanorama.org/Logs/Syslog_ng/filter_functions_with_examples.shtml
http://www.softpanorama.org/Logs/Syslog_ng/configuration_examples.shtml


rsyslog

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Three different kinds of filters:

  • "traditional" severity and facility based selectors, e.g.
auth,authpriv.*                 /var/log/auth.log
local0.*                        /var/log/local0.log
mail.*                          -/var/log/mail.log
# all facilities, all severities:
*.*                             -/var/log/syslog
# filter out specific things:
*.info;mail.none;authpriv.none;cron.none -/var/log/messages
Upsides: fast matching, brief rules
Downsides: no matching by program name, no message inspection, etc.


The other two are more flexible, but longer to specify, and can be somewhat slower.

  • property-based
:programname, startswith, "postgres"   -/var/log/postgres.log
# matching on message content (the target file is just an example)
:msg, contains, "problem"              -/var/log/problems.log


  • RainerScript-based
if $programname == 'popa3d' and $syslogseverity <= '6' then /var/log/popa3d.log
if $programname == 'popa3d' and $syslogseverity <= '6' then ~


Note that:

  • ~ means discard, which can be useful in cases like "I want matches for this to be in this log and absolutely nowhere else"
  • in newer rsyslog versions it has been renamed to stop; ~ is now deprecated
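
With the newer syntax, the same filter-and-discard might look like the following (a sketch; the program name and file path are just examples):

if $programname == 'popa3d' and $syslogseverity <= '6' then {
    action(type="omfile" file="/var/log/popa3d.log")
    stop
}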


On templates


See also: http://www.rsyslog.com/filter-optimization-with-arrays/

http://www.rsyslog.com/doc/property_replacer.html

metalog

No network listener.

log4j and family

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

See also:

logcheck

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Simple tool to read, filter, and then act on specific log entries.

Sysadmins use it to get briefer summaries, automatic notifications, and such.

See also:


logrotate

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

When your logger doesn't rotate, or you want non-syslog things to rotate, you want to look at tools like logrotate.

It is typically run from cron.daily, which is often frequent enough.


Conditions of rotation

Mostly based on time and/or size:

daily / weekly / monthly / yearly control whether logrotate will consider rotating, based on how long ago the log was last rotated (which logrotate tracks in its state file).

If a time condition is the only condition, it will always rotate at this interval. This can e.g. be handy to avoid old output showing up when grepping /var/log/*.log


If you have a time condition and minsize, the result is rotation only when the log is both old enough and at least that size.

This can be handy if you want to avoid huge files, but also avoid a huge number of tiny files. (It's not so handy if you want only recent entries, because slow-growing logs will stick around until they reach that size.)


If you want to base rotation on size and nothing else, you can use size.

(Using size seems to imply ignoring time conditions; using minsize without any time condition seems effectively equivalent.)

This can be handy for logs that grow very quickly.


Note that most conditions mean rotated logs do not represent regular intervals, so do not blindly assume you have exactly as much backlog as (time interval * the number in the rotate setting) suggests.

You can use k, M, and G (on minsize and size).
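
For example, a minimal sketch of a config entry using such conditions (the path and values are made up):

/var/log/myapp/*.log {
    # rotate weekly, but only once the file is at least 10MB
    weekly
    minsize 10M
    # keep eight old logs (see the rotate option below)
    rotate 8
}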


What to do with the log file

A minimal logrotate run moves (renames) the log file, but does not create a new one.

If you instead want logrotate to create a new, empty file with the same name after rotating (some distributions make this the default), use create mode owner group. This does mean you have to get the ownership and permissions right in the logrotate config.

The more minimal and sometimes better solution is to not have logrotate create the file, and instead tell the daemon that writes the log (syslogger or other) to reopen/recreate it, so it can get the owner and permissions right itself (should it matter). Signalling the daemon is usually done via a prerotate and/or postrotate block, each of which is a section of scripting ended by endscript.

A third option is to not move the file at all, but copy its contents and truncate the original in place (see copytruncate). This is slower (moving a file is a filesystem rename, copying contents can involve a lot of data), and entries written between the copy and the truncate can be lost, but it can be handy when it's impossible to signal the daemon to reopen its log and restarting the daemon is not preferable.
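
A sketch of the signalling approach (the daemon, pidfile, and paths are hypothetical; the HUP convention depends on the daemon):

/var/log/myapp/access.log {
    weekly
    rotate 4
    postrotate
        # ask the daemon to reopen its log file
        kill -HUP "$(cat /var/run/myapp.pid)"
    endscript
}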


What to do with the old logs

What to do with the rotated log, and with older log files (a combined sketch follows this list):

  • olddir dirname
    - place rotated logs in this directory. Treated as relative (a subdirectory) unless it is an absolute path. It seems it won't be created for you, so it has to exist.
  • dateext
    - instead of making old versions .1, .2 and such, use a -YYYYMMDD suffix
    • you can control the format with dateformat (strftime-style). Note that it must be lexically sortable by time (hence yyyymmdd), since this is used to determine which files are oldest and can be deleted
    • this style can block rotation for a while, in that logrotate won't overwrite existing rotations, so with yyyymmdd (and no time component) it won't rotate more than once per day (verify)


  • nocompress
    - don't compress old log files, e.g. for easier greppability.
  • compress
    - compress old log files.
  • delaycompress
    - don't compress the most recent backup (e.g. log, log.1, log.2.gz, log.3.gz). Makes it easier/faster to grep recent logs while still saving a bunch of disk space in the long term
  • You can also control what compression program is used


  • notifempty
    - don't rotate log files if they are 0 bytes large. If you do want to know that a certain time saw no log messages (e.g. in some automatic statistics system), you may want the opposite, ifempty
  • rotate n
    - keep no more than n rotated files (for example, daily plus rotate 7 means seven days of backlog)
    • Note that size-based rotation in particular can mean less effective backlog, because it may rotate faster (and still keep only n files)
    • Not specifying rotate means old logs are not kept (meaning logrotate only really truncates logs?)(verify)
    • for eternal logs, choose a very large value. The value seems to be a 32-bit signed int, so rotate 2147483647 is the largest it'll take (that many days of logs is, well, more than you'll ever need)
  • maxage n
    - remove rotated log files once they reach an age, in days. Arguably a little redundant.
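
A combined sketch of these options (paths and counts are just examples; /var/log/old must already exist):

/var/log/myapp/*.log {
    weekly
    rotate 30
    olddir /var/log/old
    dateext
    compress
    delaycompress
    notifempty
}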


Mailing

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Logs can be mailed:

  • whenever they become older than maxage
  • when they are rotated
  • when they would be deleted (rotate / maxage)
  • mail address
    - the address to mail to
  • maillast
    (default behaviour) - send logs when they are about to be removed (because of rotate). Can be useful to archive old logs elsewhere (...though mail is not the most reliable method of doing so).
  • mailfirst
    - send the just-rotated file. Useful to read a recent, e.g. daily, log.
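
For example, a sketch of mailing each just-rotated log somewhere (the address and path are placeholders):

/var/log/myapp/report.log {
    daily
    rotate 7
    mail admin@example.com
    mailfirst
}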


Example

Note: while debugging, particularly if it doesn't ever seem to rotate anything, you can use logrotate -dv conffile. The -v means verbose output; -d is debug mode, which won't actually change anything and just reports what it would do (it also won't run prerotate/postrotate scripts).
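
For instance, to dry-run a single config fragment (the path is just an example):

logrotate -dv /etc/logrotate.d/myapp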


See also

python logging

See Python notes - logging


Considerations

The arguments for logging frameworks

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


print() as logging is simple, and effective for debugging while writing the code. In-the-field debugging of the "can you send me the log file so I can figure out what's going on?" sort? Not so much. You want logging instead, because with print you can't control output without editing code, you can't control the amount of detail, and it won't be saved unless you explicitly do that yourself. In some cases you may even hold up the app, because of how stdout is buffered.


A well designed logging system can help answer most of the following:

  • what happened?
  • where did it happen?
  • when did it happen?
  • how important is it?
  • who wants to know?
  • what do you want to do about it?

The last two are often not very well addressed in custom/makeshift solutions, and using a logging system can mean you have a framework to do most of these, now or later, based on the configuration you hand it.
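
As a sketch of what "who wants to know" and "what to do about it" look like in configuration, using Python's standard logging module (the handler choices, filenames, and logger names are made up): everything goes to a file, but only warnings and worse reach the console.

import logging
import logging.config

logging.config.dictConfig({
    'version': 1,
    'formatters': {
        'full': {'format': '%(asctime)s %(levelname)s %(name)s: %(message)s'},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'level': 'WARNING', 'formatter': 'full'},
        'file': {'class': 'logging.FileHandler', 'filename': 'app.log', 'level': 'DEBUG', 'formatter': 'full'},
    },
    'root': {'level': 'DEBUG', 'handlers': ['console', 'file']},
})

logging.getLogger('db').warning('slow query')      # goes to console and file
logging.getLogger('db').debug('query plan: ...')   # file only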


There are a few other things you might want to consider:

  • most logging systems have a separate log configuration, meaning that admins can change the logging setup without having to recompile the program.
  • A log is not synonymous with a file; it is a data sink that you may eventually want routed and filtered in various ways.


That said, even good logging frameworks tend not to scale down to simple cases very well. That is, they are often unnecessarily complex to set up when all you want is "send stuff to a single file", which can be a barrier to adoption.

Still, frameworks tend to look more complex than they are, because they allow various things that you'll probably never do all at once, or only in unusually complex setups.



Logging to a central log server, database, etc