Python notes - logging

From Helpful
Jump to navigation Jump to search

Syntaxish: syntax and language · changes and py2/3 · decorators · importing, modules, packages · iterable stuff · concurrency · exceptions, warnings

IO: networking and web · filesystem

Data: Numpy, scipy · pandas, dask · struct, buffer, array, bytes, memoryview · Python database notes

Image, Visualization: PIL · Matplotlib, pylab · seaborn · bokeh · plotly


Tasky: Concurrency (threads, processes, more) · joblib · pty and pexpect

Stringy: strings, unicode, encodings · regexp · command line argument parsing · XML

date and time


Notebooks

speed, memory, debugging, profiling · Python extensions · semi-sorted

python logging module

The idea behind the model is/allows:

  • filtering based on levels (as most logging systems do)


  • messages mentioning their source (allowing...)
  • filtering based on source (dots making for tree style)
allows e.g. different parts of your app to be filtered differently, and to control that from a central place


  • custom text formatters, e.g. for readability versus parseability


  • registration multiple targets
...in different ways if you want, e.g.
≥debug to file
≥errors to stdout
≥critical mailed to the admin



Implementationwise:

There are three central configurable objects

  • logger objects (the objects via which you send messages, optional log-level filter)
  • handlers (where messages go, optional log-level filter)
  • formatters, which turn the logged data into the thing to be shown/stored

You will tie formatters to handlers, and handlers to logger objects.

When you call a log function (like debug(), info(), warn(), error(), critical(), log()) on the logger object, it figures out all the writing/sending that (the attached handlers and formatters) implies, if any.


Logger objects also have a name, meant to be used (...in that this is how filtering will interpret it...) as a period-separated tree thing to signal the part of the app it came from, e.g. 'appname', 'appname.bgwork', etc.


Note that filtering by log level can happen both on logger objects and on handlers. (defaults to NOTSET, which means no filtering(verify))



Configuration, getLogger(), and names

basicConfig (quick and dirty setup)

A shortest way to start logging is something like:

import logging
logging.basicConfig()        # defaults to stderr, and 
logger = logging.getLogger() # get root logger. You ''can'' go even shorter, but it's confusing in introductions

logger.error("test")


basicConfig is a shorthand that:

creates a StreamHandler (for stdout)
...apparently with level=warning(verify)
with a default Formatter
adds it to the root logger


It lets you configure various aspects of relatively simple logging - consider:

logging.basicConfig(filename='/tmp/app.%d.log'%os.getpid(),
                    level=logging.DEBUG,
                    format="%(asctime)-15s:%(name)s:%(process)d:%(levelname)s:%(message)s")


Tying bits together yourself in code

You can get good control over basic logging with a few lines of code.

Consider:

import logging, logging.handlers

logger = logging.getLogger('myapp')
hdlr = logging.handlers.RotatingFileHandler( '/tmp/myapp.log', maxBytes=1000000, backupCount=5 )
hdlr.setFormatter( logging.Formatter('%(asctime)s %(levelname)s %(message)s') )
logger.addHandler(hdlr)

Or:

import logging, logging.handlers

logger = logging.getLogger('myapp')
hdlr = logging.handlers.SysLogHandler(address='/dev/log')
logger.addHandler(hdlr)
More complex configuration
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

A logger object has a name, a level filter, and can be configured to go to a specific ouput.


The simple use case above uses only one root logger (which has no name), but in larger projects, multiple logger objects often make sense, because this lets you control messages from distinct parts of your app in different ways.


You do this by requesting a logger object for a unique name. That name can then be mentioned in configuration.

It's often reasonable to use the module name itself as a name, in which case logging.getLogger(__name__) is easy to dump everywhere.


You want to configure logging before any logging happens, because...

If you log to the root logger before it has configuration,

it will
say No handlers could be found for logger "loggername"
call logging.basicConfig(), to attach a warning-or-worse to stderr handling to the root logger
this is meant for quick and dirty use

If you log to a named logger before it has configuration,

it will
say "No handlers could be found for logger "name" once
logging for such unconfigured logger objects is discarded.
Note this will also happens when libraries use logging but the application that uses such a library does not
if you want to silence this completely, you'll need a few lines of code
if you use logging in a library, it is preferable that you register a NullHandler by default
...so that users of your library don't get any errors or logging - but they can choose to enable logging


Configuration says what to actually do with events from each named logger.

Configuration itself can come from

  • code - more detailed control
  • file or dict
easy to configure externally (consider e.g. containers)
could be updated while running (by developer and admin alike)

Configuration is typically done at startup, just for practical reasons, largely because of the already-mentioned 'don't log before config'.


Note that getLogger() returns an object that is thread-safe, so you can use the same logger in multiple places without having to hand it around.

On names, hierarchy behaviour (and implictions to the config file)

Dots in your logger names imply hierarchy, in that (split into a list, matched to get hierarchy-like behaviour).


This matters to the way config refers to loggers, and (optional) routing/filtering of messages.

That is, the deepest applicable logger in the config (matching on the qualname) will be used.


Logging will also propagate up to shallower matching loggers, unless you set propagate=0.


The root logger is considered a layer above all named loggers, so is effectively a fallback logger. (When you call getlogger() or getlogger('root'), you get the root logger. But the former is a special case and does not return the same object as the latter)



While debugging, I've found it useful

  • to detail where messages come from.
If you want more than using module names as logger names, consider things like
logging.getLogger('%s.%s'%(__name__, self.__class__.__name__))        # for specific classes
logging.getLogger('%s.%s'%(__name__, sys._getframe().f_code.co_name)) # for function names
...though that does imply littering getLogger calls everywhere
  • to fetch stack trace to put into the message




Config files
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

You can hand a filename to logging.config.fileConfig(). It will be read, parsed with ConfigParser, and the function will create and set the formatters, handlers, and loggers described.


In such a config file, you mention the names of loggers, handlers, and formatters, then detail each of them. A relatively simple example:

# You mention the names of loggers, handlers and formatters you'll be defining later
[loggers]
keys=root,log2
[handlers]
keys=hand1
[formatters]
keys=form1


#then specify each them. The section names are implied from the keys you mentioned above.
[logger_root]
handlers=hand1

[logger_log2]
level=WARNING
handlers=hand1


[handler_hand1]
class=FileHandler
args=('myapp.log', 'w')
level=DEBUG
formatter=form1

[handler_hand2]
class=StreamHandler
level=DEBUG
formatter=form1
args=(sys.stderr,)


[formatter_form1]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
datefmt=

Note: If you get the exception ConfigParser.NoSectionError: No section: 'formatters' , this probaly means you pointed it to a config file path that didn't exist.


(You can't hand it such a configuration in StringIO or text form, though you can likely easily hack that in by imitating the pretty short implementation of fileConfig())


Logging exceptions

Basically, in an Except block you can ask logging to fetch and dump the traceback:

logger.error('error handling file', exc_info=True)

And it seems like this is basically equicalent to:

logger.exception('error handling file')


Because I didn't know this I had previously done this myself:

exc_info = sys.exc_info()  # (exceptiontype, exceptionvalue, tracebackobject) tuple          
if exc_info[0]!=None:
   msg+='\n\n'+'\n'.join(traceback.format_exception(*exc_info))

I guess that's basically what it does too.

Handlers

Existing handler classes include:

  • FileHandler: Basic file appender
  • RotatingFileHandler (writes to a configurable number, of a configurable size, which rotate)
  • TimedRotatingFileHandler: Like the last, but rotates based on time and not size
  • WatchedFileHandler (≥ Py2.6): Re-opens the file when it is changed by something else. For example useful when you log to a file that may be rotated by something else.
Not on windows
  • SocketHandler: TCP logger
  • DatagramHandler: UDP logger
  • HTTPHandler: Log to some web server (GET or POST)
  • SMTPHandler: Send mail
  • MemoryHandler: Allows you to buffer many records and have them emptied into a target handler in chunks
  • SysLogHandler: unix syslog daemon (via an UDP socket, or a local filename (e.g. /dev/log))
  • NTEventLogHandler: local Win NT/2000/XP event log
  • StreamHandler: writes to file-like object -- anything with write() and flush() (e.g. sys.stderr, or your own object)
  • NullHandler: used e.g. when you want to silence a library's logging


Notes:

  • SyslogHandler
defaults to UDP to localhost. If that doesn't work, try specifying address='/dev/log'
you can use facility, but not ident (at least not before py3.3)
https://stackoverflow.com/questions/4261253/how-do-i-set-the-ident-string-when-using-logging-sysloghandler-in-python-2-6

Formatters

Formatters take a LogRecord, and do a msg % args type thing. The basic one is something like:

"%(asctime)s %(name)-19s %(levelname)-5s - %(message)s"

...which, while it doesn't display everything that can be present in a LogRecord, serves most immediate needs.

You may wish to think about parseablility, for example using an unusual divider


The members of LogRecord are mostly interesting when writing custom handlers.

A LogRecord's most interesting members:

  • name - logger name
  • levelno - numeric level
  • levelname - name of level
  • pathname - absolute pathname of the source file
  • lineno - line number in that file with the log call
  • msg - message (format string)
  • args - args from logging call. A tuple that combines with msg

Also:

  • exc_info - exception 3-tuple (or None)
  • func (optional, new since py2.5) - the function from which the logging call was made.


Fiddling with formatting
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Log calls accept additional arguments into **kwargs. The logging module looks for two keys in that:

  • exc_info, which is added to the message
  • extra - updates LogRecord's __dict__ (avoid overwriting the existing keys)


The latter is apparently meant for some minor customisation, the ability to handle extra data in your handles.

The documentation example adds keyword arguments via the formatter, which is probably only convenient when you use this widely through your app (more specifically all logging calls to the handler that does this), because the keys you use in your formatter must be handed in in the extra= dict or you'll get a KeyError. The notes on the documentation example below note that is is only for special circumstances, for example distinguishing between messages from multiple threads.

FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = {'clientip':'192.168.0.1','user':'fbloggs'}
logging.warning("Protocol problem: %s", "connection reset", extra=d)

Which yields:

2006-02-08 22:20:02,165 192.168.0.1 fbloggs  Protocol problem: connection reset


Customization

See also