Python notes - logging
python logging module
The idea behind the model is that it allows:
- filtering based on levels (as most logging systems do)
- messages mentioning their source (allowing...)
- filtering based on source (dots making for tree style)
- allows e.g. different parts of your app to be filtered differently, and to control that from a central place
- custom text formatters, e.g. for readability versus parseability
- registering multiple targets
- ...in different ways if you want, e.g.
- ≥debug to file
- ≥errors to stdout
- ≥critical mailed to the admin
Implementationwise:
There are three central configurable objects
- logger objects (the objects via which you send messages, optional log-level filter)
- handlers (where messages go, optional log-level filter)
- formatters, which turn the logged data into the thing to be shown/stored
You will tie formatters to handlers, and handlers to logger objects.
When you call a log method (like debug(), info(), warning(), error(), critical(), log()) on the logger object, it figures out all the writing/sending that the attached handlers and formatters imply, if any.
Logger objects also have a name, meant to be used (...in that this is how filtering will interpret it...) as a period-separated tree thing to signal the part of the app it came from, e.g. 'appname', 'appname.bgwork', etc.
Note that filtering by log level can happen both on logger objects and on handlers.
(defaults to NOTSET, which means no filtering(verify))
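A minimal sketch of both filtering points, logger-side and handler-side (the logger name here is illustrative):

```python
import logging

logger = logging.getLogger('demo.levels')      # illustrative name
logger.setLevel(logging.DEBUG)                 # logger-side filter: DEBUG and up pass

handler = logging.StreamHandler()              # defaults to sys.stderr
handler.setLevel(logging.WARNING)              # handler-side filter: WARNING and up
handler.setFormatter(logging.Formatter('%(levelname)s %(name)s %(message)s'))
logger.addHandler(handler)
logger.propagate = False                       # keep the example self-contained

logger.debug('passes the logger, but the handler drops it')
logger.error('passes both filters, so this is written to stderr')
```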
Configuration, getLogger(), and names
basicConfig (quick and dirty setup)
A shortest way to start logging is something like:
import logging
logging.basicConfig() # defaults to a stderr StreamHandler, warning-and-up
logger = logging.getLogger() # get root logger. You ''can'' go even shorter, but it's confusing in introductions
logger.error("test")
basicConfig is a shorthand that:
- creates a StreamHandler (for stderr)
- ...apparently with level=warning(verify)
- with a default Formatter
- adds it to the root logger
It lets you configure various aspects of relatively simple logging - consider:
import os, logging
logging.basicConfig(filename='/tmp/app.%d.log' % os.getpid(),
                    level=logging.DEBUG,
                    format="%(asctime)-15s:%(name)s:%(process)d:%(levelname)s:%(message)s")
Tying bits together yourself in code
You can get good control over basic logging with a few lines of code.
Consider:
import logging, logging.handlers
logger = logging.getLogger('myapp')
hdlr = logging.handlers.RotatingFileHandler( '/tmp/myapp.log', maxBytes=1000000, backupCount=5 )
hdlr.setFormatter( logging.Formatter('%(asctime)s %(levelname)s %(message)s') )
logger.addHandler(hdlr)
Or:
import logging, logging.handlers
logger = logging.getLogger('myapp')
hdlr = logging.handlers.SysLogHandler(address='/dev/log')
logger.addHandler(hdlr)
More complex configuration
A logger object has a name, a level filter, and can be configured to go to a specific output.
The simple use case above uses only one root logger (which has no name),
but in larger projects, multiple logger objects often make sense,
because this lets you control messages from distinct parts of your app in different ways.
You do this by requesting a logger object for a unique name.
That name can then be mentioned in configuration.
It's often reasonable to use the module name itself as a name, in which case logging.getLogger(__name__) is easy to dump everywhere.
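A tiny sketch of how such names build the tree (the names here are illustrative; in real code each module would just do logging.getLogger(__name__)):

```python
import logging

# illustrative names; real code would typically use __name__
parent = logging.getLogger('myapp')
child = logging.getLogger('myapp.db')

# the dotted name places 'myapp.db' under 'myapp' in the logger tree
assert child.parent is parent
```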
You want to configure logging before any logging happens, because of the fallback behaviour:
If you log via the root logger (e.g. the module-level logging.warning()) before it has configuration,
- it will
- implicitly call logging.basicConfig(), which attaches warning-or-worse-to-stderr handling to the root logger
- this fallback is meant for quick and dirty use
If you log to a named logger before it has configuration,
- it will
- print "No handlers could be found for logger name" once
- logging for such unconfigured logger objects is discarded.
- Note this will also happen when libraries use logging but the application that uses such a library does not
- if you want to silence this completely, you'll need a few lines of code
- if you use logging in a library, it is preferable that you register a NullHandler by default
- ...so that users of your library don't get any errors or logging - but they can choose to enable logging
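The NullHandler pattern is short. A sketch (the library name is illustrative):

```python
import logging

# Typical library pattern -- put this at the top of your library's root module.
# 'mylib' is an illustrative name.
logging.getLogger('mylib').addHandler(logging.NullHandler())

# The library can then log normally; output is discarded unless the
# application that imports the library configures logging itself.
logging.getLogger('mylib.submodule').debug('discarded by default')
```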
Configuration says what to actually do with events from each named logger.
Configuration itself can come from
- code - more detailed control
- file or dict
- easy to configure externally (consider e.g. containers)
- could be updated while running (by developer and admin alike)
Configuration is typically done at startup, largely because of the already-mentioned 'don't log before config'.
Note that getLogger() returns an object that is thread-safe, so you can use the same logger in multiple places without having to hand it around.
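The dict flavour of external configuration goes via logging.config.dictConfig() (py2.7 / py3.2 and later). A minimal sketch, with illustrative names:

```python
import logging
import logging.config

logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'plain': {'format': '%(asctime)s %(name)s %(levelname)s %(message)s'},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler',
                    'formatter': 'plain',
                    'level': 'DEBUG'},
    },
    'loggers': {
        'myapp': {'handlers': ['console'], 'level': 'INFO'},
    },
})

logging.getLogger('myapp').info('configured via dict')
```

Since this is just a dict, it can as easily come from JSON or YAML read at startup.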
On names, hierarchy behaviour (and implications for the config file)
Dots in your logger names imply hierarchy: the name is split on the dots and matched piecewise, to get tree-like behaviour.
This matters to the way config refers to loggers, and (optional) routing/filtering of messages.
That is, the deepest applicable logger in the config (matching on the qualname) will be used.
Logging will also propagate up to shallower matching loggers, unless you set propagate=0.
The root logger is considered a layer above all named loggers, so effectively acts as a fallback logger. (getLogger() is a special case that returns the root logger; getLogger('root') does not return the same object, just a regular logger that happens to be named 'root')
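A small sketch of that upward propagation (the ListHandler and names are illustrative, just to observe what arrives where):

```python
import logging

class ListHandler(logging.Handler):
    """Minimal handler that collects records in a list, to observe propagation."""
    def __init__(self):
        super().__init__()
        self.records = []
    def emit(self, record):
        self.records.append(record)

parent_side = ListHandler()
logging.getLogger('app').addHandler(parent_side)

child = logging.getLogger('app.worker')
child.addHandler(logging.NullHandler())   # just to keep the example quiet

child.warning('reaches the handler on "app" via propagation')

child.propagate = False
child.warning('this one stops at app.worker')
```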
While debugging, I've found it useful
- to detail where messages come from.
- If you want more than using module names as logger names, consider things like
logging.getLogger('%s.%s' % (__name__, self.__class__.__name__))           # for specific classes
logging.getLogger('%s.%s' % (__name__, sys._getframe().f_code.co_name))    # for function names
- ...though that does imply littering getLogger calls everywhere
- to fetch stack trace to put into the message
Config files
You can hand a filename to logging.config.fileConfig(). It will be read, parsed with ConfigParser, and the function will create and set the formatters, handlers, and loggers described.
In such a config file, you mention the names of loggers, handlers, and formatters, then detail each of them. A relatively simple example:
# You mention the names of loggers, handlers and formatters you'll be defining later
[loggers]
keys=root,log2
[handlers]
keys=hand1
[formatters]
keys=form1
# then specify each of them. The section names are implied from the keys you mentioned above.
[logger_root]
handlers=hand1
[logger_log2]
level=WARNING
handlers=hand1
[handler_hand1]
class=FileHandler
args=('myapp.log', 'w')
level=DEBUG
formatter=form1
[handler_hand2]
class=StreamHandler
level=DEBUG
formatter=form1
args=(sys.stderr,)
[formatter_form1]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
datefmt=
Note: If you get the exception ConfigParser.NoSectionError: No section: 'formatters' , this probably means you pointed it to a config file path that didn't exist.
(You can also hand fileConfig() a file-like object - it checks for a readline() method - so a StringIO wrapping the configuration text works; since py3.4 it additionally accepts a RawConfigParser instance)
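A sketch of feeding fileConfig() configuration from an in-memory string via a file-like object (this works in current Python 3, which checks for a readline() method; the config content itself is just the earlier example, trimmed down):

```python
import logging
import logging.config
from io import StringIO

CONF = """
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(name)s %(levelname)s %(message)s
"""

logging.config.fileConfig(StringIO(CONF), disable_existing_loggers=False)
logging.getLogger().info('configured from a string')
```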
Logging exceptions
Basically, in an except block you can ask logging to fetch and dump the traceback:
logger.error('error handling file', exc_info=True)
And it seems like this is basically equivalent to:
logger.exception('error handling file')
Because I didn't know this I had previously done this myself:
exc_info = sys.exc_info()   # (exceptiontype, exceptionvalue, tracebackobject) tuple
if exc_info[0] is not None:
    msg += '\n\n' + '\n'.join(traceback.format_exception(*exc_info))
I guess that's basically what it does too.
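A small runnable version of the above (the logger name is illustrative):

```python
import logging

logging.basicConfig()
log = logging.getLogger('demo.exc')   # illustrative name

try:
    1 / 0
except ZeroDivisionError:
    # exc_info=True makes logging fetch and append the current traceback
    log.error('error handling file', exc_info=True)
    # logger.exception() does the same thing, always at ERROR level
    log.exception('error handling file')
```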
Handlers
Existing handler classes include:
- FileHandler: Basic file appender
- RotatingFileHandler: writes to a configurable number of files of a configurable size, rotating among them
- TimedRotatingFileHandler: Like the last, but rotates based on time and not size
- WatchedFileHandler (≥ Py2.6): Re-opens the file when it is changed by something else. For example useful when you log to a file that may be rotated by something else.
- Not available on Windows
- SocketHandler: TCP logger
- DatagramHandler: UDP logger
- HTTPHandler: Log to some web server (GET or POST)
- SMTPHandler: Send mail
- MemoryHandler: Allows you to buffer many records and have them emptied into a target handler in chunks
- SysLogHandler: unix syslog daemon (via a UDP socket, or a local socket path such as /dev/log)
- NTEventLogHandler: local Win NT/2000/XP event log
- StreamHandler: writes to file-like object -- anything with write() and flush() (e.g. sys.stderr, or your own object)
- NullHandler: used e.g. when you want to silence a library's logging
Notes:
- SysLogHandler
- defaults to UDP to localhost. If that doesn't work, try specifying address='/dev/log'
- you can use facility, but not ident (at least not before py3.3)
- https://stackoverflow.com/questions/4261253/how-do-i-set-the-ident-string-when-using-logging-sysloghandler-in-python-2-6
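Of that list, MemoryHandler is perhaps the least obvious; a sketch of buffering records and handing them to a target handler (names illustrative):

```python
import logging
import logging.handlers

target = logging.StreamHandler()    # where buffered records eventually go

buffered = logging.handlers.MemoryHandler(
    capacity=100,                   # flush once this many records are buffered...
    flushLevel=logging.ERROR,       # ...or immediately when one is ERROR or worse
    target=target,
)

log = logging.getLogger('demo.buffered')   # illustrative name
log.addHandler(buffered)
log.propagate = False                      # keep the example self-contained

log.warning('held in memory until a flush trigger')
buffered.flush()                    # explicit flush; close() also flushes by default
```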
Formatters
Formatters take a LogRecord, and do a msg % args type thing. The basic one is something like:
"%(asctime)s %(name)-19s %(levelname)-5s - %(message)s"
...which, while it doesn't display everything that can be present in a LogRecord, serves most immediate needs.
You may wish to think about parseability, for example by using an unusual divider.
The members of LogRecord are mostly interesting when writing custom handlers.
A LogRecord's most interesting members:
- name - logger name
- levelno - numeric level
- levelname - name of level
- pathname - absolute pathname of the source file
- lineno - line number in that file with the log call
- msg - message (format string)
- args - args from logging call. A tuple that combines with msg
Also:
- exc_info - exception 3-tuple (or None)
- func (optional, new since py2.5) - the function from which the logging call was made.
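To illustrate using those members directly, a sketch of a custom Formatter (the class and names are illustrative):

```python
import logging

class PathFormatter(logging.Formatter):
    """Illustrative: formats a record using some of the members listed above."""
    def format(self, record):
        # record.getMessage() applies record.args to record.msg
        return '%s:%d [%s/%s] %s' % (record.pathname, record.lineno,
                                     record.name, record.levelname,
                                     record.getMessage())

handler = logging.StreamHandler()
handler.setFormatter(PathFormatter())
log = logging.getLogger('demo.fmt')   # illustrative name
log.addHandler(handler)
log.propagate = False
log.warning('value was %r', 42)
```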
Fiddling with formatting
Log calls accept additional arguments into **kwargs. The logging module looks for two keys in that:
- exc_info, which is added to the message
- extra - updates LogRecord's __dict__ (avoid overwriting the existing keys)
The latter is apparently meant for minor customisation - the ability to handle extra data in your handlers.
The documentation example adds keyword arguments via the formatter, which is probably only convenient when you use this widely through your app (more specifically, in all logging calls to the handler that does this), because the keys you use in your formatter must be handed in via the extra= dict or you'll get a KeyError. The notes on the documentation example below point out that it is only for special circumstances, for example distinguishing between messages from multiple threads.
FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = {'clientip':'192.168.0.1','user':'fbloggs'}
logging.warning("Protocol problem: %s", "connection reset", extra=d)
Which yields:
2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset