Python decorator notes

From Helpful
Revision as of 12:12, 17 May 2024 by Helpful (talk | contribs)


Attempted introduction

In Python, decorators are syntactic sugar with a specific purpose: doing something to the function they decorate.


What?

For a good part, the idea is that:

@decoratorfunctionname
def foo():
    pass

...is syntactic sugar for:

def foo():
    pass
foo = decoratorfunctionname(foo)


Why that is useful varies.

How clean any one use is also varies.


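More technically, a decorator is a callable that takes a function and returns whatever the function's name will then be bound to: usually a function, either the same one or a new one that wraps it.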


Since a decorator sits at the same scope as the function it decorates, it is applied at the time that function definition is evaluated.

So for a module-level function, just once, when the module is loaded.
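A small sketch to make the timing visible. The names announce and calls are made up for this illustration; the decorator only records that it ran and hands back the function unchanged.

```python
calls = []

def announce(func):
    # Runs once, when the def statement below is evaluated,
    # not on each call to the decorated function.
    calls.append("decorated " + func.__name__)
    return func

@announce
def foo():
    return "foo result"

# Calling foo() repeatedly does not run announce() again.
foo()
foo()
print(calls)   # a single entry, added at definition time
```

So any work done in the decorator body itself is paid once; only what you put in a returned wrapper is paid per call.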



Why?

The point is often to do something useful to that function, or around that function.

And, at the same time, still be able to call it by the same name afterwards (the above rebinds the name foo to whatever the decorator returned for the old foo, and most of the time that old function stays in play).


Again, why?

Because they are evaluated once, at the scope of the definition, a common use is to wrap some extra logic around a select few functions (without e.g. using OO to do so), at little execution cost.



Again, why?

Try some examples:

Examples

Decoration time only

The most minimal effect is only altering the function object.


The following is taken from CherryPy:

def expose(f):
   f.exposed = True
   return f   #return the same function object we got


#Used like:
@expose
def foobar():
   pass

All this does is take the foobar function, treat it as an object, and set an attribute on it.

In this case it is equivalent to defining the function and manually doing foobar.exposed = True at the same scope.

(There is an argument to be made that doing that is a lot more readable for everyone, not only for those unfamiliar with this particular framework or not good at traversing docs, because the docstring isn't immediately helpful either. The actual use here is that CherryPy can check for that attribute later with a simple test, and that you can mark functions with a short bit of code. 'Exposed' happens to mean "this is allowed to act as a request handler". ...I think.)
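To illustrate the "check for that attribute later" part, here is a rough sketch. The dispatch function is hypothetical and is not how CherryPy actually implements its lookup; it only shows the kind of simple test that a marker attribute allows.

```python
def expose(f):
    f.exposed = True
    return f

@expose
def index():
    return "hello"

def internal_helper():
    return "not meant for outside callers"

def dispatch(handler):
    # Hypothetical framework-side check (an assumption for illustration):
    # only call functions that were marked by the decorator.
    if getattr(handler, "exposed", False):
        return handler()
    raise PermissionError("%s is not exposed" % handler.__name__)

print(dispatch(index))
```

Handing internal_helper to dispatch() raises PermissionError, because nothing ever set the attribute on it.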


Upsides:

  • it's brief, and usually fairly clear what the decorator does, by its name
  • it adds no overhead to each use (as some other uses of decorators do)

Limitations:

  • It's code that isn't code -- you have no idea what that decorator does (or any) until you find its implementation

Wrapper function

More commonly, the decorator returns a new function, which adds some logic, but mostly just calls the function it originally got.


For example:

def decorator(decoratedfunc):
    def wrapper_function():
        print( "before the call" )
        wrappedretval=decoratedfunc()
        print( "after the call" )
        return wrappedretval   # pass the original return value along
    return wrapper_function

@decorator
def do_nothing():
   pass

do_nothing()
do_nothing()

Now those prints happen around every call to do_nothing(), because decoration actually changed the function that do_nothing is bound to - it's now that wrapper_function.

Note that in this case, the decorator applies only to functions that take no arguments. This is often not very useful.

You'd probably use *args and **kwargs to fix that; more on that later.


So a more useful variant may be:

import time

def timer(decoratedfunc):
    ' timer decorator '
    def timer_wrapper(*args, **kwargs):
        ' the function that wraps around the real call '
        start = time.time()
        # Call the original function, save its return value
        wrappedretval = decoratedfunc( *args,**kwargs ) 
        end = time.time()
        print( "Call to %s() took %.4f seconds"%(decoratedfunc.__name__, end-start) )
        return wrappedretval
    return timer_wrapper   #return the wrapper function we just defined. 


@timer
def test():
    for x in range(1000000):
        pass
    #time.sleep(1.3)
    return "Done"

test()

Notes:

  • decoration mostly just applies test=timer(test), which returns timer_wrapper (plus it technically uses a closure construction to work: timer_wrapper remembers decoratedfunc).
  • the inner function's name is more or less a dummy.
  • You may want your wrapper functions to accept *args, **kwargs so that it can decorate any function without having to copy their prototype


Upsides:

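  • can wrap: run arbitrary logic before and after each call, and inspect or alter the arguments and return value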

Downsides:

  • more calls. Sometimes that's the point, but sometimes that would be entirely avoidable

Other ideas

  • log a function's arguments
  • make things synchronize on a lock
  • introduce stricter run-time typechecking on arguments
  • wrap a call in one that creates, then cleans, a temporary directory. (the sort of code you want to write once, write well, and then black-box away)
  • 'create a database connection if this function was not handed one, and close it afterwards'
  • object members, e.g. to test for their presence, value, or whatnot
  • add docstring-esque metadata like author and version
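As a sketch of the first idea in that list, logging a function's arguments: log_args and add are illustrative names, and printing stands in for whatever logging you would actually use.

```python
import functools

def log_args(func):
    @functools.wraps(func)          # keep func's name and docstring on the wrapper
    def wrapper(*args, **kwargs):
        print("calling %s args=%r kwargs=%r" % (func.__name__, args, kwargs))
        return func(*args, **kwargs)
    return wrapper

@log_args
def add(a, b=1):
    return a + b

result = add(2, b=3)   # prints the call details, then returns the sum
```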


Built-in decorators

@property

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Consider setting a value on an object.

obj.val = 4

Consider wanting some encapsulation, and possibly some checking code of values going in and out.

You might make functions like:

class Thing:
    def __init__(self, val=None):
        self._val = val

    def set_val(self, val):
        print("set val")
        self._val = val

    def get_val(self):
        print("get val")
        return self._val

obj = Thing()
obj.set_val(4) 
obj.get_val()

This means the way in and out is now via code - in this example doing nothing, in real cases probably adding checks.

...but it also makes for more keystrokes every time you want to use it (and there is some temptation to just access _val directly). (side note: doing it with a class isn't the only way, but mostly avoids a few more keystrokes on use)

Turns out you can get something that looks like a value but is actually going through functions. There are various ways of doing and writing this. One of them looks like:

class Thing:
    def __init__(self, val=None):
        self._val = val
 
    @property
    def val(self):
        print("get val")
        return self._val
 
    @val.setter
    def val(self, val):
        print("set val")
        self._val = val

obj = Thing()
obj.val = 5 
obj.val


That's the same function implementations under the covers as the previous section -- but it now looks like you're just altering an attribute.


That syntax is a little magical, yes.

There are cleaner-but-longer equivalents.
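One such longer form, as a sketch, calls property() explicitly with the getter and setter. The negative-value check is an assumption added here only to show where validation would go.

```python
class Thing:
    def __init__(self, val=None):
        self._val = val

    def get_val(self):
        return self._val

    def set_val(self, val):
        # Illustrative check, not part of the earlier example
        if val is not None and val < 0:
            raise ValueError("val must not be negative")
        self._val = val

    # Same effect as @property plus @val.setter, written out
    val = property(get_val, set_val)

obj = Thing()
obj.val = 5        # goes through set_val
print(obj.val)     # goes through get_val
```

Here the explicit get_val and set_val stay callable too, which the decorator spelling avoids.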


Why do this? / why not do this?

Most tutorials out there give toy examples, without any reasoning why or when this is a good or bad idea in the real world.


Upsides:

  • you can add checks
  • Yes, if you have a lot of code that assumes it's just a variable, then you can make a drop-in replacement.
but if you're changing the interface/behaviour of something so shared or central, you should be thinking a little harder


Limitations

  • you are breaking a basic abstraction, and is "I don't like pressing autocomplete in my editor" really a good enough reason for that?
  • it is less transparent to the person maintaining this code whether this is just a dereferenced attribute, or a complete ghost implementation in varied functions
and if they don't stick to basic logic within their own instance, all bets are off

@staticmethod, @classmethod

For context,

  • Functions defined on a class are instance methods by default (and usually), meant to work on an instance of that same class
the definition of an instance method must accept an instance as the first argument (some other languages have it implied; in Python you write it out, which can be a little clearer when you also have the following two around)
(when you call an instance method, Python figures out how to hand the instance in that way)


That is not the only thing you ever want to do with a function on a class. The two decorators mentioned give you two more:

  • static methods are a function stuck on a class, but do not take an instance
nor can they access the class definition directly
these are often used to stick some utility functions on a class rather than just near it in a module
In particular when these utility functions are only relevant to that one class, and e.g. whether the arguments would be instances of this class, or whatever else, is entirely up to you.
It's organization more than structuring
Just how necessary that is to cleanliness may depend on how much you split out your modules


  • class methods are defined on a class, and do not take an instance, but do get a reference to that class, meaning they can fetch class variables (note: not instance variables)
...and could potentially alter that class
...but more frequently seem used for things like factories?(verify)
from the context of instance methods, self.__class__ often lets you cheat your way out of needing these (but it's not as clean)
Not used much unless you do some meta-modelling, or need a reference to the class but not an instance.
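A sketch showing all three kinds side by side. The Temperature class and its method names are invented for the illustration.

```python
class Temperature:
    unit = "C"   # class variable

    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        # instance method: gets the instance as self
        return "%.1f %s" % (self.degrees, self.unit)

    @staticmethod
    def fahrenheit_to_celsius(f):
        # static method: gets neither an instance nor the class
        return (f - 32) * 5 / 9

    @classmethod
    def from_fahrenheit(cls, f):
        # class method: gets the class, used here as a factory
        return cls(cls.fahrenheit_to_celsius(f))

t = Temperature.from_fahrenheit(212)
print(t.describe())
```

Because from_fahrenheit uses cls rather than naming Temperature, a subclass calling it gets an instance of the subclass.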




Less visible decoration

Since the very point is that you return a different function, this function will have its own context, name, docstring and such -- so won't report back as nicely when you do help() or generate documentation.

To make the timer wrapper above more transparent that way, you might make the last few lines of timer():

  timer_wrapper.__name__   = decoratedfunc.__name__
  timer_wrapper.__doc__    = decoratedfunc.__doc__
  timer_wrapper.__dict__   = decoratedfunc.__dict__
  timer_wrapper.__module__ = decoratedfunc.__module__
  return timer_wrapper

Since Python 2.5, functools has a (meta-)decorator to make this easier: functools.wraps, which you apply to the wrapper function inside your decorator.
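A sketch of the earlier timer decorator using functools.wraps, which copies the name, docstring, module and attribute dict over to the wrapper for you. The docstring on test() is added here only so there is something to check.

```python
import functools
import time

def timer(decoratedfunc):
    @functools.wraps(decoratedfunc)   # copy __name__, __doc__, etc. onto the wrapper
    def timer_wrapper(*args, **kwargs):
        start = time.time()
        wrappedretval = decoratedfunc(*args, **kwargs)
        print("Call to %s() took %.4f seconds" % (decoratedfunc.__name__, time.time() - start))
        return wrappedretval
    return timer_wrapper

@timer
def test():
    'docstring of test'
    return "Done"

print(test.__name__, test.__doc__)   # reports the original function's metadata
```

In Python 3, wraps also sets test.__wrapped__ to the original function, which some introspection tools follow.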

Syntax details / examples

Arbitrary arguments

Using *args and/or **kwargs is a general way to accept as well as apply positional arguments (accepted as a tuple) and keyword arguments (accepted as a dict), without an actual function prototype.

To the decorated function

You can also call a function with *args and **kwargs (note: without the asterisks they would just be two positional arguments, a tuple and a dict), which unpacks them into separate positional and keyword arguments, much as Python does internally to make arguments to a function work.

def dec(f):
    def wrapper(*args, **kwargs):
        print(repr((args, kwargs)))   # show what was handed in
        ret = f(*args, **kwargs)
        return ret
    return wrapper

@dec
def foo(a, b='yay'):
    print("foo(%s,%s)" % (a, b))


foo('gir')

foo('gir', 'monkey')

foo('gir', b='monkey')  # Note: not the same as the last, where monkey was positional

This tends to be useful when you want to wrap arbitrary functions, not just those with a particular prototype.


You can manipulate kwargs (and args) if you want to, but note that it is fairly easy to cause problems, such as

  • argument count mismatches, and "...got an unexpected keyword argument..." errors if the function wasn't expecting some keyword.
  • python lets you hand in the same thing with a keyword and positional argument.
that means if you add something in a keyword which was already addressed positionally, this is now a conflict (it will complain it "got multiple values for argument", or in Python 2 "got multiple values for keyword argument")
  • decorator injection of *args,**kwargs is mostly invisible, so it hurts the transparency of the decorator, and the ease of understanding during debugging
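A sketch of the positional/keyword conflict. force_b is a made-up decorator that always injects a keyword argument named b.

```python
def force_b(f):
    def wrapper(*args, **kwargs):
        kwargs['b'] = 'injected'   # always hand in b as a keyword
        return f(*args, **kwargs)
    return wrapper

@force_b
def foo(a, b='yay'):
    return (a, b)

ok = foo('gir')   # fine: b was not given positionally

try:
    foo('gir', 'monkey')   # b given positionally AND injected as keyword
    error = None
except TypeError as e:
    error = str(e)

print(ok, error)
```

The second call fails with a TypeError from inside the wrapper, which is the sort of thing that is confusing to debug when you did not write the decorator.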

To the decorator

Code taken from PEP 318:

def attrs(**kwds):                   # takes only keyword arguments
    def decorate(f):
        for k in kwds:
            setattr(f, k, kwds[k])   # and sets those as attributes
        return f
    return decorate

@attrs(versionadded="2.2",
       author="Guido van Rossum")
def mymethod(f):
    pass
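To see what that does: attrs(...) is called first and returns decorate, and decorate is then what gets applied to the function. A sketch that repeats the example and inspects the result:

```python
def attrs(**kwds):
    def decorate(f):
        for k in kwds:
            setattr(f, k, kwds[k])
        return f
    return decorate

@attrs(versionadded="2.2", author="Guido van Rossum")
def mymethod(f):
    pass

# The decoration above amounts to:
#   mymethod = attrs(versionadded="2.2", author="Guido van Rossum")(mymethod)
print(mymethod.versionadded, mymethod.author)
```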


Multiple decorators

You can chain decorators. There is no special treatment, it just works out as wrapped calls, with the earlier (topmost) decorator being the outermost.

Example:

@timer
@synchronized(default_lock)
def test():
    print('foo')


Docstrings

If you use the inspect and/or pydoc modules to generate help for your modules, you may notice that decorated functions don't show up.

One solution would be to add:

setattr(wrapperfunc,'__doc__', getattr(origfunc,'__doc__'))

...to each decorator. However, doc generation will still use the signature of the wrapper function, which is particularly useless if you use *args,**kwargs style wrappers.


My own solution is to setattr the whole original function instead of __doc__ specifically, onto some name (e.g., '_helpfunc') on the wrapping/decorator function, and to slightly customize the help generation to replace the function-to-be-explained with the one from that attribute, if that member exists, i.e. adding:

_helpfunc = getattr(func,'_helpfunc',None)   # the original function, if a decorator stored it
if _helpfunc:
    func = _helpfunc

This should be fairly simple to inject. If you use the built-in pydoc, and then the HTML document generator which takes a module, you may wish to replace that function; it's only a fairly thin wrapper around other pydoc/inspect functions in the first place, so you can somewhat customize your help formatting at the same time. (TODO: give example)

Experiments

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Some examples (my own experiments and mostly untested, so they may be bad):

See the timer above.


Decorators that take arguments are somewhat different. I need to explain these, but for now look at the following example:

from threading import Lock
default_lock = Lock()

def synchronized(lock=default_lock):
    ''' Decorator to do java-esque synchronization 
        (except you synch on a lock instead of an arbitrary object) 
        I should check whether this is even correct.

       The default argument is probably practical, though in a 
       need-to-know-what-it-means sense. You may want to force someone
       to create locks and hand them along explicitly.
    '''
    def decorator(func): # a temporary/dummy name, you return the function
        def wrapper(*args,**kwargs):
            lock.acquire()
            try:
                ret=func(*args, **kwargs)
            finally:
                lock.release()
            return ret
        return wrapper
    return decorator

#You would use it like:
@synchronized()
def test():
    print('foo')
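The reason decorators with arguments look different: the expression after the @ is evaluated first, and whatever it returns is then used as the actual decorator. So you need one more level of nesting. A sketch of the same idea using the lock as a context manager; counter_lock, counts and increment are illustrative names.

```python
import threading

def synchronized(lock):
    # Called with the argument; returns the actual decorator
    def decorator(func):
        def wrapper(*args, **kwargs):
            with lock:   # acquire, and release even if func raises
                return func(*args, **kwargs)
        return wrapper
    return decorator

counter_lock = threading.Lock()
counts = {"n": 0}

@synchronized(counter_lock)   # same as: increment = synchronized(counter_lock)(increment)
def increment():
    counts["n"] += 1

threads = [threading.Thread(target=increment) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts["n"])
```

This is also why the version above needs the parentheses in @synchronized() even when relying on the default lock: without them, the function itself would be handed in as the lock argument.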