Python notes - syntax and language


On language, setup, environment

(Major) Python implementations

CPython is the C implementation of Python. It is the implementation usually used as a system Python, and is also the reference implementation of Python.

Jython implements python in Java. Apparently it is only slightly slower than CPython, and it brings in the java standard library to be used from python code, though you lose C extensions(verify).

IronPython compiles python to IL, to run it on the .NET VM. It performs similarly to CPython (some things are slower, a few things faster, even) but like any other .NET language, you get .NET interaction.

You lose the direct use of C extensions (unless you have fun with C++/CLI), though .NET itself often has some other library to the same effect.

Python for .NET is different from IronPython in that it does not produce IL or run on the .NET VM, but is actually a managed C interface to CPython(verify) (which also seems to work on Mono).

While somewhat hairier than IronPython, it means you can continue to use C extensions, as well as interact with .NET libraries; the .NET library can be directly imported, and you can load assemblies.

There is also PyPy [1] [2], which is an implementation of python in python. It seems this was originally for language hacking and such (since it's easier to experiment with an implementation written in Python than with one written in C), but it now seems to be a good JIT compiler (relying for a good part on RPython, a subset of Python that can be statically compiled) that can give speed improvements similar to the now-aging psyco.

Help / documentation

A pre-code(verify) unassigned string at module, class or function level is interpreted as a docstring (stored in the __doc__ attribute of that module, class or function).

Docstrings will show up in documentation that can be automatically generated based on just about anything. For example:

>>> class a:
...   "useless class"
...   def b():
...     "method b does nothing"
...     pass
>>> help(a)
Help on class a in module __main__:
class a
 |  useless class
 |  Methods defined here:
 |  b()
 |      method b does nothing

This works on most builtins and system modules:

>>> help(id)
Help on built-in function id in module __builtin__:
    id(object) -> integer
    Return the identity of an object.  This is guaranteed to be unique among
    simultaneously existing objects.  (Hint: it's the object's memory address.)

...sometimes providing nice overviews. For example, help(re) includes:

compile(pattern, flags=0)
    Compile a regular expression pattern, returning a pattern object.
escape(pattern)
    Escape all non-alphanumeric characters in pattern.
findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.
    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.
    Empty matches are included in the result.
finditer(pattern, string, flags=0)
    Return an iterator over all non-overlapping matches in the
    string.  For each match, the iterator returns a match object.

There are automatic documentation generators, including:

  • epydoc (HTML, result looks like [5], rather like the Java API docs)
  • Docutils (HTML, LaTeX, more?)
  • HappyDoc (HTML, XML, SGML, PDF)
  • a filter for doxygen (not as clever)
  • ROBODoc?
  • TwinText?
  • Natural Docs?


It seems best to use only spaces, and never tabs - mixing the two often leads to cases that may look right but be wrong, or the other way around.

Set any non/semi-python-aware editors to insert spaces for tabs.

If you don't have personal preferences, you may want to choose for 4-space indents, if only because it's common and so copying in code is simpler.

In emacs, you can do these two things by changing your configuration to include:

(setq-default indent-tabs-mode nil)
(setq-default py-indent-offset 4)
; or even a default for all modes, not just python:
(setq-default tab-width 4)
You can make emacs convert tabs to spaces (in the current region) using
M-x untabify

For large-scale reindenting, you might be interested in something like


You'll see the word 'callable' where you might expect 'function', because you can call a function, method, or class (or, technically, type).

More specifically, you can call any instance with a __call__ method.

In many situations where you could pass a function, you can pass any callable, because most of the time all the backing code does is call the object.

To test whether something can be called, you could use callable() (a built-in).

If you wish to test for more specific cases (callable class? function? method?), you can use the inspect module (see its help() for more details than some html documentation out there seems to give).
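As a minimal sketch of the above (the Greeter class and shout function are made-up names for illustration), callable() accepts functions, classes, and instances that define __call__, while inspect tells the specific cases apart:

```python
import inspect

class Greeter(object):
    """Instances are callable because the class defines __call__."""
    def __call__(self, name):
        return "hello, " + name

def shout(name):
    return name.upper()

greet = Greeter()

# Functions, classes, and instances with __call__ all count as callable
checks = [callable(shout), callable(Greeter), callable(greet), callable(42)]
print(checks)

# inspect distinguishes the more specific cases
print(inspect.isfunction(shout), inspect.isclass(Greeter), inspect.isfunction(greet))
```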

singularity on top of immutability

Identity is compared with is, which amounts to comparing the results of the built-in id().

Some things in python are singular (on top of being immutable), by design. You could say this messes with the identity abstraction, but is primarily used to make life simpler, and generally does.

For example, you can test against types and None as if they are values, meaning you can use either is or == without one of them meaning something subtly but fatally different. In practice this seems better than having to know all the peculiarities of the typing system (if only because we tend to have to know several languages').

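Numbers are immutable, but only sometimes singular: CPython caches small integers (-5 through 256), so is happens to work for those, but other numbers are not guaranteed to be the same object. Compare numbers with ==.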

Strings are immutable but not singular -- although there are cases where they seem to act that way, for example in string literals (are there further details?(verify)). For example:

a = 'foo'
b = 'foot'[:3]
c = 'foo'
assert(a is c)
assert(a is not b)

Calling superclass methods, super()

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, or tell me)

Firstly, the standard remark: if you're making inheritance diamonds, this is complexity any way you twist it, so is it necessary rather than happy-go-lucky class modelling?

If you use super(), it should be used consistently, when you know the potential problems and can explain to other people why it won't fail. (Read the two things linked to, or something like it)

Many argue that it's more understandable and less error-prone to handle superclass calls explicitly. Since superclassing is effectively part of a class's external interface anyway (and so is super, if you use it), you might as well be explicit, rather than have it be hidden by implied semantics.

While more verbose than super(), it's easier to follow, less magical, and arguably less fragile for later changes, and mistakes are probably easier to spot because they no longer come from implicit behaviour. (You can argue about the fragility: yes, it will cause errors quickly when you change arguments, but that's arguably preferable over the alternative.)

One assumption here is that class inheritance is used for eliminating redundancy in your own codebase, not for flexibility.

However, when writing things like mixins (or abstract classes or interfaces), you may still need to know all about super()

Example of explicit superclass calls:

class Bee(object):
    def __init__(self):
        print( "<Bee/>" )
class SpecialBee(Bee):
    def __init__(self):
        print( "<SpecialBee>" )
        Bee.__init__(self)            # explicit superclass call
        print( "</SpecialBee>" )
class VerySpecialBee(SpecialBee):
    def __init__(self):
        print( "<VerySpecialBee>" )
        SpecialBee.__init__(self)     # explicit superclass call
        print( "</VerySpecialBee>" )

This goes for any method (the constructor isn't really a special case), but it's a common example of why arguments may get in the way of super() being particularly useful.
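For comparison, here is a sketch of the same hierarchy using super(). It assumes Python 3 (zero-argument super()), and the log parameter is an addition for illustration, so that the nesting can be inspected afterwards. It also hints at the argument issue: every __init__ in the chain has to accept the same arguments.

```python
class Bee(object):
    def __init__(self, log):
        log.append("<Bee/>")

class SpecialBee(Bee):
    def __init__(self, log):
        log.append("<SpecialBee>")
        super().__init__(log)      # next class in the MRO, here Bee
        log.append("</SpecialBee>")

class VerySpecialBee(SpecialBee):
    def __init__(self, log):
        log.append("<VerySpecialBee>")
        super().__init__(log)      # next class in the MRO, here SpecialBee
        log.append("</VerySpecialBee>")

log = []
VerySpecialBee(log)
print("".join(log))
```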


'call at python exit'

Use the atexit module.

Avoid assigning a callable to sys.exitfunc yourself (this exists in Python 2 only; it was removed in Python 3), since you may be effectively removing something already set there. You could make it a function that also calls what was previously set, but there are sometimes hairy details to that, like how you deal with exceptions(verify).

Note that there is never a hard guarantee that this code will get run, considering things like segfaults - which Python itself should be pretty safe from, but isn't too hard to create in a C extension.
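A minimal sketch of atexit use (flush_report and its argument are hypothetical names for illustration). Extra arguments given to register() are passed to the function when the interpreter exits normally:

```python
import atexit

def flush_report(label):
    # Hypothetical cleanup; a real program might close files or connections here
    print("exiting: " + label)

# register() returns the function it was given, so it also works as a decorator
registered = atexit.register(flush_report, "normal interpreter shutdown")
```

In Python 3, atexit.unregister(flush_report) removes it again.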


Built-ins are things accessible without specific imports. The following are the 2.4 built-ins, a mix of types and functions, roughly grouped by purpose.

  • dir, help

  • str, unicode (and their virtual superclass, basestring)
  • oct, hex, ord, chr, unichr
  • int, long, float, complex
  • abs, round, divmod, min, max, pow

  • tuple, list
  • len, sum
  • filter, reduce, map, apply
  • zip
  • iter, enumerate
  • reversed, sorted
  • cmp
  • range, xrange
  • dict, intern
  • set, frozenset
  • bool
  • coerce
  • slice (used only for extended slicing - e.g. [10:0:-2])
  • buffer (a hackish type convenient to CPython extensions and some IO)

  • object
  • hash, id
  • repr: should give a description useful in debugging: a short, unambiguous description of the object that lets you distinguish it from others (and gives some more information if it can)
  • __str__: supports str(). Less formal. May be the same thing as repr.

  • hasattr, getattr, setattr, delattr
  • type, isinstance, issubclass (variations on simple comparison / subclass test)(verify)
type(x) is str
is functionally the same as
isinstance(x, str)
for plain strings, but isinstance is a little more flexible in that it also accepts instances of subclasses
  • staticmethod, classmethod
  • super
  • property

  • exception

  • callable
  • locals, globals
  • vars
  • eval, compile
  • execfile
  • __import__: the function that the import statement uses
  • reload

  • file, open
  • input, raw_input
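The type() versus isinstance() difference mentioned in the list above can be sketched like this (Tag is a made-up str subclass, only there to show the difference):

```python
class Tag(str):
    """A hypothetical str subclass."""

plain = "abc"
tagged = Tag("abc")

exact_plain = type(plain) is str          # exact type match
exact_tagged = type(tagged) is str        # the type is Tag, not str
inst_tagged = isinstance(tagged, str)     # subclasses count
sub = issubclass(Tag, str)
print(exact_plain, exact_tagged, inst_tagged, sub)
```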

Shallow and deep copy

General shallow/deep copies are possible (on top of the basic reference assignment).

The following demonstration uses lists as a container, but this also applies to objects. (Objects and mutable structures like lists themselves contain references, so the concept of copying them is ambiguous, which is why there is a distinction between shallow and deep copying.)

>>> from copy import copy     #shallow copy - but note there are easier ways for lists
>>> from copy import deepcopy
>>> a = [object(),object()]     #original list
>>> b = copy(a)
>>> c = deepcopy(a)
>>> a
[<object object at 0xb7d21448>, <object object at 0xb7d21468>]
>>> b
[<object object at 0xb7d21448>, <object object at 0xb7d21468>]
>>> c
[<object object at 0xb7d21450>, <object object at 0xb7d21458>]

The shallow copy, b, is a new list object (id(a)!=id(b)), into which references to the objects in the old collection are inserted.

The deep copy, c, is a new list object but also creates copies of the contained objects to insert into that new container.

With objects, or structures that contain objects, what you often mean to do is make a deep copy.

Note that deep copying only works when creating these objects has no peculiar side effects, and does not rely on administrative data or object references that would not carry over correctly to the copy.

Such issues limit deep copy in any language. There are usually partial fixes, often in the form of some way to optionally override deep-copy behaviour with your own functionality via an interface. Note that python's deepcopy does avoid circular recursion problems.
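A small sketch of the difference, using nested lists so that the effect is visible in values rather than in addresses. The self-referencing list at the end illustrates the circular case:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)       # new outer list, same inner lists
deep = copy.deepcopy(original)      # new outer list and new inner lists

original[0].append(99)
print(shallow[0])   # the shallow copy shares the inner list, so it sees the change
print(deep[0])      # the deep copy has its own inner list

# deepcopy copes with self-referencing structures
loop = []
loop.append(loop)
loop_copy = copy.deepcopy(loop)
```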

String stuff

String formatting

% with a tuple

Those coming from C will probably appreciate the % operator, which acts like sprintf() and mostly matches its format strings (it omits p, since there are no pointers, and adds r for repr()).


"%d %5.1f"%(1,2)   ==  '1   2.0'

% with a mapping

If you pass it a mapping (dict or similar) you can access them by name rather than position:

"%(s)s  %(foo)07o %(bar)5.1f"%{   's':'yay', 'foo':1, 'bar':2   }       == 'yay  0000001   2.0'


format() seems to understand...

  • positional and name arguments
  • the same conversion specifiers (the type letter, which effectively defaults to s)

but does everything else in a more flexible style.

There is a decent introduction in

Some examples:

{} enumerates positionally by default, so e.g.
'{} {}'.format( 4,8 ) == '4 8'
You can explicitly index
'{1} {0}'.format(4,8) == '8 4'
You can use named indexes
'{foo} {bar}'.format(foo=1,bar=2)
Which you can use with dicts like
data = {'foo':1, 'bar':2}
'{foo} {bar}'.format( **data )
also consider the ability to do:
'{data[foo]} {data[bar]}'.format( data={'foo':1, 'bar':2} )
It also understands strftime-style datetime formatting:

'{:%Y-%m-%d %H:%M}'.format(datetime(2001, 2, 3, 4, 5))

You can pass in parameters into the formatting, by nesting style: (this would be nasty and confusing to do with %)
'{:{width}.{prec}}'.format( 3.14159265, width=10, prec=3)

Note that format() also effectively allows you to make formatting functions, like:

make_link = '<a href="{url}">{url}</a>'.format

which you can later call with the named values, e.g. make_link(url=some_url).


(this particular example doesn't escape the URL properly, but you get the idea)
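A sketch of a variant that does some escaping, plus the nested width/precision form from earlier. It assumes Python 3's html module, and html.escape() only covers the HTML side, not URL encoding, so treat it as an illustration rather than a complete solution:

```python
import html

def make_link(url):
    # Escapes &, <, > and quotes for use inside HTML; does not URL-encode
    return '<a href="{url}">{url}</a>'.format(url=html.escape(url, quote=True))

link = make_link("http://example.com/?a=1&b=2")

# Nested fields: width and precision are filled in from keyword arguments
padded = '{:{width}.{prec}f}'.format(3.14159265, width=10, prec=3)

print(link)
print(repr(padded))
```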

f-string formatting

PEP 498 (implemented since py3.6) adds f-string formatting, which amounts to format() style, except that the expressions in braces are evaluated at runtime in the current scope, rather than looked up in arguments you pass.

This often makes it less typing and/or shorter and/or clearer.


foo = 1
bar = 2
print( f"{foo:10d} {bar}" )   # d, not s: the s format code raises ValueError for an int

Type stuff

Type annotation

Function annotations have been part of the syntax since Python 3.0 (PEP 3107). Around Python 3.5 and 3.6, we got a helping module (typing, PEP 484) and a syntax to annotate variables (PEP 526), though a bunch more PEPs are relevant.

It looks like

def greeting(name: str) -> str:
    return 'Hello ' + name

You can also type variables, like

i:int = 1

In practice, this is type annotation, not type checking - it has absolutely no effect at runtime.

It's basically a comment, but one more parseable by IDEs, so that they can show it to programmers. That is useful, though without checks it might actually be lying.

And yes, MyPy is a tool to check code, but this is effectively only really used for tests, to check for relatively static errors.

It is dangerous to consider this type checking.

Even if you are using mypy, there are a number of things you can do at runtime that mypy cannot check - fundamentally.

If you want the safety of a statically typed language, use a statically typed language.
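To illustrate that annotations are not enforced at runtime, here is a minimal sketch (double is a made-up function). The annotations are stored on the function, and nothing more:

```python
def double(x: int) -> int:
    return x * 2

# Nothing stops a caller from ignoring the annotation at runtime
result = double("ab")
print(result)

# The annotations are just data attached to the function
print(double.__annotations__)
```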

typing module

ctypes module

Functional-like things

Note on closures

See Closures for the basic concept. They were apparently introduced in python around 2.2.

Note that in python 2, closures are read-only: trying to assign to a closured variable actually creates a local variable with the same name (python 3 added the nonlocal statement to rebind the enclosing variable instead). This works on function level, so even:

def f():
   x = 1
   def g():
      y=x   #...with the idea that this would create a local y
      x=y   #   and this a local x...
      print( x )
   g()      # raises UnboundLocalError

...won't work; function variables are declared at function compilation time, so it means "declare x in g's scope", but then you try to use it before you assign something to it.
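A sketch of both behaviours, assuming Python 3 for nonlocal (make_counter and broken are illustrative names):

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count      # rebind the enclosing variable instead of creating a local
        count += 1
        return count
    return increment

counter = make_counter()
first, second = counter(), counter()
print(first, second)

def broken():
    x = 1
    def g():
        y = x   # x is local to g (it is assigned below), so this read fails
        x = y
        return x
    return g()

try:
    broken()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
print(outcome)
```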

Lambda, map, filter, reduce

Lambda expressions are functions that take the form lambda args: expression, for example lambda x: 2*x. They must be single expressions, so they cannot contain statements or other complex code (unless they call functions).

They can be useful in combination with e.g. map(), filter() and such.

Map gives a new list that comes from applying a function to every element of an input list/iterable (in py3 it returns an iterator rather than a list). For example:

>>> map( lambda x:4*x,  ['a',3] )
['aaaa', 12]

Filter creates a new list containing the elements for which the function returns true. For example:

>>> filter( lambda x: x%7==0 and x%2==1,  range(100) ) #odd multiples of 7 under 100
[7, 21, 35, 49, 63, 77, 91]
>>> filter(lambda x: len(x[1])>0, [[1,'one'],[2,''],[3,'three']] )
[[1, 'one'], [3, 'three']]

Reduce does a nested bracket operation (e.g. ((((1+2)+3)+4)+5) when using the + function) on a list to reduce it to one value. For example:

>>> reduce(max, [1,2,3,4,5])                  #'course, max([1,2,3,4,5]) is shorter
5
>>> reduce( lambda x,y: str(x)+str(y), range(12) )     #(a slow way of constructing this string, actually)
'01234567891011'
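The sessions above are in Python 2 style. A sketch of the Python 3 equivalents, where map() and filter() return iterators (hence the list() calls) and reduce() lives in functools:

```python
from functools import reduce   # reduce is no longer a built-in in Python 3

mapped = list(map(lambda x: 4 * x, ['a', 3]))
filtered = list(filter(lambda x: x % 7 == 0 and x % 2 == 1, range(100)))
largest = reduce(max, [1, 2, 3, 4, 5])
joined = reduce(lambda x, y: str(x) + str(y), range(12))
print(mapped, filtered, largest, joined)
```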

Generators: yield and next

Generators are functions that keep their state (basically coroutines), and yield things one at a time, usually generated/processed on-the-fly.

For example, if you have

def twopowers():
    n = 1
    while True:
        yield n
        n *= 2

Then you can at any time e.g. ask for the next ten, evaluated exactly when you ask for them, and there is no point at which this ends:

>>> t = twopowers()
>>> list( next(t)  for _ in range(10) )
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
>>> list( next(t)  for _ in range(10) )
[1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288]
>>> list( next(t)  for _ in range(10) )
[1048576, 2097152, 4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456, 536870912]

In terms of language theory, the construct this uses is known as a continuation. It makes streaming data and other forms of lazy evaluation easy, and encourages a more functional programming style and a more data-streaming one.
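As a sketch of the streaming style, itertools.islice takes a bounded number of items from an unbounded generator, and the generator continues from where it stopped:

```python
from itertools import islice

def twopowers():
    n = 1
    while True:
        yield n
        n *= 2

t = twopowers()
first_five = list(islice(t, 5))
next_three = list(islice(t, 3))    # continues where the previous call stopped
print(first_five, next_three)
```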

Generator expressions are a syntax that look like list comprehensions, but create and return a generator.

gen =  (x  for x in range(101))
next(gen) # 0
next(gen) # 1
next(gen) # 2

...and so on.

Many are finite, but they don't have to be.

This is often done for lazy evaluation, and they can be a memory/CPU tradeoff, in that at the cost of a little more overhead you never have to store a full list.

Consider, for example (use range() in py3):

max(     [x  for x in xrange(10000000)] )  # Memory use spikes by ~150MB
max( list(x  for x in xrange(10000000)) )  #  (...because it is a shorthand for this)
max(     (x  for x in xrange(10000000)) )  # Uses no perceptible memory (inner expression is a generator)
max(      x  for x in xrange(10000000)  )  # Uses no perceptible memory (inner expression is a generator)


  • Sometimes this makes for more brevity/readability (though I've seen a bunch of syntax-fu that isn't necessarily either).
  • The last line illustrates that there are various places where the extra brackets aren't necessary (omitting them is unambiguous).
  • In python2, you often wanted xrange, a lazily evaluated version, where range() returned a list.
In python3, range() is a lazy, generator-like object, so the distinction no longer exists.
  • Don't use the profiler to evaluate xrange vs. range; it adds overhead to each function call, of which there will be ten million with the generator and only a few when using [] / list(). This function call is cheap in regular use, but not in a profiler.
  • when iterating over data, enumerate is often clearer (and less typing) than range


In general, iterators allow walking through iterables with minimal state, usually implemented by an index into a list or a hashmap's keys, or possibly a pointer if implemented in a lower-level language.

In python, most collection types are iterable on request. Iterating dicts implies their keys. I imagine this is based on __iter__.


  • Iterations won't take it kindly when you change the data you are iterating over, so something like:
a={1:1, 2:2, 3:3}
for e in a:
    del a[e]

...won't work. The usual solution is to build a list of things to delete and do that after we're done iterating.
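A sketch of both the failure and the usual workaround (the dict contents and the "delete odd keys" rule are arbitrary examples):

```python
a = {1: 1, 2: 2, 3: 3}

try:
    for key in a:
        if key == 1:
            del a[key]          # changes the dict while it is being iterated
    outcome = "no error"
except RuntimeError:
    outcome = "RuntimeError"

b = {1: 1, 2: 2, 3: 3}
to_delete = [key for key in b if key % 2 == 1]   # decide first
for key in to_delete:                            # delete afterwards
    del b[key]
print(outcome, b)
```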

Syntax things


More things to do with lists...

range, xrange

>>> range(4)
[0, 1, 2, 3]
>>> range(2,4)
[2, 3]
>>> range(4,2,-1)
[4, 3]

In py2, range() returns a list, and xrange is the generator-like equivalent you would prefer when giving it very large values. In py3, range() is itself lazy (wrap it in list() to see the values shown above).


Slicing retrieves elements from lists very similar to how range would generate their indices.

>>> a=['zero','one','two','three','four','five']
>>> a[4:]
['four', 'five']
>>> a[:4]
['zero', 'one', 'two', 'three']
>>> a[3:5]
['three', 'four']
>>> a[4:0:-2]  # I rarely find uses for this, but it exists
['four', 'two']




defaultdict is like a dict, but on access (not just assignment) to things that do not exist, creates an entry.

Its constructor takes a factory (a callable such as int or list), which is called to produce the initial value for a missing key, and so settles the type and value to start with.

Consider wanting to count values in a dict. You might write

counter = {}
for thing in vals:
    if thing not in counter:
        counter[thing] = 1
    else:
        counter[thing] += 1

Or sometimes the fewer-line form of:

for thing in vals:
    counter[ thing ] = counter.get(thing, 0) + 1     # that 0 is the fallback if it is not in there (defaults to None)

The following is both cleaner and somewhat faster:

from collections import defaultdict
counter = defaultdict(int)     # int() happens to be 0
for thing in vals:
    counter[thing] += 1

Note also the existence of collections.Counter

defaultdict is also useful for nested structures, where you can have it instantiate lists or dicts for you.

subnet_lister = defaultdict(list)
for ip in vals:
   subnet_lister[subnet_for(ip)].append( ip )
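A self-contained sketch of the counting and grouping uses (the input values are made up). Note the last step, which shows that a mere read also creates an entry:

```python
from collections import Counter, defaultdict

vals = ['a', 'b', 'a', 'c', 'a']

counter = defaultdict(int)
for thing in vals:
    counter[thing] += 1

# Counter does the same counting in one step
same = Counter(vals)

# Grouping: missing keys start out as an empty list
by_length = defaultdict(list)
for word in ['ox', 'cat', 'by', 'dog']:
    by_length[len(word)].append(word)

# Mere read access also creates an entry
_ = counter['never seen']
print(dict(counter), dict(by_length))
```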


Fixed arguments, keyword arguments, and anonymous keyword arguments

All arguments have names. Arguments can be passed in two basic ways: via positional arguments and via keyword arguments.


Some more notes

Accepting anonymous keyword arguments (**kwargs) can be useful to make the actual call to a big initializer function forwards and backwards compatible: the call will not fail when you use new or old keywords. You can choose to write your code to ignore them, warn, or throw an error, as is practical.

Without this, the call itself may fail when you thought you were using a different version - and it's annoying when different versions of the same thing are not drop-in replacements.

One note related to defaults (not python-specific):

It sometimes makes sense to have a function treat a passed value like None as "use the default", rather than specifying the default value only in the function definition. Pieces of pass-through code that may or may not have a user/config value cannot easily rely on a function-definition default: they would have to call via **kwargs with the value explicitly removed, or if-then between slightly different calls of the same function (ew).
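A sketch of both ideas (connect, its parameters, and the port number are hypothetical, chosen only for illustration): unknown keywords are collected instead of failing the call, and None stands for "use the default":

```python
def connect(host, port=None, **kwargs):
    # None means "caller has no opinion"; the real default lives in one place
    if port is None:
        port = 8080
    # Unknown keywords are collected rather than causing a TypeError;
    # here they are only reported, but you could warn or raise instead
    ignored = sorted(kwargs)
    return host, port, ignored

user_port = None    # e.g. a config value that may not be set
result = connect("example.org", port=user_port, retries=3)
print(result)
```

The pass-through code can hand over user_port unconditionally, without needing to know what the default is.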

Member-absence robustness

More than once I've wanted to check whether an object has a particular member -- without accessing it directly, since that would throw an AttributeError if it wasn't there.

One way is to use the built-in function hasattr() (which is just a getattr that catches exceptions). Another is to do
'membername' in dir(obj)

If you want to get the value and fall back on a default if the member wasn't there, you can use the built-in getattr(), which allows you to specify a default to return instead of throwing an AttributeError.
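A minimal sketch of these checks (Thing and its attributes are made-up names):

```python
class Thing(object):
    def __init__(self):
        self.size = 3

obj = Thing()
has_size = hasattr(obj, 'size')
has_color = hasattr(obj, 'color')
color = getattr(obj, 'color', 'unknown')    # third argument is the fallback

try:
    obj.color                               # direct access to a missing member
    error = None
except AttributeError as e:
    error = type(e).__name__
print(has_size, has_color, color, error)
```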


Accessors (property)

What are known as accessors in other languages can be done in python too, by putting getter and setter functions behind what looks like a plain attribute. This is largely syntactic sugar for calling those functions.

The property() built-in is a function that serves approximately the same purpose as, say, C#'s property syntax. Example:

class NoNegative(object):
   def __init__(self):
      self.internal = 0
   def get_x(self):
      print( "Get function" )
      return self.internal
   def set_x(self,val):
      print( "Set function" )
      self.internal=max(0,val) #make sure the value is never negative
   x = property(get_x, set_x)  #ties the two functions to attribute x
# Testing in the interactive shell:
>>> nn = NoNegative()
>>> print( nn.x )
Get function
0
>>> nn.x=-4
Set function
>>> print( nn.x )
Get function
0

  • The signature is property(fget, fset, fdel, doc), all of which are None by default and assignable by keyword.
  • You should use new-style objects (the class has to inherit from object); without the inheritance you make a more minimalistic class in which the above would use members instead(verify).
  • these won't show up in dir()s - they're not object members (which allows you to create shadow members and do other funky things)
  • Note that this indirection makes this slower than real attributes
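The same idea is often written with property as a decorator. A sketch of that form, which should behave the same as passing fget and fset to property() (the _x name for the backing attribute is just a convention chosen here):

```python
class NoNegative(object):
    def __init__(self):
        self._x = 0

    @property
    def x(self):                # getter
        return self._x

    @x.setter
    def x(self, val):           # setter
        self._x = max(0, val)   # make sure the value is never negative

nn = NoNegative()
nn.x = -4
after_negative = nn.x
nn.x = 7
after_positive = nn.x
print(after_negative, after_positive)
```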


Python doesn't hide the fact that class functions pass/expect a reference to the object they are working on. This takes getting used to if you've never used classes this way before.

Class functions must be declared with 'self', and must be called with an instance. This makes it clear to both python and coders whether you're calling a class function or not. Consider:

def f():
    print( 'non-class' )
class c(object):
    def __init__(self):
        f()      # refers to the function above. prints 'non-class'
        self.f() # refers to the method below. Prints 'class', and the object
                 # note that self-calls imply adding self as the first parameter.
    def f(self):
        print( 'class; '+str(self) )
    def g(): # uncallable on instances - see note below
        print( 'non-class function in class' )
o = c()
c.f(o) # c.f() would fail; no object to work on

You can mess with this, but it's rarely worth the trouble.

Technically, you can add non-class functions to a class, for example g() above. However, you can't call it on an instance. self.g() and o.g() fail because python adds self as the first argument, which always fails because g() takes no arguments. It does this based on metadata on the function itself -- it remembers that it's a method, so even h=o.g; h() won't work.

You can stick non-class functions onto the object after the fact, but if you do this in a class-naive way, python considers these as non-class functions and will not add self automatically. You can call these, but you have to remember they're called differently.

Details to objects, classes, and types

New-style vs. classic/old-style classes

New-style classes were introduced in py2.2, partly to make typing make more intuitive sense by making classes types.

Old-style classes stayed the default (interpretation for classes defined like class Name: and not class Name(object):) up to(verify) and excluding python 3. Py3k removed old-style classes completely and made what was previously known as new-style behaviour the only behaviour.

When the distinction is there, new-style classes are subclasses of the object class: indirectly by subclassing another new-style class or built-in type, or directly by subclassing object:

class Name(object):
     pass #things.


Initialization, finalization



Consider that classes are objects that are templates for instantiations.

Metaclasses are templates to instantiate classes.

Usually, they are used as a fancy sort of class factories.

Generally, they will cause a headache. Don't use them unless you know you must, or really, really want to.



% is an operator that imitates C's (s)printf. For example:

"%3d.%20s"%( 1,'quu' ) == '  1.                 quu'

It expects a tuple (not a list). There is an extra feature in that you can refer to data in a dict:

"%(id)3d. %(name)s"%d

(Note: if something is not present, this will cause a KeyError)

The dynamic typing means things are slightly more flexible. You can e.g. do "%s"%(3)

__underscore__ things

Note: the __rsomething__ variants for operators were omitted from this list for readability. See the notes on them below.


  • __doc__: docstring
  • __class__: class/type
  • __name__: name (of the module, class, or function/method)
  • __module__: module of definition (for classes, functions/methods)

Helper functions for internal handling, some on specific types, and sometimes conveniently accessible variations on built-in functions:

  • __hash__, __cmp__ (see also __eq__; these three interact in objects when you want them to be usable as dictionary keys [6])
  • __len__
  • __init__: constructor
  • __del__: destructor
  • __new__ allows you to create a new object as a subtype of another (mostly useful to allow subclasses of immutable types) (new-style classes only)
  • __repr__: supports repr().
  • __reduce__
  • __reduce_ex__
  • __coerce__ (not always the same as the built-in coerce()(verify))

Helpers for syntax and operator handling (various are overloadable) (see also the operator module):

  • __contains__: supports in
  • __call__: for ()
  • __getattr__, __getattribute__, __setattr__, __delattr__: for member access (using .) (See notes below)
  • __getitem__, __setitem__, __delitem__: for [key]-based access (and preferably also slices; use of __getslice__, __setslice__, __delslice__ is deprecated). See also things like [7]

  • __int__, __long__, __float__, __complex__: supports typecasts to these
  • __hex__, __oct__: supporting hex() and oct()
  • __abs__: supports abs()

  • __and__: for &
  • __or__: for |
  • __lt__: for <
  • __le__: for <=
  • __gt__: for >
  • __ge__: for >=
  • __eq__: for ==
  • __ne__: for !=

  • __add__: for +
  • __pos__: for unary +
  • __sub__: for binary -
  • __neg__: for unary -
  • __mul__: for *
  • __div__: for /
  • __truediv__: for / when __future__.division is in effect
  • __floordiv__: for //
  • __mod__: for %
  • __divmod__: returns a tuple, (the result of __floordiv__ and that of __mod__)
  • __pow__: for **

  • __lshift__: for <<
  • __rshift__: for >>

  • __xor__: for ^
  • __invert__: for ~


  • __all__: a list of names that should be considered public(verify) (useful e.g. to avoid exposing indirect imports)

  • __nonzero__: used in truth value testing

__i*__ variants

All the in-place variations (things like +=) are represented too, by __isomething__. For example: __iadd__ (+=), __ipow__ (**=), __irshift__ (>>=), __ixor__ (^=), __ior__ (|=) and so on.

Why += isn't the same as + except when it is


a = [1,2,3]
c = a
a += [4]
a = a + [5]
print( a )
print( c )

What does it output? Well, list + list like a + [5] creates a new list.

And we learned

a += b

is equivalent to

a = a + b

so it does exactly the same thing, right?

So we created a new list twice and it would output [1,2,3,4,5] [1,2,3] right?

No. It's:

[1, 2, 3, 4, 5]
[1, 2, 3, 4]

What gives?


  • + is short for __add__
  • += is short for __iadd__

And, here's the crux:

  • += checks for the presence of __iadd__ (in-place add). If it's there, we use that. If not, we fall back to evaluating with __add__ and assigning the result.

So the meaning of += is dynamic:

  • if the left side is mutable, like list, the two are not the same.
  • if the left side is immutable, like with str or int, both are the same: evaluate-new-value-and-assign

(if you replace the lists with string "123" and "4" and "5" you do get 12345 and 123)

This also means

  • you really shouldn't implement __iadd__ on immutable objects

My take-away is that, like in the C days, operator overloading is nasty because the semantics are hidden. Avoid it where possible.
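The list and string cases described above, as one runnable sketch:

```python
a = [1, 2, 3]
c = a
a += [4]          # list.__iadd__ changes the list in place; c sees it
a = a + [5]       # list.__add__ builds a new list; c is left behind
print(a, c)

s = "123"
t = s
s += "4"          # str has no __iadd__, so this is s = s + "4"
print(s, t)
```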

__r*__ variants

The __rsomething__ methods are variations with swapped operands. Consider x-y. This would normally be evaluated as

x.__sub__(y)

If that operation isn't supported, python looks at whether y has a __rsub__, and whether it can instead evaluate it as:

y.__rsub__(x)
The obvious implementation of both makes their evaluation equivalent. This allows you to define manual types which can be used on both sides of each binary operator and do so non-redundantly, and with a self-contained definition.

(Only for binary operator use: doesn't apply to the unary - or +, or the ternary **)
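A sketch of the operand swap (Metres is a hypothetical wrapper type, only there to show the mechanism):

```python
class Metres(object):
    """Hypothetical wrapper type around a number."""
    def __init__(self, value):
        self.value = value
    def __sub__(self, other):          # handles Metres - number
        return Metres(self.value - other)
    def __rsub__(self, other):         # handles number - Metres
        return Metres(other - self.value)

left = Metres(10) - 3      # uses Metres.__sub__
right = 10 - Metres(3)     # int cannot handle this, so Metres.__rsub__ is tried
print(left.value, right.value)
```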

__getattr__ and __getattribute__

For old-style classes, if normal member access doesn't find anything, the __getattr__ is called instead.

In new-style classes, __getattribute__(self, name) is used for all attribute access. __getattr__ will only be called if __getattribute__ raises an AttributeError (and, obviously, __getattr__ is defined).

Since __getattribute__ is used unconditionally, it is possible to create infinite loops when you access members on self in the usual self.name style from within it. This is avoided by explicitly using the base class' __getattribute__ for that.
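A sketch of how the two interact in a new-style (or Python 3) class. Traced and its attribute names are made up for illustration:

```python
class Traced(object):
    def __init__(self):
        self.seen = []          # assignment goes through __setattr__, not __getattribute__
        self.real = 1

    def __getattribute__(self, name):
        # Using self.seen here would call __getattribute__ again, endlessly;
        # go through the base class instead
        seen = object.__getattribute__(self, 'seen')
        seen.append(name)
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        # Only reached when __getattribute__ raised AttributeError
        return 'fallback for ' + name

t = Traced()
values = (t.real, t.missing)
accessed = list(object.__getattribute__(t, 'seen'))
print(values, accessed)
```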

See also: