Python usage notes - importing, modules, packages

Various things have their own pages, see Category:Python.


Sometimes handy

Import fallbacks

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

You've probably seen the fallback import trick, such as:

import StringIO
try:
    import cStringIO as StringIO
except ImportError:
    pass

or

try:
    set
except NameError:
    from sets import Set as set

or

try:
    import cElementTree as ElementTree
except ImportError:
    import ElementTree

For ElementTree you may want something fancier; see Elementtree#Importing
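The same try-the-fast-one-first pattern can be wrapped in a small helper. This is only a sketch, assuming a Python new enough to have importlib.import_module (2.7 or later); the helper name import_first and the module name no_such_fast_json are made up for illustration.

```python
import importlib


def import_first(*names):
    """Return the first module from names that imports successfully."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not available here, try the next candidate
    raise ImportError("none of these could be imported: " + ", ".join(names))


# 'no_such_fast_json' is a hypothetical faster drop-in that is not installed,
# so this falls back to the standard library's json.
json = import_first("no_such_fast_json", "json")
```

Whichever candidate loads is bound to the same local name, so the rest of the code does not need to care which one it got.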


Reference to current module

There are a few ways of getting a reference to the current module (which is rarely actually necessary, and note that if you need only the names of the members, you can use dir() without arguments).


The generally preferred way is to evaluate sys.modules[__name__], because this needs no knowledge of where you put that code, and can be copy-pasted directly. (The variable __name__ is defined in each module and package; it will be '__main__' if the python file itself is run as a script, or if you are running python interactively.)


Another way is to import the current module by its own name, which actually just binds the by-then-already-loaded module to a name in its own scope (this will also work for __main__).

There are a few details to this, including:

  • you shouldn't do this at module-global scope(verify), since the module won't be loaded at that point
  • will work for packages, by its name as well as by __init__, but there is a difference between those two (a possible confusion you may want to avoid): the former will only be a bind, while the latter is a new name so may cause a load, which might pick up the pyc file that Python created. So while it should be the same code, it may not be id()-identical (...in case that matters to your use)

Importing and binding

In general, importing may include:

  • explicit module imports: you typing
    import something
    in your code
  • implicit module imports: anything imported by modules, and package-specific details (see */__all__)
  • binding the module, or some part of it, as a local name


Module imports are recorded in sys.modules, which allows Python to import everything only once.

All later imports fetch the reference to the module from that cache and only bind it in the importing scope.
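The caching can be observed directly. A small sketch, using the standard library's json as an arbitrary example module:

```python
import sys

import json                     # first import: loads the module (if needed) and caches it
cached = sys.modules["json"]    # the cache entry is the module object itself

import json as json_again       # later import: a lookup in sys.modules plus a binding

same_object = (json is cached) and (json_again is cached)
```

Both names refer to the one module object, which is why module-level state is shared between everything that imports it.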



Binding specific names from a module

You can also specify that you want to bind a few names from within a module. Say you are interested in the function comma() from format.lists (package format, module lists). You can do:

import format.lists
# binds 'format', so usable like:
format.lists.comma()
 
 
from format import lists
# binds lists (and not format), so:
lists.comma()
 
 
from format import lists as L
# locally binds lists as L (and neither format nor lists), so:
L.comma()
 
 
import format.lists as L
# same as the last
L.comma()
 
 
from format.lists import *
# binds all public names from lists, so:
comma()
 
 
from format.lists import comma
# binds only a specific member
comma()
 
 
from format.lists import comma as C
# like the last, but binds to an alias you give it
C()


None of this changes importing; it only differs in which names get bound, and is all just personal taste. (e.g. I like to avoid from and as, forcing my own code to mention exactly where it gets its functions)
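That these forms differ only in binding can be checked with a module that exists everywhere. A sketch using os.path in place of the format.lists example:

```python
import os.path                   # binds 'os'; the submodule is reachable as os.path
from os import path              # binds 'path' (and not 'os' by itself)
from os.path import join         # binds one member
from os.path import join as J    # the same member, under an alias

# Every name above refers to the same underlying objects.
all_same = (path is os.path) and (join is os.path.join) and (J is join)
```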

Packages

For context, a module is almost any python file (as long as its filesystem name is not a syntax error in python code).

It's good practice to be clean in the files you'll import from, but there is no special status.


Packages are a little extra structure on top, an optional way to organize modules.

While modules correspond to files, packages correspond to directories containing modules.


A package is a directory with an __init__.py file. In the real world, many are empty or contain only an informative docstring, because the most basic use of packages is just namespacing your modules in a useful way.

Beyond grouping things under a common name, it allows things like

  • running code when the package is first imported, usually for some initial setup
    ...put it in the __init__.py file
  • allowing selective import of the package's modules (forcing people to ask for some things explicitly)
    in part just syntax and convenience, like from image import fourier


importing from packages

The examples below assume a package named format with a module in it called lists, which you can quickly reproduce with:

mkdir format
echo 'print("format, __init__")'  > format/__init__.py
echo 'print("format, lists")'     > format/lists.py


In a dotted import such as import a.b.c, everything up to the last dot has to be a package/subpackage, and the last part must be a module or a package.

The package itself can also be imported, because a __init__.py file is a module that gets imported when you import the package (or something from it) and aliased as the directory name. With the test modules from the last section:

>>> import format.lists
format, __init__
format, lists

The import above bound 'format' at local scope, within which a member 'lists' was also bound:

>>> format
<module 'format' from 'format/__init__.py'>
>>> dir(format)
['__builtins__', '__doc__', '__file__', '__name__', '__path__', 'lists']
>>> format.lists
<module 'format.lists' from 'format/lists.py'>


Modules in packages are not imported unless you (or its __init__ module) explicitly do so, so:

>>> import format
format, __init__
>>> dir(format)
['__builtins__', '__doc__', '__file__', '__name__', '__path__']

...which imported no lists.
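The same behaviour can be reproduced without touching the current directory. The sketch below builds a throwaway package in a temporary directory. The package name demo_fmt is made up (to avoid clashing with anything installed), and importlib.invalidate_caches() assumes Python 3.3 or later.

```python
import importlib
import os
import sys
import tempfile

# Build <tmp>/demo_fmt/__init__.py and <tmp>/demo_fmt/lists.py
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_fmt")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("")  # an empty __init__.py is enough to mark a package
with open(os.path.join(package_dir, "lists.py"), "w") as handle:
    handle.write("def comma(items):\n    return ', '.join(items)\n")

sys.path.insert(0, root)       # make the temporary directory importable
importlib.invalidate_caches()  # the directory was created after startup

import demo_fmt
before = hasattr(demo_fmt, "lists")   # submodule not imported yet

import demo_fmt.lists
after = hasattr(demo_fmt, "lists")    # now bound inside the package object

result = demo_fmt.lists.comma(["a", "b"])
```

Importing the package alone leaves the submodule unloaded. Only the explicit import of demo_fmt.lists loads it and binds it as a member of the package.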


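Note that when you create subpackages in Python 2, imports between them are resolved first from the context of the importing package's directory, and if that fails from the top level (implicit relative imports). Python 3 dropped this: plain imports are always absolute, and relative ones need the explicit from . import name form.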

importing *, and __all__

Using import * is a special case.

(Also generally discouraged for namespace-pollution reasons)


For modules:

  • if there is no __all__ member, all the module-global names that do not start with an underscore are bound
  • if there is an __all__ member, only the names in that list are bound

So __all__ is useful when a programmer likes to minimize namespace cluttering from their own modules.


For packages, * is different; if __all__ were not present, python could only determine members based on filenames, but this would be unreliable on platforms such as Windows (as Windows deals with capitals on the filesystem somewhat creatively).

As consistent behaviour was preferred, for packages the process of importing only looks at __all__ in the package. If this is not present, there is no implicit binding.

For the example above, no __all__ means from format import * will only import the package itself (binding whatever public names its __init__.py defines, but no submodules). When you add something like __all__ = ['lists'] to the __init__.py, it will import the package as well as the modules listed there, and bind those modules in the package object.
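The four cases (module or package, with or without __all__) can be compared side by side. This is a sketch under assumptions: all demo_* names are made up, the files live in a temporary directory, and import * is run through exec() into a fresh dict so the bound names can be listed.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()


def write(relative_path, source):
    """Create a file under root, making its directory if needed."""
    path = os.path.join(root, relative_path)
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    with open(path, "w") as handle:
        handle.write(source)


body = "public = 1\nother = 2\n_private = 3\n"
write("demo_mod_plain.py", body)
write("demo_mod_all.py", "__all__ = ['public']\n" + body)
write(os.path.join("demo_pkg_plain", "__init__.py"), "")
write(os.path.join("demo_pkg_plain", "lists.py"), body)
write(os.path.join("demo_pkg_all", "__init__.py"), "__all__ = ['lists']\n")
write(os.path.join("demo_pkg_all", "lists.py"), body)

sys.path.insert(0, root)
importlib.invalidate_caches()


def star_names(name):
    """Return the names that 'from <name> import *' binds."""
    namespace = {}
    exec("from %s import *" % name, namespace)
    return sorted(n for n in namespace if n != "__builtins__")


mod_plain = star_names("demo_mod_plain")  # public names, no underscore ones
mod_all = star_names("demo_mod_all")      # only what __all__ lists
pkg_plain = star_names("demo_pkg_plain")  # no submodules get pulled in
pkg_all = star_names("demo_pkg_all")      # submodules named in __all__
```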


Note that some people prefer to not use import * at all, since it clutters one's namespaces and makes name collisions more likely than if you have to explicitly define those collisions.

If you have helper functions, it may be well received to place those in a separate module, so that import * from that module only brings in a set of well-named helper functions.


Note that the import keyword is mostly a wrapper around existing functions, which in a few cases you may want to use directly.
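Two of those functions are the built-in __import__() and importlib.import_module(). A short sketch of how they differ for dotted names, using os.path as the example:

```python
import importlib
import os

# __import__ with a dotted name returns the top-level package,
# mirroring how 'import os.path' binds the name 'os'.
top = __import__("os.path")

# importlib.import_module returns the module that was actually named.
leaf = importlib.import_module("os.path")
```

import_module() is usually the more convenient one when the module name is only known at runtime as a string.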

Lower levels

Compiled (byte)code

  • .py - source text
  • .pyc - compiled bytecode
  • .pyo - compiled bytecode, optimized. Written when python is used with -O. The difference with pyc is currently usually negligible.
  • .pyd - a (windows) dll with some added conventions for importing
    (and path-and-import wise, it acts exactly like the above, not as a linked library)
    note: native code rather than bytecode


All of the above are searched for by python itself.

Python will generate pyc or pyo files when modules are imported (not when they are run directly)
...with some exceptions, e.g. when importing from an egg or zip file it will not alter those
...which means it can, for speed reasons, be preferable to distribute those with pyc/pyo files in them
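Bytecode can also be generated explicitly, which is one way to pre-build it for distribution. A sketch using the standard py_compile module, assuming Python 3 (where compile() returns the path it wrote, under a __pycache__ directory); the file name demo_compiled.py is made up.

```python
import os
import py_compile
import tempfile

source_dir = tempfile.mkdtemp()
source_path = os.path.join(source_dir, "demo_compiled.py")
with open(source_path, "w") as handle:
    handle.write("VALUE = 1\n")

# Compile without importing; doraise=True turns compile errors into exceptions.
bytecode_path = py_compile.compile(source_path, doraise=True)
bytecode_exists = os.path.exists(bytecode_path)
```

For whole directory trees, the compileall module does the same thing in bulk.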


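These have since changed: PEP 3147 (Python 3.2) moved bytecode files into __pycache__ directories, and PEP 488 (Python 3.5) dropped .pyo files in favour of names like .opt-1.pyc.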




eggs, zips, eggdirs


A package can be installed as a .zip file - this is basically equivalent to having it unzipped in the same location (...with some footnotes related to import's internals).
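Importing from a zip file needs nothing more than putting the archive itself on sys.path. A minimal sketch; the archive and module names are made up, and invalidate_caches() assumes Python 3.3 or later.

```python
import importlib
import os
import sys
import tempfile
import zipfile

archive_path = os.path.join(tempfile.mkdtemp(), "demo_bundle.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    # One module at the top level of the archive.
    archive.writestr("demo_zipped.py", "VALUE = 42\n")

sys.path.insert(0, archive_path)  # the .zip file itself is the path entry
importlib.invalidate_caches()

import demo_zipped
loaded_from_zip = "demo_bundle.zip" in demo_zipped.__file__
```

One of the footnotes is visible here: __file__ points inside the archive, so code that opens data files relative to __file__ will not work unchanged.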


Eggs adhere to some extra conventions that make it easier for the packaging tools out there to do their thing: making such code discoverable, resolving its dependencies, and installing it.

A good readup involves the how and why of setuptools, pkg_resources, EasyInstall / pip, and some more.

At a file level, eggs come in a few variants:

  • directory with egg metadata (e.g. EGG-INFO/)
  • .egg file, which is a zip file, with exactly those contents
  • .egg-info/ directory placed alongside
    mostly exists for backwards compatibility


An egg file should contain either a file called zip-safe or not-zip-safe. zip-safe means the package works as intended when installed as a single .egg file. not-zip-safe means it should be unpacked into the directory style(verify) [1]

This is irrelevant to most users of others' libraries, but explains a thing that may have confused you about your site-packages.


http://peak.telecommunity.com/DevCenter/EggFormats


wheel

Wheels are intended as a replacement for eggs.

https://pythonwheels.com/

where import looks



import looks in all of sys.path, which is constructed as:
  • sys.path[0] is the directory that contains the script being executed [2]
    ...or '' (empty string) if python is invoked interactively, or the script directory cannot be determined(verify)


  • entries via the implicit import of site [3]
    (site is imported when an interpreter starts)
    it does a few different things, including
      • combining the compiled-in values of sys.prefix and sys.exec_prefix (often both /usr/local) with a few suffixes, roughly lib/pythonversion
        setting PYTHONHOME overrides that compiled-in value with the given value.[4]
        Embedded / fully isolated pythons may specifically want to [5]. Most other people do not.
        ...in part because of what it doesn't mean for subprocess calls
      • use of virtualenv overrides these


  • entries from PYTHONPATH
    https://docs.python.org/release/3.2.3/using/cmdline.html#envvar-PYTHONPATH
    Intended for private libraries, when you have reason to not install them into a specific python installation (...or you can't)
    avoid using this to
      • switch between python installations - that is typically better achieved by calling the specific python executable
      • import specific modules from another python installation - just install it into both


  • entries you add in code (before the relevant import), be it from
    • simple sys.path.append() (docs: "A program is free to modify this list for its own purposes."[6])
    • site.addsitedir() (see notes on site below)
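Because sys.path is searched in order, the position of an entry matters as much as its presence. A sketch with two temporary directories that each hold a module of the same (made-up) name demo_shadowed:

```python
import importlib
import os
import sys
import tempfile

first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory, label in ((first_dir, "first"), (second_dir, "second")):
    with open(os.path.join(directory, "demo_shadowed.py"), "w") as handle:
        handle.write("ORIGIN = %r\n" % label)

# Entries added in code, before the relevant import.
sys.path.append(first_dir)
sys.path.append(second_dir)
importlib.invalidate_caches()

import demo_shadowed
origin = demo_shadowed.ORIGIN  # the copy from the earlier path entry wins
```

This is also why sys.path.insert(0, ...) and sys.path.append(...) are not interchangeable: the former can shadow installed modules, the latter cannot.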


On site

Site is handy and flexible for packaged stuff.

site.addsitedir() mostly just looks for and processes .pth files (which site does for site-packages/site-python dirs when it is imported), and is otherwise pretty equivalent to adding to sys.path.

https://pymotw.com/2/site/

site's behaviour includes:

  • .pth files (from site-packages only)
    see https://docs.python.org/2/library/site.html
  • using site.addsitedir()
    Like the last, but considers .pth files where applicable
    and sitecustomize
  • sitecustomize.py, which site looks for in each sys.path entry
    was intended for platform-specific things, development tools
  • usercustomize.py
    like sitecustomize, but in a user's dir, and loaded after
    you'd do something like
    site.addsitedir( os.path.expanduser(os.path.join('~', 'mypy')) )
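To see what site.addsitedir() does beyond a plain sys.path.append(), the sketch below creates a directory containing a .pth file that names a second directory. Both directories are temporary and the file name demo_extra.pth is made up.

```python
import os
import site
import sys
import tempfile

# realpath() so the comparison below is not thrown off by symlinked temp dirs
site_dir = os.path.realpath(tempfile.mkdtemp())
extra_dir = os.path.realpath(tempfile.mkdtemp())

# A .pth file lists one path per line.
with open(os.path.join(site_dir, "demo_extra.pth"), "w") as handle:
    handle.write(extra_dir + "\n")

# Adds site_dir itself, then processes any .pth files found in it.
site.addsitedir(site_dir)

added = [entry for entry in (site_dir, extra_dir) if entry in sys.path]
```

Both directories end up on sys.path, whereas sys.path.append(site_dir) would have added only the first and ignored the .pth file.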


Other considerations

You often want to choose to isolate yourself from the system python, or not.


If you customize some of this, you need to think harder about subprocess calls to your own python scripts, for example by handing the current search path to child processes:

os.environ['PYTHONPATH'] = os.pathsep.join(sys.path)


This technically means the behaviour varies between

  • invoked with -m
  • invoked with -c
  • invoked interactively
  • invoked as script
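For the subprocess case mentioned above, a slightly more careful variant is to build a separate environment for the child rather than changing os.environ for the whole process. A sketch, assuming the child is started with the same interpreter (sys.executable); extra_dir stands in for whatever directory you added in code.

```python
import os
import subprocess
import sys
import tempfile

extra_dir = os.path.realpath(tempfile.mkdtemp())  # stand-in for a private library dir

# Copy the environment and prepend our directory to any existing PYTHONPATH.
child_env = dict(os.environ)
existing = [p for p in child_env.get("PYTHONPATH", "").split(os.pathsep) if p]
child_env["PYTHONPATH"] = os.pathsep.join([extra_dir] + existing)

# Ask a child interpreter whether the directory made it onto its sys.path.
probe = "import sys; print(%r in sys.path)" % extra_dir
output = subprocess.check_output([sys.executable, "-c", probe], env=child_env)
child_sees_it = output.decode().strip() == "True"
```

Using os.pathsep rather than a hardcoded ':' keeps this working on Windows, where the separator is ';'.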


https://stackoverflow.com/questions/15208615/using-pth-files

https://docs.python.org/3/library/site.html

https://docs.python.org/release/3.2.3/using/cmdline.html#envvar-PYTHONHOME

https://leemendelowitz.github.io/blog/how-does-python-find-packages.html

virtualenv


virtualenv (for Python) allows you to create a directory that represents a distinct environment (of modules, packages), and for a specific python installation.


It can be handy for things like:

  • isolating an app's libraries from the system libraries
    can be handy to avoid name conflicts, avoid apps breaking on module upgrades, and a few other potential portability issues
  • install packages into your home directory
    can be useful if your app needs libraries but you can't install them as an admin (or don't want to)
  • have different versions of the same package installed
    can be useful for isolated testing
  • making an app go to a specific (installed) python version
    defaults to system python?(verify)


Example

Creating

Assuming that python2.6 is your system python and that you run virtualenv NAME, you'll now have (at least):
  • ./NAME/lib/python2.6/distutils
  • ./NAME/lib/python2.6/site-packages
  • ./NAME/include
  • ./NAME/bin
    • ./NAME/bin/python
    • ./NAME/bin/python2.6
    • ./NAME/bin/easy_install (installs into this environment)
    • ./NAME/bin/activate (lets you use the environment in the shell -- must be sourced through bash)

...which is site-packages, setuptools, an interpreter that uses this environment, and a few other things (e.g. recently also pip, wheel).


Using

There are various ways to use that resulting file tree:

  • you can run
    source NAME/bin/activate
    , which mostly just prepends NAME/bin to your PATH, meaning that runs of python will use the binary in the environment. (verify) This can be useful for shell use, testing with shell scripts, and such.


  • you can run the python binary in there
suggesting it always checks relative to where the binary is?(verify)


  • you can switch to an environment from within a running interpreter ((verify) details)
activate_this = '/path/to/env/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))


  • You could snatch stuff in a non-virtualenv but from a virtualenv install like: (verify)
import site
site.addsitedir('/path/to/myvirtualenv/lib/python2.6/site-packages')
note that you can do something similar without virtualenv, but even then virtualenv tends to be useful to isolate anything that e.g. wants to be python setup.py install'd (verify)
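Whichever way the environment was entered, code can check at runtime whether it is running inside one. A hedged sketch: older virtualenv releases set sys.real_prefix, while newer ones (and Python 3's own venv) set sys.base_prefix, so the function below looks at both. The function name is made up.

```python
import sys


def in_virtual_environment():
    """Best-effort check for running inside a virtualenv/venv."""
    # Older virtualenv: sys.real_prefix points at the original installation.
    # Newer virtualenv and venv: sys.base_prefix does.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


active = in_virtual_environment()
```

This only tells you that sys.prefix was redirected, not which environment it is; sys.prefix itself gives the environment's directory.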

Distribution, Installing


Packaging




Freezing


Freezing means packaging your code in an executable way -- i.e. with the python interpreter and all external modules.

This does create somewhat large executables (as it amounts to static linking), but the self-containedness is nice for distributing completely independent software (as in, does not rely on an installed version of python).


To do this yourself, you can read things like https://docs.python.org/2/faq/windows.html#how-can-i-embed-python-into-a-windows-application


It's easier to use other people's tools.

Options I've tried:

  • cx_freeze
    lin, win, osx
    http://cx-freeze.sourceforge.net/
  • PyInstaller
    lin, win, osx
    can pack into single file
    http://pyinstaller.python-hosting.com/
    See also http://bytes.com/forum/thread579554.html for some get-started introduction

Untried:

lin, win, osx
  • py2exe [8] (a distutils extension)
windows (only)
can pack into single file
inactive project now?
  • Python's freeze.py (*nix) (I don't seem to have it, though)
mac OSX (only)
  • Gordon McMillan's Installer (discontinued, developed on into PyInstaller)





Installing python stuff - eggs, easy_install, pip, etc.
