Venvs and packaging - Python

From Helpful

Virtual environments and packaging

The problem: when installs break other installs

Language agnostic packaging · App images

Python packaging · Ruby packaging · Rust packaging · R packaging


Python virtual environments

Without getting into specific implementations of the idea, the concept of virtual environments means that you get control of the python environment for a specific case.


You could, in theory, give each piece of python software its own little ecosystem.


Problems and choices

Doing that wouldn't necessarily be better, or at least not any clearer, or any easier for anyone.

That's just app images with extra steps.


And, at first, what all those instructions do isn't at all clear.

It just sorta works. Except sometimes when it doesn't? And then you don't even know who to ask?


In fact, people keep re-inventing these things without really explaining what they do, which is exacerbated by the fact that they tend to solve different sets of problems at the same time.

Whenever you see an introduction that just says "it's super simple and fast", gives three commands, and then stops talking, you might as well ignore it.


Or, if you haven't seen this five times this year, at least start asking questions.

Does it work in the context of notebooks? I dunno.
Will it even work in scripts? I dunno.
Is it actually project management that happens to do this because it needs to? I dunno.
Is that project management an archaic badly-supported variant now? I dunno.
Is it also managing shell environment? I dunno.
Will it work in clusters without cooperation from a sysadmin? I dunno.
(so) will you need to abandon it in certain cases? I dunno.
Does it have its own packages? I dunno.
Wait, does it define that particular file or is that shared between things? I dunno.
What python versions will this work for? I dunno.
Does it also manage python versions? I dunno.
Is this wildly incompatible with something else I probably need? I dunno.


Yet some of them are great. And maybe you don't want to go years without picking up something useful.


virtualenv (python2, python3)

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

(see also the similar but distinct conda virtual environments)


virtualenv allows you to create a directory that

  • represents a specific python interpreter
often used to ensure it uses a specific python version
  • represents a distinct set of packages
...that apply on top of the system ones (by default; some later derivatives have their own rules here)
  • contains tools like setuptools and pip that install into the environment they come from
...instead of into the system


This makes it useful

  • wherever you need a specific python and/or specific supporting libraries,
e.g. isolated apps that should run regardless of their environment
  • also, users can run their own things without having to bug admins much about specific installations.


Example - creating

Assuming that

  • the default python is python3.8
  • you run virtualenv NAME

Then you'll now have at least:

  • ./NAME/lib/python3.8/site-packages
  • ./NAME/bin
    • ./NAME/bin/python
    • ./NAME/bin/python3.8 - a copy of the python interpreter
    • ./NAME/bin/pip - installs into this environment
    • ./NAME/bin/activate - a script that, sourced into your shell, makes that shell use this environment

...which is site-packages, setuptools, a copy of the python interpreter that uses this environment, and a few other things (e.g. recently also pip and wheel).

(in the 2.something days there were a number of differences, but the same idea)



Example - using

It's useful context to know where import looks (see also [1]) - e.g. that sys.path is initialized with the containing path of the script (if run via a script hashbang) or of the python executable (if invoked directly), then PYTHONPATH entries, then site stuff.
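A quick way to see this for whatever interpreter you're running (a minimal sketch):

```python
# print where this interpreter will look for imports, in order
import sys

for entry in sys.path:
    print(entry or "(empty string: the current directory)")
```

Running this as a script, via python -c, and inside an activated virtualenv should show how the first entry and the site-packages entries change.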


There are various ways to use that resulting file tree:

  • run source NAME/bin/activate - the most typical way
prepends that environment's bin/ to your PATH, meaning running 'python' will get this one over others
  • run the python binary in there directly (e.g. in hashbangs or cron jobs)
  • from code, use the activate_this.py that virtualenv places in the environment:

activate_this = '/path/to/env/bin/activate_this.py'
exec(open(activate_this).read(), dict(__file__=activate_this))

(execfile() only exists in python2; the exec/open form above works in python3)
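A minimal sketch of the typical shell round trip (NAME and somepackage are placeholders; shown with the stdlib python3 -m venv, which produces the same layout as virtualenv):

```shell
python3 -m venv NAME            # or: virtualenv NAME
source NAME/bin/activate        # prepends NAME/bin to this shell's PATH
which python                    # should now point into NAME/bin/
# python -m pip install somepackage   # would install into NAME, not the system
deactivate                      # restores the previous PATH
```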

See also

https://www.dabapps.com/blog/introduction-to-pip-and-virtualenv-python/


Reproducing the same set of packages elsewhere isn't a virtualenv feature, but it is something you typically want to do.

This is often done via pip freeze and pip install -r
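A minimal sketch of that round trip, assuming a plain pip setup:

```shell
# on the source machine / venv: record exact installed versions
pip freeze > requirements.txt

# elsewhere (e.g. in a freshly created venv): install that same set
pip install -r requirements.txt
```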





https://docs.python-guide.org/dev/virtualenvs/


"No module named 'virtualenv.seed.via_app_data'"

Seems to indicate conflicting versions of virtualenv (yes, the virtualenv module that has been around since python2(verify); it's still perfectly usable in python3).

https://github.com/pypa/virtualenv/issues/1875

venv (python3)

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

venv is a module and tool introduced in py3.3.


Much like virtualenv, but became standard library (and cleaned up some details)
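Usage looks much the same as virtualenv, minus the separate install (NAME is a placeholder):

```shell
python3 -m venv NAME          # venv ships with python3, no install needed
source NAME/bin/activate      # use it in this shell
python -m pip --version       # this pip installs into NAME
deactivate
```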


https://docs.python.org/3/library/venv.html

virtualenv/venv and packaging

Finding what virtual environments you have lying around

pipenv

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Conceptually, pipenv is the combination of...

pip
and the virtualenv concept,

...giving you installs into isolated projects.


You may care that

  • the project directories it sets up will not be as cluttered as with virtualenv/venv.
  • It also considers code versioning, in that
    • users need only care about the Pipfile in a directory
    • the software you install is stored elsewhere, and potentially shared:
under ~/.local/share/virtualenvs/ (rather than in env/lib within each project, as with virtualenv/venv), which e.g. keeps things cleaner around code versioning.


You can start a new one like:

mkdir myproj
cd myproj
pipenv --python 3.6   # create project in curdir.  Optional (`pipenv install` creates a Pipfile too), but this way you control py version
pipenv install numpy  # install software in this environment

The only thing it puts in this directory is a Pipfile (there is more, but it's hidden in your homedir, because it's potentially shared state)


You can then start a subshell, for the environment implied by the current directory, like:

pipenv shell


See also:

Homebrew

https://en.wikipedia.org/wiki/Homebrew_(package_manager)


Anaconda, miniconda, conda (python and more)

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

These amount to its own package manager and its own environment isolation

With the aims to

more easily reproduce environments,
more controllable versions of things,
more portability between win+lin+osx,
do more than pure python (or python at all),
avoid needing a compilation environment (see the last two points)

...and it seems aimed somewhat at academia


anaconda is the download-a-few-gigabytes-of-the-most-common-stuff-up-front variant: a base of common things you can select from,

which may be most or all that you'll use

miniconda is the 'start installing from scratch' variant of the same thing:

it mainly bootstraps the repository system, and downloads everything on demand


conda is the package manager they share


Conda environments

A conda environment is a distinct installation of everything, including python itself (if it's in there, which it usually is; if not, you may get a system python, pyenv shim, or such).


Consider e.g.:

$ which python3
/usr/bin/python3
$ conda env list
# conda environments:
#
base                  *  /home/me/miniconda3 
foo                      /home/me/miniconda3/envs/foo
$ conda activate base
(base) $ which python3
/home/me/miniconda3/bin/python3

"activating" a conda environment just places it first in resolving its bin/, which includes the executables of conda packages installed into that environment.

Notes:

this also helps separate it from anything relying on your system python being on the PATH
and is why you generally wouldn't add the conda bin to your path directly


as to cleaning: https://stackoverflow.com/questions/56266229/is-it-safe-to-manually-delete-all-files-in-pkgs-folder-in-anaconda-python


Getting conda in your shell

The above assumed that you can already run your own conda, but that is something you have to set up. There are two practical parts to that:

  • getting conda into your PATH
  • whether or not it should activate the base conda environment by default
in a new install it will do this, because auto_activate_base is true

The hook that conda init adds to your shell startup does both.

During the install there's a question whether to do that. If you said no but want this later, get to your conda command and run conda init.

If you want it to just put conda in the path and not activate the base environment, you'll want to:

conda config --set auto_activate_base false



Workflow stuff - environments and dependency files

You could treat conda as one overall environment, but you probably want to isolate projects:

conda create -n yourenvname python=x.x anaconda

difference between conda create and conda env create?

source activate yourenvname


Note that conda environments are not really compatible with virtualenv or pipenv - conda does its own environment management.

So if you previously had the virtualenv idea in your project workflow to recreate environments elsewhere, you'll need to switch to conda's equivalent for that. Consider e.g.

conda env export > environment.yml

and

conda env create -f environment.yml

(environment.yml is conda's conceptual equivalent of requirements.txt)


You can use pip within a conda environment, and can hook pip installs into such conda environment YAML files[2]
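As an illustrative sketch (the project name and packages here are made up), such a file with a pip section can look like:

```yaml
name: myproj
channels:
  - defaults
dependencies:
  - python=3.10
  - numpy
  - pip
  - pip:
      - some-pypi-only-package
```

The nested pip: list is how pip-installed packages get recorded alongside conda-installed ones.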

https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/#Myth-#5:-conda-doesn't-work-with-virtualenv,-so-it's-useless-for-my-workflow


See also:

pyenv

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Lets you

  • install and select user-specific versions of python,
    • and packages
  • do virtual environments (optional)
so until you use them, things using the same python version will share that version's site-packages


Pyenv can pick up the system version, but all others would be installed by it, which is arguably more controlled.

There is one special version name, system, which effectively means "whatever's on the path". Before you install any of your own, pyenv versions would only mention system.




Show installed versions

pyenv versions

Set preferred python version for current user (before you first set this, you will probably be using system):

pyenv global 3.9

Set preferred python version for current shell only:

pyenv shell 3.9

Set preferred python version under a directory (creates a .python-version file there):

pyenv local 3.9


All the python variants pyenv knows how to install:

pyenv install -l

Installing one:

pyenv install 3.9     # which might e.g. install 3.9.16


pyenv and virtual environments

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

If you also wanted virtual environments, this is something you do separately [3].

Consider

pyenv virtualenv 3.9 NAME

This creates a new environment by that name (within $PYENV_ROOT, not the current directory), which you can now select with e.g. pyenv local.

It's good practice to name it clearly, and possibly include the python version it's based on.


I had questions

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

"Where does it install python versions? (and packages)"

Under $PYENV_ROOT/versions/

which in user setups (probably most) will be ~/.pyenv/versions/.
and which before installing anything is empty (you're still using system)

Note that it will add shims (which pretend to be the main executables) in $PYENV_ROOT/shims (e.g. ~/.pyenv/shims), which is directly in the PATH.


So if you pyenv install 3.9.2 and 3.9.16, python3.9 is a shim that resolves that.

And if none is considered activated, that shim will fail.


"Where does it store version/env preferences?"

pyenv global goes to ~/.python-version

pyenv local goes to .python-version in the directory you execute that.

pyenv shell goes to PYENV_VERSION environment variable


"What if there are multiple preferences set?"

More specific overrides more general - PYENV_VERSION over ./.python-version over ~/.python-version - and if nothing is set you get the system python. (verify)


"Does it pick up system-installed packages?"

Unless it ends up picking system, no.

It seems to be pyenv's position that this is a mistake, that the whole point of pyenv is to have your own, that cannot conflict with or break your system python install.


You can, in theory, symlink the system python under ~/.pyenv/versions/ but this might be dangerous around the uninstall command(verify).


https://github.com/pyenv/pyenv

pew

https://github.com/pew-org/pew

uv

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


https://docs.astral.sh/uv/

Not quite venvs, but useful because of them

pipsi

No longer maintained. See pipx instead.

pipx

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Seems geared specifically to answer "how can I make stuff usable in my terminal, while having them come from a venv-like thing rather than the system install?"

Does not install into activated virtualenv.

Does not provide a virtualenv.

Just installs in a way that happens to use a virtualenv to ensure it runs(verify).


It uses whatever python version is current when you run it (e.g. the one pyenv resolves, if you use pyenv); if you don't know what that means, chances are that's the system python.


Upsides:

  • makes end-user things just work
  • ...so removes worry about installing things
  • Like other pip-likes, allows installs from github repos,
so can be quick ways to start using someone else's tools
(not all versions?)


Limitations:

  • not script/cron-friendly, because it depends on a specific PATH entry, pointing to a specific profile directory
  • python 3.6+ only
  • each install has its own dependencies; you can't really depend on specific libraries
(which is probably a good thing, but means it's more useful for finished things, and less useful during development)
  • because it installs its own isolated thing, it is probably less useful for things that provide both a library and a CLI
because this will not address the library side, and may go out of sync with whatever else you use for that


How to make something installable with pipx

It will (only) link in the things your package explicitly lists as runnable.

See also [4]
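With pyproject.toml-style packaging, that's the [project.scripts] table. A minimal sketch (all names here are hypothetical):

```toml
[project]
name = "mytool"
version = "0.1"

# each entry becomes an executable that pipx symlinks into ~/.local/bin
[project.scripts]
mytool = "mytool.cli:main"
```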


How does that work?

For example, when you

pipx install pycowsay

then it ends up installing that in

$HOME/.local/pipx/venvs/pycowsay/

and, more relevant for you, it creates the symlink...

$HOME/.local/bin/pycowsay

...that points to...

$HOME/.local/pipx/venvs/pycowsay/bin/pycowsay

All you really need is to have ~/.local/bin/ in your PATH.


Note that this venv may



See also:

Side note on freezing

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

(note: this is unrelated to package managers freezing a package, which is basically just listing packages and their versions, usually to duplicate elsewhere)


Freezing means wrapping your code so that it does not depend on anything other than the frozen product.

This usually means a copy of the python interpreter, all modules it depends on, and some duct tape to make that work independently elsewhere.


It often creates a relatively large directory, and doesn't really let you alter it later. It's analogous to app images.

The main reason to do this is to have a self-contained copy that should run anywhere (in particular, it does not rely on an installed version of python), which is nice for packaging a production version of your desktop app.



To do this yourself, you can read things like https://docs.python.org/2/faq/windows.html#how-can-i-embed-python-into-a-windows-application


It's easier to use other people's tools.

Options I've tried:

  • cx_freeze
lin, win, osx
http://cx-freeze.sourceforge.net/
  • PyInstaller
lin, win, osx
can pack into single file
http://pyinstaller.python-hosting.com/
See also http://bytes.com/forum/thread579554.html for some get-started introduction

Untried:

  • py2exe [6] (a distutils extension)
windows (only)
can pack into single file
inactive project now?
  • Python's freeze.py (*nix) (I don't seem to have it, though)
  • Gordon McMillan's Installer (discontinued, developed on into PyInstaller)


See also:


TODO: read:




Installing and creating python packages

Doing package installs

tl;dr

  • for system installs
pip (or similar) will install into the same dist-packages your system package manager uses
the system package manager should mix decently with pip installs
but it can get confusing when you have one install things the other isn't aware of
so you might want to prefer using just one as much as possible
and it's a secondary reason that virtualenv installs keep things clearer in more custom setups


  • for distinct stacks (dev, fragile apps)
consider virtualenv
consider pipenv, conda, and similar - having something do the virtualenv-like work for you is often simpler/cleaner


  • creating and uploading packages
look at setuptools and the newer pyproject.toml-based tooling (see below)


pip notes

Install

python -m pip

The advice to use python -m pip instead of pip comes mostly from it being more obvious which of the installed python versions you're referencing.

It's otherwise identical.
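For example (a sketch; the commented line assumes a python3.8 binary exists):

```shell
pip --version              # whichever pip is first in your PATH
python3 -m pip --version   # the pip belonging to this specific python3
# python3.8 -m pip --version   # pinning to one specific interpreter version
```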

pip search is dead, long live the alternatives
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


pip search never did much more than a substring search in name and summary, but the API it relied on was always considered experimental, and there was a long-term (and possibly accidental) DDoS going on that made hosting costs high. So they shut that down (See https://github.com/pypa/pip/issues/5216 and https://status.python.org/incidents/grk0k7sz6zkp for details)


pip may be working on a local replacement based on pip index (verify), but in the meantime, alternatives include:

  • pip_search (seems to scrape pypi website)
pip install pip_search
pip_search scikit
  • pypisearch (seems to require py>=3.8, though)
git clone https://github.com/shidenko97/pypisearch && cd pypisearch && pip install .
python -m pypisearch scikit


Install from git
Can you update all packages?

It's not really set up for it.

There are a few hacks.


There is

pip list --outdated

and there is a third party tool called pip-review that lets you interactively choose which of those to update:

pip-review --interactive
User installs
  • pip run as a non-root user seems to act as if --user was specified(verify)

pip and dependencies

showing package dependencies

For installed packages,

pip show spacy

...will show something like (some formatting added here):

Name:         spacy
Version:      3.5.0
Summary:      Industrial-strength Natural Language Processing (NLP) in Python
Home-page:    https://spacy.io
Author:       Explosion
Author-email: contact@explosion.ai
License:      MIT
Location:     /usr/local/lib/python3.8/dist-packages
Requires:     catalogue, cymem, jinja2, langcodes, murmurhash, numpy, packaging, 
              pathy, preshed, pydantic, requests, setuptools, smart-open, 
              spacy-legacy, spacy-loggers, srsly, thinc, tqdm, typer, wasabi
Required-by:  collocater, en-core-web-lg, en-core-web-md, en-core-web-sm, 
              en-core-web-trf, nl-core-news-lg, nl-core-news-md, 
              nl-core-news-sm, spacy-experimental, spacy-fastlang, 
              spacy-transformers


Note that Required-by only lists things that require it and that you have installed, not all possible things, so it will vary between installations.

A user installed package will show a different location, e.g.:

Name: jedi
Version: 0.18.1
Summary: An autocompletion tool for Python that can be used for text editors.
Home-page: https://github.com/davidhalter/jedi
Author: David Halter
Author-email: davidhalter88@gmail.com
License: MIT
Location: /home/me/.local/lib/python3.8/site-packages
Requires: parso
Required-by: ipython

Development

Reproducing the same set of packages elsewhere

One convention that has grown into a de facto standard is to create a file, usually called requirements.txt, that contains the package-and-version specs for each library you want.

Each (non-comment) line is essentially the arguments to one call of the pip CLI tool, and is parsed by pip; it's the pip documentation, for example, that notes that yes, you can add comments.


So options include

FooProject
FooProject >= 1.2
FooProject >= 1.2 --global-option="--no-user-cfg"

(as this documentation mentions, the last line is roughly equivalent to going into FooProject 1.2 source and running python setup.py --no-user-cfg install)


requirements.txt happens to combine well with virtual environments

Say, if you just created a venv, you can now do:

pip install -r requirements.txt

...and now that venv should contain everything that project needs to run.


Similarly, if you are currently within a venv, you can create a requirements.txt like

pip freeze > requirements.txt
💤 This will, however, be overly specific. That is, it will probably list the precise version you have installed, such as:
 webencodings==0.5.1
 WebOb==1.8.7
 websocket-client==0.53.0
 websockets==10.4
 Werkzeug==2.2.2

and you might actually want to edit that to be more accepting, if you want to pick up updates from each library.
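For example, relaxing exact pins into ranges (using two lines from the listing above; the chosen bounds are just an illustration):

```
websockets>=10,<11
Werkzeug>=2.2,<3
```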


pipenv originated in part from trying to make things even simpler than those manual steps of creating and picking up requirements.txt

Editable installs
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.



PyPI notes

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Install related errors

DBus error on python package installs

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
No such interface “org.freedesktop.DBus.Properties” on object at path /org/freedesktop/secrets/collection/login

When you use something like pip, or something more complex like poetry or twine.


You'll probably see packages like keyring and secretstorage.

If you didn't actually need auth storage, then prepending

PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring

to your command should be a good test whether keyring is the problem - and be a good temporary workaround.


https://unix.stackexchange.com/questions/684854/free-desktop-dbus-error-while-installing-any-package-using-pip



Creating python packages

Some context on python packaging

Packaging was initially fairly minimal, and pasted on. There was also the need for convenience during development. This and more led people to alternatives and duct taping, and python's packaging history is a bit of a messy confusion.


💤 History we can now mostly forget about

We had

  • distutils (2000ish)
standard library
  • PEP-273 introduced zip imports (2001)
can be copied into place
is then mostly equivalent to having that thing unzipped in the same location
...with some footnotes related to import's internals.
  • PyPI (2003)
meant as a central repository
initially just a repository of links to zips elsewhere, which you would manually download, unpack, and either setup.py install (distutils stuff) or sometimes just copy the contents to site-packages
  • setuptools (2004)
introduced eggs
introduced easy_install (which these days is no longer used)
  • egg (2004, see previous point. Never put into a PEP)
eggs are zip imports that adhere to some extra details, mostly for packaging systems, e.g. making them easier to discover, their dependencies resolved, and installed.
there are some variants. A good readup involves the how and why of setuptools, pkg_resources, EasyInstall / pip, and more
Ideally, you can now skip eggs
  • distribute (2008)
fork of setuptools, so also provides setuptools
had a newer variant of easy_install (from distribute, so ~2008)
(how relevant is this one?)
  • distutils2 (~2010) - made useful contributions, apparently not interesting as its own thing[8]

More interestingly, though...


  • pip (2008)
intended to replace easy_install
more aware of dependencies (verify)
can uninstall; easy_install could not
downsides:
cannot install eggs (seemingly because we wanted to replace eggs with wheels?(verify))
doesn't isolate different versions (verify)
limited to python - C dependencies are still ehhh
  • wheel format is introduced (2013; PEP-427, PEP 491 ) as replacement for egg format.
intended to be a cleaner, better defined thing, for just installs - see On wheels
  • PEP-518 introduced pyproject.toml (2016)
specifies what build tools you require to build a wheel
better defined than setup.cfg was(verify), and not dependent on whatever version of setuptools someone happened to have installed, which you had no control over
(and it uses TOML format, presumably to be easier than ini/configparser)


developers giving developers installs

developers giving users installs

specifying packages

setup.cfg notes
setup.py notes
pyproject.toml notes
On wheels
This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


any of this and docker

Dependency specifiers

See also:

Creating packages

Manual

Flit

Tries to make it easier for you to publish to PyPI

https://flit.pypa.io/


PDM

https://pdm.fming.dev/


Hatchling / Hatch

Hatch is a project manager.

Hatchling is its build backend.


https://pypi.org/project/hatchling/


Poetry

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.