Language agnostic packaging
environment modules
Environment modules are the more flexible and configurable version of the basic duct-tape fix mentioned above.
It is built around a scripting language (Tcl plus some module-specific helper commands).
Each module file you write defines a well-defined set of operations to perform when it is loaded (and to undo when it is unloaded).
End users mostly just need to know:
module load progname[/version]
Upsides
- module load is easy to explain
- makes it easy to
- have different versions of modules
- give specific subsets of modules to different people,
- help deal with specific dependencies, in that you can write those as other modules to be loaded.
- can also make sense on cluster nodes, where putting specific module loads in your job scripts makes the environment a lot more controlled
Limitations
It still only changes the environment of the shell it's run from, so it can't fix nonsense like scripts hardcoding interpreter paths in their hashbangs.
first-time setup
You need to hook the module command into your shell, typically as an alias or shell function wrapping the installed modulecmd.
If you want only a few users to use this, look at the add.modules command, which edits your personal shell files to hook it in (can deal with a few different shells).
Sysadmins may want to put that in (these days) /etc/profile.d/modules.sh, so that user shells get it automatically.
Scripts can source the same thing explicitly, which can be useful/necessary on queue systems.
MODULEPATH, which controls the places where modulecmd looks for module files,
will often be set in the same central place (unless you want more control of the sets of modules).
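As a rough sketch of what that central hook can contain (the modulecmd path and module directories here are assumptions - check where your install put them):
# /etc/profile.d/modules.sh (sketch)
module() { eval "$(/usr/bin/modulecmd bash "$@")"; }
export MODULEPATH=/usr/share/modulefiles:/etc/modulefiles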
Notes:
- admins sometimes wish to vary MODULEPATH, for example to expose different sets of modules to different kinds of users.
- yes, you can create your own modules, and hook them in by adding something like the following to your shell startup:
export MODULEPATH=$MODULEPATH:~/modulefiles/
- (or, if only used sometimes, alias something to module use --append ~/modulefiles)
- ...though this makes less sense on clusters
By example
using
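Day-to-day use is mostly a handful of subcommands (the module name here is made up):
module avail                # list modules that can be loaded
module load gcc/9.3.0       # add that version to your environment
module list                 # show what is currently loaded
module unload gcc/9.3.0     # undo what the load did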
Writing module files
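A minimal modulefile is mostly a list of environment changes plus some metadata. A sketch, with made-up paths and names:
#%Module1.0
proc ModulesHelp { } {
    puts stderr "Example package, version 1.2.3"
}
module-whatis "Example package"
prepend-path PATH /opt/example/1.2.3/bin
prepend-path LD_LIBRARY_PATH /opt/example/1.2.3/lib
setenv EXAMPLE_HOME /opt/example/1.2.3
Loading evaluates these commands; unloading automatically reverses the prepend-path and setenv operations.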
Bending the rules
Can I get the result of a command within a modulefile?
Yes, though it's advisable to keep this minimal, and/or read things like [2] on making this more robust.
Example:
set PROCESSORS [exec cat /proc/cpuinfo | grep rocessor | wc -l]
My software says to source a script, can I just do that?
There are roughly two reasons you might not want to.
- it will not work on all shells
- you can't cleanly unload whatever it did
If you don't care about unloading or shells beyond the one it's made for, or it's too annoying to transplant what it does to the module file (note there are tools to help here), then here's how to cheat:
puts stdout "source /path/to/script;"
This works because modulecmd's stdout is evaluated by the calling shell (which is also how envmod applies its own changes).
Note that the source command exists in bash and csh (though it will only source successfully if the script fits that shell), but not in many other shells.
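If you do go this route, you may want to guard it by shell type, since the emitted line is only valid for sh-like shells; a sketch (reusing the made-up script path from above):
if { [module-info shelltype] eq "sh" } {
    puts stdout "source /path/to/script;"
}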
Can I automatically load other modules?
Yes, you can cheat by sending module load commands to the shell.
puts stdout "module load thing"
Try to avoid doing this more than necessary, because it can get you into conflicts that are confusing to users, and which you may not really be able to solve within envmod.
Note also that in shell scripts you can do things like:
module is-loaded foo || module load foo
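In newer versions of environment modules you can also load a dependency directly from within a modulefile, which is usually cleaner than echoing commands at the shell; a sketch with a made-up module name:
if { ![is-loaded foo] } {
    module load foo
}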
Can I print things towards users?
Yes. (Note you must use stderr, because stdout goes to the shell to be evaluated)
puts stderr "foo"
Note that if you want it to print only during load operations, you'll want something like:
if { [module-info mode load] } { puts stderr "foo" }
Avoid doing anything more than printing messages, unless you understand the internals of module loading logic.
Versioning
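One recurring detail: when a module has several versions (say prog/1.1 and prog/2.0), you can mark which one a bare module load prog picks by placing a .version file in that module's directory, along the lines of:
#%Module1.0
set ModulesVersion "2.0"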
Dependencies and conflicts
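Within modulefiles, prereq and conflict let you declare relationships: prereq makes the load fail unless the named module is already loaded, conflict makes it fail if it is. For example (module names made up):
prereq gcc
conflict intel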
setting up
more notes
See also
- http://modules.sourceforge.net/
- https://uisapp2.iu.edu/confluence-prd/pages/viewpage.action?pageId=115540061
lmod
Newer variation on environment modules, with a few more features.
Based on Lua.
https://lmod.readthedocs.io/en/latest/
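Lmod modulefiles are written in Lua rather than Tcl, but express much the same operations; a sketch with made-up paths:
-- modulefiles/example/1.2.3.lua
help([[Example package, version 1.2.3]])
whatis("Example package")
prepend_path("PATH", "/opt/example/1.2.3/bin")
setenv("EXAMPLE_HOME", "/opt/example/1.2.3")
(Lmod can also read existing Tcl modulefiles.)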
direnv
(not to be confused with dotenv, and its .env)
direnv gives your shell directory-specific environments.
When you hook direnv into your shell (e.g. ~/.bashrc if you use bash),
then every directory change adds a check for .envrc. If that contains things like:
PATH_add ~/myscripts
export PYTHONPATH=~/mymodules
then direnv evaluates it (in a bash subshell(verify)) and applies those additions to your current shell.
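The hook itself is one line in your shell startup, and direnv will refuse to act on an .envrc until you approve it:
eval "$(direnv hook bash)"     # in ~/.bashrc
direnv allow                   # run once in the project directory to trust its .envrc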
You can also hook in executables.
That is not recommended unless you need it; even then, avoid side effects, and accept that it will sometimes make the prompt slow.
Upsides
- specific projects can be made to automatically get a proper environment - well isolated if you cared to do that
- makes it easy to have environment modules, virtualenvs etc. without any typing
- makes it easier to put the above into code versioning
- ...as it's already in a file in the same directory
- potentially extending to automated package setups
- consider e.g. integration with nix
- can also unload variables
- and seems smart enough to restore the values that were set before
Limitations
- anything not a shell will not pick this up - think of cron, services, scripts, exec*() from programs
- workarounds usually amount to "run a shell that runs direnv that runs the thing that needs this environment"
- which can be fine for things like services and cron
- which can be awkward when you have to hardcode it (e.g. the subprocess case)
nix
Practically, you can use nix for as much or as little as you want.
It seems a lot of people just use it for a predictable, portable virtual environment for their projects.
...for development possibly integrated with direnv so that changing to a project dir automatically gets you the installs and the virtual environment that project needs
The abstractions behind it, or even the fact that it has its own package store, is not something you necessarily care about.
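For that project-environment use, the project typically carries a small file describing what it needs, which nix-shell (or direnv's use nix) turns into a shell with those packages available. A minimal sketch, with arbitrary example packages:
# shell.nix
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = [ pkgs.python3 pkgs.git ];
}
Running nix-shell in that directory then gives a shell with those on PATH; an .envrc containing use nix does the same automatically on cd.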
What is it? What does it give me?
Nix has its own package store, and its own way of pulling them into each project.
Because it resolves packages only within each environment, having different versions in different projects becomes a non-issue. This is a large part of avoiding a single, system-wide dependency graph that becomes less and less solvable as it grows (you can still create package dependency hell, but only isolated to each project).
Beyond that,
- it also supports non-destructive and atomic updates to that environment
- ...also meaning you can roll each update back (and it's well defined what that actually does)
- also makes it a lot easier to try things out in a way that has zero effect on your system once removed
- can be used as a build tool (has to be for its own packages)
- builds happen to be deterministic, making it easier to parallelize those builds without side effects
- ...plus some further implementation details you may or may not care about
The more of those things you need, the more it can be a single tool that is cleaner than the set of existing tools it replaces.
If you don't need any of that, it's overkill, but may still be nice.
On the more technical side (which you may care about less), it also makes a big point of being 'purely functional', meaning its builds are immutable, free of side effects, and never pick up stuff from the filesystem implicitly.
Limitations and arguables
If you want to understand it thoroughly, there is a steep learning curve. You can make things (unnecessarily) hard on yourself.
The command line is pretty spartan. It will not hold your hand.
Nix commands may be a bit slow - various of them take a handful of seconds
Builds can take hours
- in particular when it is actually wrapping another, more granular packaging ecosystem (like JS's)
- If you approach it like "each must be a nix package to keep versioning well defined", which is sort of the whole point, then each JS package needs a full build of everything it depends on.
- in practice, you may end up making nix packages for the major components, and whatever code you threw at that. This defeats some of the well-defined-versioning point of nix, but even then it may still be a nice build tool.
Nix wants a big cache of results, or will take that time each time
- that cache doesn't carry to other systems - so initial nix builds still do take a long time. Which isn't great inside containers, where it means "every time".
- that build cache can grow large, for reasons similar to why docker build caches can get ridiculous
Nix doesn't integrate so well with services(verify) due to the way those are run.
Your security audits may be a little messier with nix in place
- the build stuff means it's hard to evaluate code in isolation
- builds are only deterministic when you pin exact versions
- you're still trusting external code, via nixpkgs or your own
- and when that wraps another package manager, it's even more external code to trust
It's yet another system, introducing yet another layer of abstraction
- any complexity it introduces had better not be structural complexity,
- we had better not make the abstraction leaky by common practice,
- we had better have thought of all the problems, rather than just pushing the problem around
It only solves dependency issues if you're precise about it
- (say, there's a reason that a lot of Dockerfiles don't build today, which has nothing to do with docker itself)
- and nix requiring you to be precise is sometimes also the reason you can't cheat your way around problems
- that problem may be "no solution"
- ...this is a classic tradeoff in dependency systems, and tends to be a reason people abuse them as much as necessary
Technically, nix can refer to
- a bunch of tools
- userspace installs - users have their own profiles, and installing into your own profile(/project) is easier
- a language that lets you specify builds/dependencies [3] for the nix tools to consume.
Additionally/optionally, there are
- NixOS - seems to be an attempt to do system packages using Nix, which basically makes it its own software distribution.
- You'd basically call it a Linux distribution (note that the Nix package manager itself also runs on MacOS, but NixOS does not)
- NixOps[4] to deploy on multiple hosts
nix-env
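nix-env manages packages in your per-user profile rather than system-wide; roughly (the channel name is an assumption - it may be nixos instead of nixpkgs on NixOS):
nix-env -iA nixpkgs.hello      # install into your profile
nix-env --list-generations     # show previous profile states
nix-env --rollback             # return to the previous generation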
nix-shell
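nix-shell gives you a temporary shell with extra packages available, either ad hoc or from a shell.nix like the sketch above:
nix-shell -p python3 nodejs    # ad-hoc shell with these packages on PATH
nix-shell                      # use shell.nix / default.nix in the current directory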
nix glossary
nix expression language
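The expression language is a small, lazy, purely functional language in which everything evaluates to a value; a taste of the syntax, not tied to any real package:
let
  name = "example";
  versions = [ "1.0" "2.0" ];
in {
  latest = builtins.elemAt versions 1;    # "2.0"
  greeting = "hello ${name}";
}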
nix-daemon
nix-daemon is required for multi-user installs.
It performs build actions and other operations on the Nix store on behalf of non-root users. You usually don't run the daemon directly; it is managed by a service management framework such as systemd.
https://nixos.org/manual/nix/stable/command-ref/new-cli/nix3-daemon.html