EM software notes

Notes related to Electron Microscopy

EM software notes · EM file format notes · Other EM notes


Install notes

(Assumes Ubuntu/Debian for package installation)

Dependencies

Installing dependencies for most of the things mentioned below, in ubuntu:

sudo apt-get install \
  tcsh csh \
  openmpi-bin libopenmpi-dev \
  gcc g++ gfortran gobjc build-essential cmake subversion \
  libpng-dev libjpeg62 libfftw3-bin libfftw3-dev \
  libgnomecanvas2-0 libfreetype6-dev libxml2-dev \
  libquadmath0 libxslt1-dev  libxss-dev libgsl0-dev \
  libx11-dev libxft-dev  libssl-dev libxext-dev libxml2-dev libreadline6 \
  python-numpy python-scipy python-pmw python-dev libglew-dev python-wxgtk2.8 \
  openjdk-7-jdk \
  gnuplot pdftk \
  nvidia-cuda-toolkit


Potential conflicts

Some of these packages break common library and path conventions; they sometimes break each other, and occasionally the system.

So I like to isolate them: have at most one of these overrides active in a given shell (few users mix them anyway).


Isolating can also be nice to switch users between multiple versions, e.g. when upgrading without "yeah, hang on for an hour" explanations.




Potential conflicts happen mostly when software

  • adds libraries that are already present, with higher precedence than whatever is there
use of LD_LIBRARY_PATH (rather than ld.so config) is this, more or less by definition
it usually shouldn't lead to problems, but in some cases it can
which you can solve in part by putting these entries in the ld.so configuration instead

  • adds commands that are already present, with higher precedence than whatever is there
most commonly adding its own python install and PYTHONPATH (without virtualenving)
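For the ld.so route mentioned above, a sketch (the package path here is a made-up example, not from any specific package):

```shell
# add the package's lib directory to the dynamic linker's search list
echo '/usr/local/EMAN2.12/lib' | sudo tee /etc/ld.so.conf.d/eman2.conf
# rebuild the linker cache so the change takes effect
sudo ldconfig
```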


Potential trouble cases (based purely on what their init scripts do) include:

  • EMAN2 (overrides with its own python)
  • Xmipp (overrides with its own python)
  • Scipion (overrides with its own python)
  • IMAGIC (overrides with its own MPI)
  • Spring (overrides with its own python)
  • Dynamo (potentially with a real matlab, or is it fine?)
  • IMOD (LD_LIBRARY_PATH)


One easy way to make the initialization of these exclusive - and one that users tend to understand well enough - is to tell them e.g.

the following software is available once activated
  EMAN2         activate-eman2
  Scipion       activate-scipion
  IMAGIC        activate-imagic

Where, in bash,

function activate-eman2 {
  # the actual sets / sources, in this case
  source /usr/local/EMAN2.12/eman2.bashrc
}
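If you want activations to be mutually exclusive rather than relying on discipline, a guard sketch (EM_ACTIVE is a made-up marker variable, not something these packages set; each activate-* function would set it the same way):

```shell
# refuse to activate a second environment in the same shell
function activate-eman2 {
  if [ -n "$EM_ACTIVE" ]; then
    echo "already activated: $EM_ACTIVE (start a fresh shell to switch)" >&2
    return 1
  fi
  EM_ACTIVE=eman2
  # the actual sets / sources would go here, e.g.
  # source /usr/local/EMAN2.12/eman2.bashrc
}
```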

User convenience note: in cases where the first command is very predictable (e.g. shell-based cases like IMAGIC), you can actually do this transparently on the first run, e.g.

function i {
  # the actual sets / sources
  source $IMAGIC_ROOT/env

  # replace this function with an alias to the real program, and start it
  # (calling i here directly would re-enter this function, so invoke the binary)
  unset -f i
  alias i=$IMAGIC_ROOT/imagic.e
  $IMAGIC_ROOT/imagic.e
}

ACE2

Unpack.

There is a binary, but you probably want to run make so that it links with your installed version of Objective-C.

...and install Objective-C if you didn't have it.
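A sketch of that, assuming the tarball unpacks to a directory with a plain Makefile (the directory name is a guess):

```shell
sudo apt-get install gobjc   # GNU Objective-C compiler, if you don't have it
cd ace2                      # wherever you unpacked it
make                         # relink against your local Objective-C runtime
```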


appion-protomo

bsoft

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Binary install seems simpler than compilation.

Install instructions vary between versions, so make sure you're following the right ones.

Mostly it has some hardcoded dependencies, which in ubuntu seem to be:

apt install libxml2 libgomp1 tk

benv generates its own file to be sourced



Caltech Tomography Database

Install

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Has optional separation between hosts that:

  • do storage (only real requirement is visibility from all the below)
  • run MySQL to store dataset details
  • do data processing
  • web-serve the interface
...and you can run two webservers, to have a split between "all datasets" and "datasets easily viewed/shared outside"


Installation is broadly: Storage host:

  • have space
  • share that space (probably CIFS, maybe NFS. If NFS, consider/plan UID details before you start)
  • make others mount it (at /tomography/data/, or elsewhere if you change that in the scripts)
  • if you want to use the IMOD "Grab with notes" plugin, create the Caps/ directory under that

Server (assuming web and database on the same):

  • database:
    • install MySQL
    • import
      DB-tomography.sql
      (creates database called tomography)
    • create mysql user for web interface and give it access to this database
    • manually add groups, users, names of microscopes, acquisition software (see docs for details why and how)
(optional: install phpmyadmin to make this easier)
  • install web server (probably apache) and PHP
  • serve interface PHP scripts
    • make sure
      tmp
      and
      workbox
      subdirectories exist and are writable (They extract with 777 permissions so should be fine)
    • edit
      msql_connect.php
      to reflect the MySQL login you just made


Processing host

  • set up a user login to set up jobs - physical access and/or SSH
  • install Python/MySQLdb, ffmpeg, convert (imagemagick); IMOD (implies RAPTOR), Bsoft
Note: on RHEL, a static build of ffmpeg seems to be easier (or use RPM Fusion)
  • installing PBS is recommended to run multiple jobs sequentially
...though not really necessary - you could just run the script directly
  • create
    /home/db_proc/
    and make it writable (this is temporary files, and only leaves files for failed jobs)
  • check that the scripts can write to the data directory
i.e. run with and/or without PBS
  • unpack
    pipeline.tgz
    somewhere
    • tweak
      db_inc.py
      • set MySQL connection data to your db/username/pass
      • set storage directory (default is "/jdatabase/tomography/data/")
  • look at
    db_proc.py
    • if no PBS, it needs some tweaking (what exactly? (verify))
    • if using real-time mode, look whether you're happy with the timeout
  • broadly tweak
    db_proc.pbs
    for your setup, then tell users to use it as a template
    • alter
      exepath
      to reflect where you put the scripts
    • suggestions for site-specific details, e.g. valid values of software_acquisition
(there's also a .pbs for reconstruction, you can ignore that one until you need it)


Assuming Ubuntu and that you're putting all parts on the same host, a good start is:

sudo apt-get install apache2 libapache2-mod-php php-mbstring php-mysql mysql-server phpmyadmin python-mysqldb \
  ffmpeg imagemagick \
  torque-server torque-mom torque-pam

For torque PBS, see e.g. https://jabriffa.wordpress.com/2015/02/11/installing-torquepbs-job-scheduler-on-ubuntu-14-04-lts/


These scripts seem to be tied to their initial install. They will need changes to work elsewhere, e.g.

  • hardcoded hostnames
fixed by changing names or removing "am I running here?" checks
  • assumption that PBS provides other tools in the PATH
will also break without IMOD_DIR being set. Fixed via sourcing a file specific to your setup
  • hardcoded paths to various tools
fixed by looking around, mostly in the db_proc*.py. Could remove absolute paths if you're fixing via a file you source.
RAPTOR may require a path to itself? Not sure how IMOD is supposed to work here...
  • assumptions that turn out to be fragile
e.g. check_mrc breaks with some bsoft versions because it parses text output, which has changed mildly.

Most of that can be done in db_proc.py / db_proc_mc.py

Use and testing

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

Once set up, most browsing interaction is done via the web interface.

A single dataset('s pipeline) is started via the shell, though.

  • logging into the processing machine
  • (if you don't have your storage place mounted) uploading data involves getting a script that runs rsync or such
  • copy the template .pbs script to yourprojectname.pbs
probably to a directory specific to this project, purely to avoid a mess
  • edit it for the specific project
INPUTDIR="/where/to/look/for/mrc/files"
tomo_date="yyyy-mm-dd"
wait_for_new_tilt=1 to keep waiting for new files (up to a certain configured time), or =0 to terminate after a single pass
userid - an existing user ID (simple setups will have just one, in which case it can be in the template and you can just leave it)
collection and reconstruction parameters
  • qsub yourprojectname.pbs
  • Check for results in interface's Inbox


Notes:

  • after a tilt series (mrc) is processed, it will be moved under INPUTDIR/Done (verify)
unsuccessful RAPTOR outputs are left under /home/db_proc/ so you should check this occasionally

See also


ctffind

Keep in mind that the latest (currently 4.1) may be in beta and not on the general download page.

See also:

Chimera

The linux version comes as a self-extracting archive, which asks you

  • where to extract
  • whether to register a .desktop link and icon (apparently using its bin/xdg-setup)
  • whether to create a link in a common bin directory


Diplib

The easiest way to always have diplib run at startup is to find toolbox/local/matlabrc.m and add:
addpath('/opt/dipimage/common/dipimage')
dip_initialise


Some shared objects will have to be where Matlab looks. If you only do the above, it'll say where.

Dynamo

Download: https://wiki.dynamo.biozentrum.unibas.ch/w/index.php/Downloads

Install:

https://wiki.dynamo.biozentrum.unibas.ch/w/index.php/Installation
and its README_dynamo_installation.txt

Using the GPU means linking against your installed CUDA, roughly:

  • edit cuda/makefile to update CUDA_ROOT and CUDA_LIB (on my system these are /usr and $(CUDA_ROOT)/lib/x86_64-linux-gnu/, respectively)
  • run make motors
(I had this problem, which went away when I made gcc 5 the active compiler instead of 7)

See also:

EMAN2

≤2.12

After files are installed, it runs eman2-installer to create the shell script you should source.

This also seems to alter the hardcoded paths in the hashbangs, so you should re-do this if you move the installation.
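So after moving the tree, something like (the install path is an example):

```shell
cd /opt/EMAN2.12      # the new location
./eman2-installer     # regenerates the bashrc and rewrites the hashbang paths
```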


≥2.2

Add its bin to the PATH (the installer asks whether to do so in your .profile)


See also

Gautomatch

Apparently no builds since CUDA 8.0


gCTF

Statically compiled against specific CUDA libraries, so seems to break CUDA's backwards compatibility.

Interestingly, when run with the wrong CUDA it will neither complain nor (necessarily) fail - it may happily give wrong answers instead.

So make sure you have a supported CUDA; if you don't, tough cookies.

Seems to not be under much development - which, combined with the CUDA thing above, means it will not run on some newer systems.


See also:

gEMpicker

Apparently no builds since CUDA 7.5, but can be run without it.


http://gem.loria.fr/gEMpicker/index.php

goCTF

https://www.lsi.umich.edu/science/centers-technologies/cryo-electron-microscopy/research/goctf

ImageJ / Fiji

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)



What's the difference between ImageJ1, ImageJ2, and Fiji?

(Or indeed these?)

IJ2 is a redesign of what you'd now call IJ1, mostly for a better scripting/plugin environment, and is the core of what you now run(verify)

While there's a new GUI design, the default UI is still IJ1's.(verify)


Fiji seems to refer both to a collaboration and to a distribution: bare ImageJ plus, by default, a modest selection of plugins.

The distinction sometimes seems fuzzy - e.g. the update site Fiji uses is hosted on ImageJ.

Roughly speaking, if you use the updater, you're using Fiji.


See also



Dependencies

Only really depends on Java, but the details are fuzzy to me so far. It seems to be developed on Sun/Oracle Java but should generally work fine on OpenJDK.

You might prefer a download with a Java runtime included. You can use your system Java, but this can be a little more finicky: ImageJ has a launcher (itself a binary, e.g. ImageJ-linux86 is an ELF) which finds Java runtimes. The way it does this seems to have some hardcoded details, which break on systems with a different-enough Java flavour and/or version.

In particular, Java8 seems the most recent it'll play nice with, for reasons you probably don't care about.

Most startup messages (PermSize=128m, incremental CMS, illegal reflective access) are warnings and can be ignored.

The -Xincgc means you're using Java>=8 and may need to specify --default-gc (verify)

If you want to force it to a specific Java, hand it a --java-home
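For example (the JVM path is an assumption; use wherever your Java 8 actually lives):

```shell
./ImageJ-linux64 --java-home /usr/lib/jvm/java-8-openjdk-amd64
```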




Update

Update can be automated, see https://imagej.net/Updater

Install can be automated too, see bootstrap.js on that page.


The update functionality (GUI + the CLI stuff mentioned on bootstrap(verify)) is also in the main executable.

That means you can do background updates with:

ImageJ-linux64 --update update

In theory you could put this into a startup script so that every launch is the latest, but since it takes ~5-15 seconds even when it has nothing to update (and easily 30+ when it does), consider putting it in a nightly cronjob instead.
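A nightly-cron sketch (the install path and user are assumptions about your setup):

```shell
# e.g. in /etc/cron.d/imagej-update: run the updater at 03:30 every night
30 3 * * * imagej /opt/Fiji.app/ImageJ-linux64 --update update
```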


It also means you could remove update functionality entirely, which can be useful if you do a nightly-build thing.


See also:




There are docker images, see https://hub.docker.com/r/fiji/fiji/

IMOD

If you want control over where it goes, run the installer like:

bash imod_4.9.0_RHEL7-64_CUDA6.5.sh -dir /opt -name imod-4.9.0 -skip 

Where

-dir is base directory, where the following is created under (defaults to /usr/local)
-name is handy if you are versioning different installations
-skip means it won't link it into /etc/
...which I do because I want to manage this myself - amounts to:
export IMOD_DIR=/opt/imod-4.9.0
source $IMOD_DIR/IMOD-linux.sh
note: the sourced file takes IMOD_DIR from the environment (when not set, it falls back to the path it originally installed under)
before ~v4.9 or so it required tcsh (even if you already had csh)


You can edit cpu.adoc to your needs. If managing a lot of versions, you may or may not want to share the adoc directory between them.

note that if you mostly just set the number of cpus, you can make that host-agnostic on linux by at some point doing
export IMOD_PROCESSORS=$(grep -c ^processor /proc/cpuinfo)
in clusters and other tightly controlled environments you need more control than that, though
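If coreutils' nproc is available, the same count is simpler to get:

```shell
# nproc prints the number of processors available to the current process
IMOD_PROCESSORS=$(nproc)
export IMOD_PROCESSORS
```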


Notes:

  • Install notes suggest that the most predictable environment for IMOD binary is Redhat/fedora, in a container/VM if you don't naturally run that.
it seems perfectly fine on debian / ubuntu, though CUDA details are a little different.
  • if you tightly control directories (e.g. when using environment modules), you might want to avoid sourcing that file. What it does amounts to:
set IMOD_DIR
set IMOD_JAVADIR (should contain bin/java, so might be /usr)
set PATH to include $IMOD_DIR/bin
set LD_LIBRARY_PATH to include $IMOD_DIR/lib
set IMOD_PLUGIN_DIR=$IMOD_DIR/lib/imodplug
set IMOD_CALIB_DIR
source $IMOD_CALIB_DIR/IMOD.sh if it exists (on a clean install it doesn't)
set IMOD_QTLIBDIR=$IMOD_DIR/qtlib
set FOR_DISABLE_STACK_TRACE=1 so fortran is quieter
(on bash only) make a
function subm () { $IMOD_DIR/bin/submfg $* & }
- but note there's a bin/subm that will work if this function doesn't exist(verify)
(...most of these things are skipped when already set/included)
  • If the installer tells you
    Syntax error: "(" unexpected
    , this is due to the installer assuming that /bin/sh is specifically bash (on some distros it's dash)
You can work around this by explicitly running with bash as above, or editing the first line to say #!/bin/bash


See also:


Config

Via files

One alternative is to store host-specific config in files.

It looks in the directory set in $IMOD_CALIB_DIR:

  • it defaults to /usr/local/ImodCalib, unless it was set already (this logic is in the file you source)
  • that default directory isn't created at install time, so you may need to create it, then copy things there
  • this is mostly about cpu.adoc -- which you'll want in order to use parallel processing and/or GPU
  • see also http://bio3d.colorado.edu/imod/doc/man/cpuadoc.html#TOP


Via environment

Instead of using cpu.adoc you can also set many things via environment variables, e.g.

export IMOD_DIR=/opt/imod
export IMOD_CALIB_DIR=$IMOD_DIR/autodoc
export IMOD_PROCESSORS=16
source $IMOD_DIR/IMOD-linux.sh

While you can't do as much with this, it can be simpler if all you need to set is the number of CPU cores.



If you want GPU processing,

  • make sure CUDA is installed and working (recent IMOD seems to want ≥ CUDA 6 (verify))
  • set gpu.device=1 in cpu.adoc, or use the equivalent environment variable (0 seems to refer to best?)
  • or e.g. gpu.device=1,2,3,4 if you have four; see also [1]
  • make sure the device is accessible by the effective user. chmod 666 /dev/nvidia* is probably easiest (SUID root should also work)
  • testing:
gputilttest
If it fails, look at the log


PEET

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

See http://bio3d.colorado.edu/ftp/PEET/INSTALL.TXT


Requirements are basically

  • a copy of the Matlab Compiler Runtime that the version was built against. Should be in the same download directory.
  • reachable IMOD
(sometimes relatively recent for the version, e.g. PEET 1.13 wants IMOD 4.10.2. See INSTALL.TXT for specific requirement)
(...and, implicitly, some of its requirements)


The install itself is roughly:

Unpack, e.g.:

cd /opt
mkdir PEET 
cd PEET
wget http://bio3d.colorado.edu/ftp/PEET/linux/Particle_1.13.0_linux.tgz
tar xvf Particle_1.13.0_linux.tgz
wget http://bio3d.colorado.edu/ftp/PEET/linux/ParticleRuntimeV95_linux.tgz
tar xvf ParticleRuntimeV95_linux.tgz


Edit Particle/particle.cfg
  • edit the MCR_ROOT line, as it hardcodes the path to ParticleRuntime
  • it puts its libraries in front of the existing LD_LIBRARY_PATH, which can break other programs, so you may want to change that


Edit Particle/Particle.sh (or the variant you use)
  • edit the PARTICLE_DIR line, as it hardcodes its own path


Ensure the relevant users will source this last file.

parallel processing, and multi-node processing

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Parallel execution is done by processchunks (usually run via a GUI):

"processchunks will process a set of command files in parallel 
on multiple machines or processors
that have access to a common filesystem.
It can also be used to run a single command file on a remote machine."


For local parallel processing, use a cpu.adoc that at least has number= saying how many parallel processes to run.


Running on multiple hosts amounts to:

  • a cpu.adoc with a section for each host (see the cpuadoc man page for more details)
  • passphraseless ssh logins to each (same user(verify))
  • a shared filesystem
(and, if it is mounted at different places on different hosts, telling IMOD about that. See the "Mount Rules for Local to Remote Path Translations" section in the cpuadoc man page)


It remotely runs commands wrapped in something like:

ssh hostname bash --login -c '"command"'

...which means you may want to place your sourcing of imod environment in a profile.d file for convenience. See also [2]
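A profile.d sketch for that (the install path is an example):

```shell
# /etc/profile.d/imod.sh -- sourced by login shells,
# including the 'ssh host bash --login -c ...' that processchunks does
export IMOD_DIR=/opt/imod-4.9.0
source $IMOD_DIR/IMOD-linux.sh
```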

See also:



On windows

localrec

Seems to leverage relion and scipion (though not relion via scipion - it e.g. wants to see relion on the path, which is not something scipion does itself).

It wants to be run like:
scipion run relion_localized_reconstruction.py
(probably start with -h)

See also:


motioncorr

MotionCor2

Binary, fairly static compiles.


Older versions depend on libtiff3, which is too old to be available in some distros. Usually it works well enough to copy the .so file alongside (see also libtiff naming).
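A workaround sketch - the sonames here are examples, check what the binary actually asks for first:

```shell
ldd MotionCor2 | grep 'not found'    # see which library it wants
# satisfy the old soname with the distro's current libtiff
ln -s /usr/lib/x86_64-linux-gnu/libtiff.so.5 ./libtiff.so.3
# and run with this directory on the library path
LD_LIBRARY_PATH=$PWD ./MotionCor2
```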

Also a potential issue with libjbig, depending on distro.

Newer versions require CUDA 8 minimum, which can be a pain on older systems.


Note also the docs say "We recommend 128 GB or more CPU memory, although 64 GB CPU memory is also workable."


https://msg.ucsf.edu/software


NovaCTF

Requires fftw3, and gcc ≥4.9

https://github.com/turonova/novaCTF/wiki/Install


https://github.com/turonova/novaCTF


Pymol

Wants admin rights to install into system dirs

Pytom

  • Install dependencies
  • Get the source
git clone --depth=1 http://git.code.sf.net/p/pytom/gitrepo pytom

  • Compile:
cd pytom/pytomc
# NOTE: needs to be tweaked to your system:
./compile.py \
  --includeDir /usr/include/python2.7 /usr/lib/openmpi/include /usr/include/boost/ /usr/lib/python2.7/dist-packages/numpy/core/include/numpy/ \
  --libDir /usr/lib/openmpi/lib /usr/lib /usr/lib/x86_64-linux-gnu/ \
  --dirPath /usr/lib/openmpi/bin \
  --pythonVersion 2.7 \
  --target all
  • See if setup will work
cd pytom/pytomc
./check.py

(in my case I got a csh "Word too long" error because of a long PATH - it seems pytom copies the current PATH into bin/paths.csh - so edit that to remove a bunch of stuff that pytom doesn't need anyway)


See also:


Relion 3

For specific versions, see https://github.com/3dem/relion/tags


Relion has to be compiled, which may be relatively painless, sometimes less so.

install prerequisites, on debian/ubuntu amounts to something like:

sudo apt-get install cmake cpp-6 gcc-6 g++-6 autoconf automake libtool \
   openmpi-bin libopenmpi-dev libx11-dev g++ libtiff-dev libfltk1.3-dev libfftw3-dev tk-dev

(It seems to want to compile FLTK itself, which seems to imply you'll need xorg-dev, though libfltk1.3-dev seems to pull that in anyway. Also, I've removed build-essential because it seems to imply gcc-7)


build

cd /opt/relion-src/  # wherever you unpacked it
mkdir build
cd build
cmake ..             # but there are tweaks you probably want - SEE BELOW
make -j 8            # means 'use 8 CPUs for compilation when you can', doesn't matter so much but hey
make install



Once built, make it usable by putting binaries and libraries where they can be found, e.g.

export PATH=$PATH:/opt/relion-src/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/relion-src/lib

Optional but useful: it's rather convenient to have relion fill in where certain executables are, via environment variables, e.g.

RELION_MOTIONCORR_EXECUTABLE
RELION_MOTIONCOR2_EXECUTABLE
RELION_CTFFIND_EXECUTABLE
RELION_GCTF_EXECUTABLE

...and more, see [3] for a list. These do not have to be full paths if they are on the PATH (verify).
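For example, in a site profile (the values are assumptions about what you installed where):

```shell
export RELION_CTFFIND_EXECUTABLE=ctffind              # found via PATH
export RELION_MOTIONCOR2_EXECUTABLE=/usr/local/bin/MotionCor2
export RELION_GCTF_EXECUTABLE=Gctf
```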




Build hints and issues

Relion's build breaks in exciting ways when using gcc 7, so force 6 or 5. Due to dependencies, package management may not always want to remove 7, in which case the way to do it is to add:

-DCMAKE_C_COMPILER=/usr/bin/gcc-5 -DCMAKE_CXX_COMPILER=/usr/bin/g++-5


Consider installing into a specific path (rather than to system directories, which is default), e.g. to keep track of distinct builds, and toggle between them, without getting horribly confused with versions. I add something like:

-DCMAKE_INSTALL_PREFIX=/opt/relion-3.0b


MPI

  • basically, try to run mpirun and mpiCC. If you get "file not found" or such, install the relevant packages.
  • if the build process can't find MPI even though it is there, this seems to be a cmake change/bug [4] and you want an older cmake, like 3.9.6 (copy the contents into /usr, no compilation of it is necessary). Ubuntu 18 has a newer one, so you'll need to do this until it is fixed in relion.
  • Also, be careful of multiple version of MPI (or libtiff) being installed. I've e.g. had trouble with EMAN2 and bsoft messing things up.


CUDA should be installed and working.

  • CMAKE_NVCC_FLAGS:STRING=-D_FORCE_INLINES may be needed (a known workaround for some glibc/CUDA combinations)
  • It may get confused if you have multiple CUDA installs -- and not build GPU features as a result. So pay attention.
  • compute capability defaults to 3.5, for compatibility. You can set it higher (e.g. -DCUDA_ARCH=52) though I've not checked how much difference it makes.



See also:

ResMap

Dependencies are python, numpy, scipy, matplotlib

The current latest ResMap uses some older style numpy-isms that will fail on the most recent numpy.


Binaries: http://resmap.sourceforge.net/ / https://sourceforge.net/projects/resmap/files/

Source: https://sourceforge.net/projects/resmap/files/src/

Manual: http://resmap.sourceforge.net/files/ResMap-manual.pdf

Scipion

Scipion is a package from the Spanish CNB (Centro Nacional de Biotecnología) in Madrid.

It tries to wrap and integrate various major EM software packages, in a pluggable way. XMIPP is now (only) part of this package.


Installing from source

You can download a binary, or build from source. Source is the way to get the latest.


Install build dependencies

# for debian/ubuntu it's roughly
sudo apt-get install gcc g++ cmake libtiff-dev libxft-dev libssl-dev libxext-dev libxml2-dev libreadline6 \
 libquadmath0 libxslt1-dev  libxss-dev libgsl0-dev libx11-dev gfortran libfreetype6-dev \
 libopenmpi-dev openmpi-bin openjdk-8-jdk 

(possibly specific gcc/g++ versions, see notes below)


Get the source

git clone https://github.com/I2PC/scipion.git scipion-src


Install / update

Go to the directory you downloaded/installed it in

cd /opt/scipion-src


First time: prepare build

./scipion config
edit config/scipion.conf (usually the only thing you need to check, at least on a new install)
e.g. MPI, java, and maybe CUDA details
In some cases there are other configs to edit, e.g. config/hosts.conf if you're running on a cluster, and config/protocols.conf if you already knew you need to (not on standard installs)


Build

./scipion install -j 5
Note: Scipion is tied to xmipp, and installs it by default. If you want to separately build your own xmipp (e.g. to keep up to date with nightlies), do that first, then point scipion at that xmipp, like
./scipion install --with-xmipp=/opt/my-xmipp


Most later updates to this instance of scipion should amount to:

git pull && ./scipion install -j 5


Installing OR hooking in software

It's sort of the point of scipion that beyond xmipp you also get various other software, so most people will want (scipion) to install additional software.


Scipion 1:

e.g.

./scipion install motioncor2 ctffind4 Gctf Gautomatch relion frealign eman simple spider dogpicker localrec resmap summovie unblur


Scipion 2:


Build issues

On MPI:

MPI_LIBDIR should point at the directory containing the relevant (what exactly? libmpi.so?(verify))
MPI_INCLUDE should point at the directory containing the relevant mpi.h (verify)
MPI_BINDIR should point at the directory containing mpirun (verify)
This will vary between systems. The defaults seem right for typical Centos. For Ubuntu you probably want, respectively:
/usr/lib/x86_64-linux-gnu/openmpi/lib/
/usr/lib/x86_64-linux-gnu/openmpi/include/
/usr/bin
beware of other (EM) software putting their MPI ahead of the system one in the PATH or LD_LIBRARY_PATH. This is a recipe for headaches.


OpenCV

If OpenCV then fails to build against CUDA, see https://github.com/I2PC/scipion/wiki/Troubleshooting

When switching to Ubuntu 16 I ran into an issue with OpenCV not finding libglib, indicated by build errors like:

libgio-2.0.so.0: undefined reference to `g_source_modify_unix_fd'
This seems to just be a library-path thing, e.g. /usr/lib/x86_64-linux-gnu/ not being in ld.so's list. Either add it in ld.so.conf(.d) (it's supposed to be in there, right?), or hack it into CMakeCache.txt


"CMake Error at cmake/OpenCVDetectCXXCompiler.cmake:89"

Seems to mean OpenCV won't compile on recent gcc7 [5]. The solution right now seems to be to switch to gcc6 (probably best done via the config file)


fatal error: stdlib.h: No such file or directory

Try adding -DENABLE_PRECOMPILED_HEADERS=OFF to the compiler flags in the config file. If that doesn't work, you probably messed up your compiler environment - or try switching to gcc-6 or gcc-5.



For some others, see:

Runtime issues

Java

Unrelated to scipion directly, but as of this writing, Ubuntu's build of openjdk-9 is causing a segfault (apparently freetype via the JNI interface?), so you can't really use it.

I've also had issues with it erroring on StringConcatFactory. This seems to be compilation oddness which I've not yet figured out (maybe have no java9 on your system?), so I gave up and installed the scipion binary instead.


Motioncor2

It seems scipion's install assumed you have a specific CUDA version (or I'm missing some config).

If so, a quick fix is to rename the right executable to the (wrong) one it was expecting.


e2boxer

See e.g. https://github.com/I2PC/scipion/wiki/Troubleshooting#launching-eman-boxer-protocol

Installing from binary

Install dependencies - now roughly:

sudo apt-get install openjdk-8-jdk libopenmpi-dev openmpi-bin gfortran


Download from http://scipion.i2pc.es/download_form/

Follow instructions at https://github.com/I2PC/scipion/wiki/How-to-Install-1.1-bin which are roughly:

Untar the download
Do the
./scipion config
configuration, mostly for MPI (see notes above)
Install further software as needed (see the instructions)


Configure your environment for use

The environment hook-in seems to be:

# for scipion
export PATH=$PATH:/opt/scipion

# If you want to hook its xmipp for CLI use outside of scipion, add something like:
source /opt/scipion/software/em/xmipp/xmipp.bashrc
# (determines its location at runtime, so you don't have to bake in anything) 
# Note that the latter hooks in its own python environment, not virtualenv'd,
#   and before the system one. If that breaks things, you may want an activation alias (see above)


Notes:

  • Scipion also seems to assume its xmipp has a python install, while it doesn't necessarily build that (verify)




scipion-session

SPHIRE, TranSPHIRE, crYOLO

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

For context

SPHIRE (SParx for HIgh Resolution Electron Microscopy) seems to be the project containing various specific projects that aim to be part of a larger pipeline, apparently including (note that some terms are function groupings)

  • MOVIE [6] (just wraps unblur?)
  • JANNI (noise2noise)
  • Cter [7], CTF estimation
  • Window [8], groups some particle picking and extraction stuff
    • crYOLO: (automatic) picking, CNN-based particle picker based on tensorflow, CPU or GPU based
has gotten some separate attention, as a nicely performing picker
  • Cinderella: (automatic) 2D class selection [9]
  • ISAC [10], classification and alignment
  • VIPER: [11], 3D projection stuff
  • Sort3D [12], 3D clustering stuff
  • Meridien [13], 3D refinement stuff
  • LocalRES [14]

...and a bunch of tools


TranSPHIRE is a more concretely packaged on-the-fly processing pipeline.

See also:

  • sphire.mpg.de/wiki/doku.php


Install

Transphire and cryolo can be installed from pypi, and expect to be run within a given conda environment.


http://sphire.mpg.de/wiki/doku.php?id=downloads:cryolo_1&redirect=1


http://sphire.mpg.de/wiki/doku.php?id=downloads:cryolo_1

https://github.com/MPI-Dortmund/transphire

Spider

Spring

http://www.sachse.embl.de/emspring/install.html


If you use the packaged ('binary') version, then you mostly just need to run:

sh patches/binary_install_linux.sh

Due to what this does, you'll probably need to run this on each PC you install on


Tigris

https://sourceforge.net/projects/tigris/

Xmipp

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

From around version 3.2, you can only really install it as part of scipion, though you can still mostly use it as before.


TODO

Bsoft, bfactor, coot, ctffind3, ctffind4, motioncorr, MotionCor2, MATLAB, fiji, frealign, IHRSR, protomo, relion, resmap, situs, sculptor, spider, tigris, tomoctf, CCP4, Dynamo, IMAGIC, IMOD, PEET, phenix, Pymol

See also

https://en.wikibooks.org/wiki/Software_Tools_For_Molecular_Microscopy

Task-specific notes

Gain correction

There is usually a factory flat field correction, e.g. based on dark-frame images and the per-pixel gain apparent from a few dozen or few hundred images.

This assumes that

  • the statistics of these images are good enough
    since they're longish exposures, this is often true enough
  • the defects stay constant


These are reasonable assumptions, though [1] notes there are leftovers, and also non-constants like dust on the sensor. The paper's assumption is that, over tens of thousands of images, each pixel should behave identically in terms of mean and standard deviation, and that you can correct images towards that assumption using only those statistics - i.e. using the dataset itself (assuming no patterns are introduced by positioning, which is usually true).

One can argue about how influential these residuals are expected to be, but the correction is simple enough to do.
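A minimal numpy sketch of that per-pixel correction, using a small synthetic stack in place of a real dataset (all sizes and numbers here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: 200 tiny "micrographs" with residual per-pixel
# gain and offset errors baked in (illustrative only)
true_gain = 1.0 + 0.1 * rng.standard_normal((8, 8))
true_offset = 0.5 * rng.standard_normal((8, 8))
signal = 10.0 * rng.random((200, 8, 8))
stack = signal * true_gain + true_offset

# Per-pixel statistics over the whole dataset
mean = stack.mean(axis=0)
std = stack.std(axis=0)

# Normalize each pixel towards the dataset-wide behaviour: afterwards,
# every pixel has the same mean and standard deviation
corrected = (stack - mean) / std * std.mean() + mean.mean()
```

In practice you would stream the statistics from disk rather than hold the whole dataset in memory, but the correction itself is this simple.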


See also:


CAMERA-NORM-CORRECTION in IMAGIC

Movie-frame alignment

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


The beam's energy tends to induce movement in the particles (and there may be things like drift, charging).

Trying to align frames before summing them can give higher contrast and/or more information (e.g. CTF with more rings).


Software varies in assumptions and method - uniform global movement versus local alignment, shift-only or not, filtering to avoid near-Nyquist distractions, etc.
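The simplest variant, a single global shift, can be sketched with FFT-based cross-correlation. This is a generic illustration, not any particular package's implementation:

```python
import numpy as np

def estimate_global_shift(ref, frame):
    """Estimate the (row, col) shift of `frame` relative to `ref` from
    the peak of the FFT-based cross-correlation."""
    cc = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(frame))
    peak = np.unravel_index(np.argmax(np.abs(cc)), cc.shape)
    # Peaks past the midpoint wrap around to negative shifts
    return tuple(int(p) if p <= s // 2 else int(p - s)
                 for p, s in zip(peak, cc.shape))

rng = np.random.default_rng(0)
ref = rng.random((64, 64))
frame = np.roll(ref, (3, -5), axis=(0, 1))   # simulate drift between frames
print(estimate_global_shift(ref, frame))     # → (3, -5)
```

Real implementations add filtering and weighting, handle sub-pixel peaks, and often align all frame pairs jointly rather than each frame against a single reference.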


There is some comparison in Rawson et al. (2016), "Methods to account for movement and flexibility in cryo-EM data processing"


driftcorr/motioncorr

from Yifan Cheng Lab

Called both motioncorr and driftcorr (its webpage is named driftcorr but titled motioncorr, the download is motioncorr*.tgz, and the command is dosefgpu_driftcorr, which also refers to dose fractionation).

Global shift only(verify)

GPU-only.


It will need permission to access the GPU, and a matching CUDA version (which one?(verify)), or it will say something like "Failed to initialize GPU #0".

One way to fix the permissions:

sudo chmod 666 /dev/nvidia*

If you don't want to do that, you probably want to chown the driftcorr binary to root:root and add SUID/SGID to give it access. (A third option would be to run it with sudo, but then the result files will be owned by root, which is probably annoying.)


See also:

MotionCor2

(formerly known as UcsfDfCorr)


GPU-only, as of this writing built against CUDA 7.5.

Can also read 4-bit data (as e.g. UCSF Tomo can output)


See also:


Relion's

Relion 3 mentions it has a CPU-only reimplementation of MotionCor2's algorithm, apparently to let you run without requiring MotionCor2 itself.


ZORRO

Alignframes in IMOD

Global shift only(verify)


Can use CPU or GPU.

Fairly flexible set of options (filtering, binning, and whatnot).

xmipp optical flow

Does localized shift analysis(verify) (after global shift).

(since xmipp 3.2, the beta version as of this writing)

Can use CPU or GPU.

See also:


relion

Since relion is particle-oriented, it does frame alignment per particle, as an optional later step called particle polishing.

This means it does not assume the entire micrograph has one overall shift, as others do. This can make more sense in that it can optimize for each particle (though it is arguably also more susceptible to mistakes, since each estimate uses less information).


See also:


k2align

https://github.com/dtegunov/k2align


ALIGN-MOVIE-FRAMES in IMAGIC

Global shift, optional rotation (comes from its generic alignment code).

CPU-only.


unblur / summovie

Global shift only

Exposure-dependent filter, as explained in the articles.


CPU-only (verify)
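For context, the exposure filter in its commonly cited form (the critical-exposure curve measured by Grant &amp; Grigorieff, 2015) attenuates each spatial frequency according to the dose accumulated so far. A sketch using their published constants:

```python
import numpy as np

# Critical exposure Ne(k) = a*k**b + c, with k in 1/Angstrom and dose in
# electrons/Angstrom^2 (constants from Grant & Grigorieff, 2015)
A, B, C = 0.245, -1.665, 2.81

def dose_weight(k, cumulative_dose):
    """Attenuation for a Fourier component at spatial frequency k
    after cumulative_dose has hit the sample."""
    ne = A * k**B + C
    return np.exp(-cumulative_dose / (2.0 * ne))

# Low frequencies survive a high dose far better than high frequencies do
print(dose_weight(0.05, 30.0), dose_weight(0.4, 30.0))
```

This is why later frames in a movie contribute mostly low-resolution information to the weighted sum.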


See also

CTF estimation and correction

In packages

Various packages have their own CTF estimation as part of their complete toolset, including:


Own packages:

  • gCTF (see above)
  • ctffind3, ctffind4 (see above)
  • tomoctf

Anisotropic magnification

Most methods choose to work with diffraction, i.e. crystals, because this separates the magnification distortion from the effects of astigmatism.

Many need specific test samples. Crystalline ice can be enough, though good prep minimizes it...


IMAGIC

A posteriori, e.g. on water rings present in the dataset (though since those are at 3.8 Å, that's not quite trivial to do).


magdistortion

Measures on a test sample ("polycrystalline samples such as gold shadowed diffraction gratings").

See also:


correctmaganisotropy

Measures on a test sample (crystalline thallous chloride)


See also:


Some common tools

file inspection, manipulation

header (IMOD)

inspect header values


bhead (Bsoft)

inspect header values


bcat (Bsoft)


newstack (IMOD)

  • stack operations:
    • copy (when using infile outfile, as is general in IMOD)
    • extract (when using -secs/-SectionsToRead, or fileinlist)
    • append to
  • change data type (MODE)
  • can rescale densities according to range, mean-and-stdev
  • -replace to overwrite an existing output file rather than create a new one
    e.g. to convert/rescale a file in-place
  • format conversion: can output MRC, TIFF, HDF
  • extended header stuff:
    • deal with some tilt angle details
    • import some .mdoc metadata (from SerialEM)
  • some basic processing
    • binning
    • anti-aliased scaledown
    • fourier-based reduce, shift
    • crop

e.g. create stack from images

newstack frame1.mrc frame2.mrc frame3.mrc stack.mrc
bcat     frame1.mrc frame2.mrc frame3.mrc stack.mrc



clip (IMOD)

More for content processing, e.g.

  • resize
  • normalize, unpack
  • fft, spectrum
  • stats, variance, standev
  • add, average, median, logarithm, truncate, shadow, sqroot, rotx, threshold
  • divide, multiply, subtract
  • filter, prewitt, graham
  • info
  • histogram
  • correlation
  • sharpen, contrast, color, joinrgb, splitrgb, edgefill, diffusion, smooth
  • laplacian
  • gradient
  • quadrant

Some of these have both 2D and 3D variants, meaning -2d and -3d do different things.


squeezevol (IMOD)



File type conversion

Management and review tools

ISPyB

A Laboratory Information Management System from Diamond, aimed at synchrotrons.

Includes

  • Tracking shipments, sample storage
  • Acquisition monitoring
  • Review past collections
  • Reports
  • Remote access
  • iPhone/iPad app similar to the basic web interface


See also:


Focus

Puts initial image processing close to collection to allow remote monitoring; aimed at single particles, two-dimensional crystals, and electron tomography.


See also:

Sphire

http://sphire.mpg.de/index.html#gallery

Semi-sorted

https://en.wikibooks.org/wiki/Software_Tools_For_Molecular_Microscopy

EPU session notes

Scipion notes

Scipion and GPU / CUDA

More on protocols

Protocol: a processing task that involves the execution of one or more steps that form a sensible whole

  • can execute Python code and/or call external programs
  • should handle conversions to/from scipion objects, in terms of metadata, arguments, input, output


Declared input parameters are shown in the GUI:

  • StringParam, FloatParam, IntParam, BooleanParam, EnumParam
  • PointerParam: points at objects from the database (verify)
  • RelationParam: points at selected relations instead of objects (mainly used for CTF browsing) (verify)
  • ProtocolClassParam: selects protocol classes (used for Workflows, under development)
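As a simplified, hypothetical stand-in for that pattern (the real API is pyworkflow's Protocol/Form classes; everything below is illustrative, not actual Scipion code): a protocol declares typed parameters, and the GUI form is generated from that declaration.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in types; Scipion's real ones live in pyworkflow
@dataclass
class Param:
    name: str
    ptype: str            # e.g. 'FloatParam', 'IntParam', 'BooleanParam'
    default: object = None
    label: str = ''

@dataclass
class Form:
    params: list = field(default_factory=list)

    def addParam(self, name, ptype, default=None, label=''):
        self.params.append(Param(name, ptype, default, label))

# A protocol-like declaration: each declared param becomes a GUI field
form = Form()
form.addParam('threshold', 'FloatParam', default=0.5, label='Picking threshold')
form.addParam('useGpu', 'BooleanParam', default=True, label='Use GPU')
print([p.name for p in form.params])   # → ['threshold', 'useGpu']
```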



http://scipion.cnb.csic.es/old-docs/bin/view/TWiki/HowToDevelopProtocols



https://github.com/I2PC/scipion/wiki/Integrated-Protocols

EMAN2

e2boxer

e2boxercache: http://blake.bcm.edu/emanwiki/Eman2AppMetadata

box.type:

  • SwarmBoxer.REF_NAME = 'swarm_ref' (black)
  • SwarmBoxer.WEAK_REF_NAME = 'swarm_weak_ref' (blue)
  • SwarmBoxer.AUTO_NAME = 'swarm_auto' (green)
  • manual: white

PyMol notes

Selection of atoms is pretty nice

  • For the syntax and algebra, see things like [15], [16]


If you want a detailed mesh around a specific piece of chain, you often want to

  • [17] for a resampled mesh
    You usually want to do a map_trim around a useful selection (e.g. a chain) first,
    because resampling increases memory use eightfold (3D, so 2*2*2), and if the reduction is large, trimming first saves space and time

You may also wish to play with the mesh_width setting.