Linear algebra library notes



BLAS (Basic Linear Algebra Subprograms)

a specification for numerical linear algebra - matrix and vector stuff
abstract, so allows optimization on specific platforms
...and BLAS-using code doesn't have to care which implementation ends up doing the calculations
(origins lie in a Fortran library)
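To make the "BLAS-using code doesn't care which implementation" point concrete, here's a minimal sketch of calling a BLAS level-3 routine (dgemm, general matrix-matrix multiply) through SciPy's wrappers. SciPy dispatches this to whatever BLAS it was built against (OpenBLAS, MKL, ...), so the calling code stays the same either way. Assumes numpy and scipy are installed.

```python
# Sketch: one BLAS call via SciPy; the actual work is done by whichever
# BLAS implementation SciPy was linked against.
import numpy as np
from scipy.linalg import blas

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# dgemm computes alpha * A @ B (optionally + beta * C); alpha=1.0 gives plain A @ B
C = blas.dgemm(alpha=1.0, a=A, b=B)
print(C)
```

The same two-letter naming (d = double precision, gemm = general matrix multiply) shows up in every BLAS implementation, which is the spec doing its job.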

There are quite a few implementations. Some of the better known ones:

  • ATLAS, Automatically Tuned Linear Algebra Software
'AT' for 'Automatically Tuned' - basically compiles many variants and finds which is fastest for the host it's compiling on
so it doesn't make as much sense as a binary package
does make it portable - a works-everywhere, reasonably-fast-everywhere implementation
apparently has improved recently, now closer to OpenBLAS/MKL
  • OpenBLAS
specifically tuned for a set of modern processors
(note: also covers the most common LAPACK calls(verify))
also quite portable -- sometimes easier to deal with than ATLAS
  • GotoBLAS
specific to ~2002-2008 processors. Very good at the time; since merged into OpenBLAS(verify)

  • MKL, Intel Math Kernel Library
(covers BLAS, LAPACK, and some other things that are sometimes very convenient to have)
known to be quite fast for the Intel processors it is tuned for (best of this list on some operations)
  • ACML, AMD Core Math Library
Comparable to MKL, but for AMD
and free
(apparently does not scale up to multicore as well as MKL?)

  • Apple Accelerate Framework
includes BLAS, LAPACK, and various other things
easy choice when programming only for Apple platforms, because it's already installed (verify)


LAPACK (Linear Algebra PACKage)

Functionally: in comparison with BLAS, LAPACK solves some more complex, higher-level problems - things like factorizations, linear solves, and eigenproblems.

Also, LAPACK is a specific implementation, not an abstract spec.

It contains some rather clever algorithms (in some cases the only open-source implementation of said algorithm).

In other words, for some applications you are happy with just BLAS; for others you want to add LAPACK (or similar).
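As an example of the "higher-level problems" LAPACK adds on top of BLAS, here's a sketch of solving a linear system Ax = b with LAPACK's dgesv (LU factorization plus solve), again via SciPy's wrappers. Assumes numpy and scipy are installed.

```python
# Sketch: a LAPACK-level operation - solving Ax = b - which is beyond
# what BLAS itself provides.
import numpy as np
from scipy.linalg import lapack

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([[9.0],
              [8.0]])

# dgesv returns the LU factors, pivot indices, the solution x, and a status code
lu, piv, x, info = lapack.dgesv(A, b)  # info == 0 means success
print(x)
```

Internally dgesv itself leans on BLAS routines for the heavy lifting, which is why a fast BLAS also speeds up LAPACK.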

Speed-wise: LAPACK is a modern, cache-aware rewrite of its predecessors (it replaces LINPACK and EISPACK), so is typically faster than them.


FFTPACK

Pragmatically: a slightly-slower and slightly-more-portable alternative to FFTW and others.

As far as I can tell FFTPACK does not use AVX, which means that in some conditions (mostly larger transforms), FFTW (≥3.3), MKL, and such can be more than a little faster.
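For reference, a minimal sketch of what an FFTPACK-style call looks like, using SciPy's legacy scipy.fftpack interface (which is FFTPACK-derived). The point is just the shape of the API; for large transforms, FFTW- or MKL-backed libraries are often the faster choice, as noted above. Assumes numpy and scipy are installed.

```python
# Sketch: forward and inverse transform through SciPy's FFTPACK-based
# legacy interface (scipy.fft is the newer replacement).
import numpy as np
from scipy import fftpack

x = np.array([1.0, 2.0, 3.0, 4.0])
X = fftpack.fft(x)        # forward transform (complex output)
x_back = fftpack.ifft(X)  # inverse transform recovers the input
print(np.allclose(x_back, x))
```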
