Neural net software notes
Broad comparison
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
Almost everything is a little annoying to install the first time
Almost everything has an initially steep learning curve
Pretty much all of them run on Linux, Windows, and macOS
Theano
- probably the first out there - now no longer developed
- if looking to experiment, probably look at keras instead
- GPU: CUDA
- CPU: yes
Tensorflow
- creator/backer: Google
- currently seems to have the largest community
- moderate-level API, flexible but verbose
- varying frontends
- GPU: CUDA
- CPU: yes
- multi-node: yes
Keras
- high-level API, usable as a frontend to TensorFlow, Theano, or CNTK
- easy and clear to start with - seems popular for introductions
- not always as tweakable
- GPU: yes (see backends; CUDA)
- CPU: yes
- creator/backer: Google
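A minimal sketch of what that high-level API looks like (assuming a recent TensorFlow install, which bundles Keras; the layer sizes and input shape here are arbitrary illustration, not anything specific):

```python
# Minimal Keras sketch: a small feed-forward binary classifier.
# Assumes tensorflow (which bundles Keras) is installed;
# layer sizes / input shape are arbitrary.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(10,)),                    # ten input features
    keras.layers.Dense(32, activation="relu"),   # hidden layer
    keras.layers.Dense(1, activation="sigmoid"), # one output probability
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# model.fit(x_train, y_train, epochs=5)  # training would go here
```

The point is mostly that you describe layers declaratively and let the backend deal with the graph details - which is also why it's "not always as tweakable".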
Caffe
- lowish-level API, more work
- community focused on computer vision
- many forks, a bit confusing
- can be annoying to compile/install
- GPU: yes, but a little work
- CPU: yes
- multi-node: yes (MPI)
- C++
- creator/backer: Berkeley
Caffe2
- sort of a cleaned-up and extended version of Caffe?(verify)
- creator/backer: Facebook
Torch
- core is C, with a Lua wrapper
- CPU: yes(verify)
PyTorch wraps the Torch backend in a Python wrapper (and is also something of a successor(verify))
- Python is easier to engage with existing code than Lua
- Caffe2 has been merged into PyTorch
darknet
- GPU: CUDA
- CPU: yes
CNTK ('Cognitive Toolkit')
- creator/backer: Microsoft
- GPU: CUDA
- CPU: yes
MXNet
- creator/backer: Apache project, backed mainly by Amazon
- language/wrapper: Python, R, C++, Scala, JavaScript, and others
- GPU: CUDA
- CPU: yes
Chainer
- creator/backer: Preferred Networks (with support from IBM, Intel, Nvidia, and others)
- python interface
- easy and fast, apparently
- GPU: CUDA
- CPU: yes
- smaller community
Deeplearning4j
- Java, so fits existing java projects well
- smaller community, though
- GPU: yes
- CPU: yes
- multi-node: yes (e.g. on Spark)
Static versus dynamic graphs:
- Static means you have to define the graph before you run it. That is perfectly good for pre-set jobs, and lets the framework optimize the whole graph up front, so it is sometimes faster.
- Dynamic means the graph can change during execution, which is more flexible for e.g. variable-length/unstructured data and RNNs - while it's not necessary for e.g. most fixed-shape image / CNN work.
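The distinction can be sketched in plain Python (a toy illustration, not any particular framework's API - all names here are made up): the static style builds a description of the computation first and executes it later, while the dynamic style just computes as ordinary code runs, so normal control flow can shape the computation per input:

```python
# Toy illustration of static vs. dynamic graphs (no framework; hypothetical names).

# "Static": describe the computation as data first, run it later.
# The recipe is fixed before any input is seen: y = 2*x + 1
graph = [("mul", 2), ("add", 1)]

def run_graph(graph, x):
    """Interpret a pre-built list of operations - the graph cannot
    branch on the data it is fed."""
    val = x
    for op, arg in graph:
        if op == "mul":
            val = val * arg
        elif op == "add":
            val = val + arg
    return val

# "Dynamic": the computation is just code, so it can branch on the data -
# the sort of thing that is awkward to express in a fixed static graph.
def dynamic(x):
    val = x * 2
    if val > 5:          # data-dependent control flow
        val = val + 1
    return val

print(run_graph(graph, 4))  # 9  (always 2*x + 1)
print(dynamic(4))           # 9  (took the branch)
print(dynamic(1))           # 2  (did not)
```

Real static frameworks gain from this upfront description (whole-graph optimization, easy serialization), at the cost of exactly this kind of inflexibility.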
There are also a number of adapters between frameworks, see e.g.
https://github.com/ysh329/deep-learning-model-convertor
TensorFlow notes
This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
TensorFlow Lite[1] is meant for phone/embedded use, and is basically just the model evaluation (inference) part; at least currently it lags behind in some features.
Google's TPU is a tensor processor implemented as an ASIC.
Used internally at first, it is now also a product in the form of a USB-connected compute stick (look for Coral Edge), which can run already-trained TensorFlow Lite models. It seems aimed at adding inference to something relatively minimal, like a Raspberry Pi.