Neural net software notes

From Helpful
Revision as of 21:34, 23 May 2020 by Helpful (Talk | contribs)


Broad comparison

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)


Almost everything is a little annoying to install the first time

Almost everything has an initially steep learning curve

Pretty much all of them run on Linux, Windows, and macOS


Theano

  • probably the first major framework out there - now no longer developed
  • if looking to experiment, probably look at Keras instead
  • GPU: CUDA
  • CPU: yes

Tensorflow

  • creator/backer: Google
  • currently seems to have the largest community
  • moderate-level API, flexible but verbose
  • varying frontends
  • GPU: CUDA
  • CPU: yes
  • multi-node: yes

Keras

  • high-level API, usable as API/frontend to TensorFlow, Theano, CNTK
  • easy and clear to start with - seems to be popular for introduction
  • not always as tweakable
  • GPU: yes (see backends; CUDA)
  • CPU: yes
  • creator/backer: Google
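As a sketch of that high-level style (assuming tensorflow.keras is installed; the layer sizes and random data here are arbitrary, just to show the API shape):

```python
# Minimal Keras sketch: a tiny regression model on random data.
# Layer sizes and data shapes are arbitrary illustration, not a recipe.
import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

x = np.random.rand(32, 4).astype('float32')   # 32 samples, 4 features
y = np.random.rand(32, 1).astype('float32')   # 32 targets
model.fit(x, y, epochs=1, verbose=0)          # model is built on first fit

pred = model.predict(x, verbose=0)            # shape (32, 1)
```

Note how little of this is framework plumbing - which is much of why it's popular for introductions, and also why it's not always as tweakable.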


Caffe

  • lowish-level API, more work
  • community focused on computer vision
  • many forks, a bit confusing
  • can be annoying to compile/install
  • GPU: yes, but a little work
  • CPU: yes
  • multi-node: yes (MPI)
  • C++
  • creator/backer: Berkeley

Caffe2

  • sort of a cleaned up and extended version of Caffe?(verify)
  • creator/backer: Facebook

Torch

  • core in C, with a Lua wrapper
  • CPU: yes(verify)

PyTorch wraps the Torch libraries in a Python wrapper (and is also something of a successor(verify))

  • Python makes it easier to integrate with existing code than Lua does
  • Caffe2 is merging into PyTorch


darknet

  • GPU: CUDA
  • CPU: yes


CNTK ('Cognitive Toolkit')

  • creator/backer: Microsoft
  • GPU: CUDA
  • CPU: yes

MXNet

  • creator/backer: Microsoft/Amazon
  • language/wrapper: R, C, Python, JavaScript
  • GPU: CUDA
  • CPU: yes

Chainer

  • creator/backer: Preferred Networks (with support from IBM, Intel, Nvidia, Amazon)
  • python interface
  • easy and fast, apparently
  • GPU: CUDA
  • CPU: yes
  • smaller community

Deeplearning4j

  • Java, so fits existing Java projects well
  • smaller community, though
  • GPU: yes
  • CPU: yes
  • scaling: yes


Static versus dynamic graphs:

Static means you have to define the computation graph before you run it, which is perfectly good for pre-set jobs, and can be faster because the whole graph is known up front and can be optimized as a unit.
Dynamic means the graph can change during execution, which is more powerful for e.g. unstructured data and RNNs - though not necessary for e.g. most image / CNN work.
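The contrast can be sketched in plain Python, with no framework - all names here are invented for illustration:

```python
import operator

# Static style: describe the computation first (as a graph of deferred ops),
# then execute it later against actual values.
class Node:
    """A deferred operation; inputs are other Nodes, variable names, or constants."""
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs
    def run(self, feed):
        vals = [i.run(feed) if isinstance(i, Node) else feed.get(i, i)
                for i in self.inputs]
        return self.op(*vals)

graph = Node(operator.mul, Node(operator.add, 'x', 'y'), 2)   # (x + y) * 2
print(graph.run({'x': 3, 'y': 4}))    # 14

# Dynamic style: the "graph" is just whatever code actually executes,
# so its structure can depend on the data itself.
def dynamic(x, y):
    out = x + y
    while out < 10:     # data-dependent structure - awkward in a static graph
        out = out * 2
    return out

print(dynamic(3, 4))    # 14
```

The while loop in the dynamic version is the point: its number of iterations depends on the input values, which a define-before-run graph can only express with special control-flow ops.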


There are also a number of adapters between frameworks, see e.g. https://github.com/ysh329/deep-learning-model-convertor


TensorFlow notes


TensorFlow Lite[1] is meant for phone/embedded use, and is basically just the model-evaluation part; at least currently it lags behind in some features.


Google's TPUs are tensor processors in an ASIC.

Used internally at first, it is now also a product in the form of a USB-connected compute stick (look for Coral Edge TPU), which can run already-trained TensorFlow Lite models. It seems aimed at adding inference to something relatively minimal, like a Raspberry Pi.


Installation