Various transfer functions



Quick and dirty

While coding I regularly want a quick weighting or distortion of data by roughly the right amount, without spending hours on what it really should be in theory.


On fractions in (0,1)

roots and powers - starting from a straight line, we squeeze 0..1 toward one side or the other (see the sketch below)


(plots: weighting low/high values (hard); weighting center values (hard); weighting center values (softer))

  • many others
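
A minimal sketch of that squeezing in Python (the exponents here are just example choices):

    import numpy as np

    x = np.linspace(0.0, 1.0, 11)

    # powers push fractions toward 0 (x**2 < x on (0,1)); higher exponents squeeze harder
    toward_zero = x ** 2

    # roots push fractions toward 1 (sqrt(x) > x on (0,1)); smaller exponents squeeze harder
    toward_one = x ** 0.5

    for xi, lo, hi in zip(x, toward_zero, toward_one):
        print(f"{xi:.1f}  {lo:.3f}  {hi:.3f}")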


Smoothstep (~= sigmoids in 0..1)

https://en.wikipedia.org/wiki/Smoothstep
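
A sketch of the classic smoothstep polynomial, 3t^2 - 2t^3 after clamping t to 0..1 (edge0/edge1 just rescale the input range):

    def smoothstep(x, edge0=0.0, edge1=1.0):
        # rescale so that edge0..edge1 maps to 0..1, clamp, then apply 3t^2 - 2t^3
        t = (x - edge0) / (edge1 - edge0)
        t = max(0.0, min(1.0, t))
        return t * t * (3.0 - 2.0 * t)

    print([round(smoothstep(v), 3) for v in (0.0, 0.25, 0.5, 0.75, 1.0)])
    # [0.0, 0.156, 0.5, 0.844, 1.0]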

On things in (0..large number)

(these are somewhat )


  • sqrt(x), or sqrt(x/turning_point) if you want to scale it (see the sketch after this list)
    • turning_point: scale the input so that x = turning_point comes out as 1.0
    • where turning_point is some interesting threshold (if one makes sense in the given scale, perhaps based on an average if the scale is somewhat arbitrary)


  • logs (vaguely like sqrt, but larger values have less effect):
    • since log(0) = -inf, and the log is negative for all x < 1, we want to add something to the input:
    • log(base+x) (e.g. log_e(e+x), log_10(10+x))
      • at x=0, output is 1, with slow, logarithmic curve above that (the steepest initial bit is hidden in x<0; you could play with offsets to get more of it)
    • log(base+x)-1
      • same curve as the previous one, but at x=0, output is 0
      • you could rescale it
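
A sketch of both of those; turning_point and the base are illustrative values, use whatever fits your data:

    import math

    def sqrt_compress(x, turning_point=100.0):
        # scale the input so that x == turning_point comes out as 1.0;
        # larger values still grow, but slowly
        return math.sqrt(x / turning_point)

    def log_offset(x, base=10.0):
        # log_base(base + x): 1.0 at x == 0, slow logarithmic growth above that
        return math.log(base + x, base)

    def log_offset_zero(x, base=10.0):
        # same curve shifted down, so that x == 0 gives 0.0
        return log_offset(x, base) - 1.0

    for v in (0, 10, 100, 1000, 10000):
        print(v, round(sqrt_compress(v), 2),
              round(log_offset(v), 2), round(log_offset_zero(v), 2))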

On things around 0 (mostly but not necessarily in -1..1)

Sigmoids are a wide family: anything S-shaped.


Logistic function


tanh (the hyperbolic tangent)
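
As a quick comparison of those two (a sketch): tanh squashes to (-1,1) and is roughly linear near 0, while the logistic curve squashes to (0,1) with its midpoint at 0:

    import math

    for v in (-5, -1, 0, 1, 5):
        tanh_v = math.tanh(v)
        logistic_v = 1.0 / (1.0 + math.exp(-v))
        print(v, round(tanh_v, 3), round(logistic_v, 3))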

Some notes

Logistic function

The logistic function, a.k.a. logistic curve, is defined by:

f(x) = L / ( 1 + e^( -k*(x-x0) ) )

where

x0 = the x position of the midpoint
L = the magnitude (often 1)
k = the logistic growth rate, i.e. the steepness of the curve

(in specific applications these may have more intuitively meaningful names)


Note that in various uses it may simplify, e.g. in neural net use often to 1/(1 + e^(-x)), and is then often called a sigmoid, apparently both to point at the shape and at the fact that it's not quite the full logistic function.
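
A sketch of the general form with those parameters, next to the simplified variant:

    import math

    def logistic(x, L=1.0, k=1.0, x0=0.0):
        # general logistic curve: midpoint at x0, maximum L, steepness k
        return L / (1.0 + math.exp(-k * (x - x0)))

    def sigmoid(x):
        # the simplified neural-net form, 1 / (1 + e^-x)
        return 1.0 / (1.0 + math.exp(-x))

    print(logistic(0.0), sigmoid(0.0))            # both 0.5, the midpoint
    print(logistic(7.0, L=10.0, k=0.5, x0=5.0))   # a stretched, shifted variant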


Logit

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

In math, logit is a function that maps probabilities in 0..1 to real numbers in -inf..inf.

💤 Around statistics and probability it is sometimes called log-odds, because it is equivalent to log(p/(1-p)), where p/(1-p) is the odds corresponding to probability p.


It is a transfer function largely useful in statistics and machine learning.
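
A sketch of the log-odds form, plus a check that it inverts the basic logistic sigmoid:

    import math

    def logit(p):
        # log-odds: maps probabilities in (0, 1) to (-inf, inf)
        return math.log(p / (1.0 - p))

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    for p in (0.1, 0.5, 0.9):
        print(p, round(logit(p), 3), round(sigmoid(logit(p)), 3))   # last column recovers p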


In machine learning, it's
  • sometimes used just as a general idea,
  • sometimes used to refer to the inverse function of a (any) logistic sigmoid function (verify)


Around neural networks (and in particular in TensorFlow)

  • 'logit' is also used to refer to a layer / tensor that is a raw prediction output (probably from a much denser input) before passing it to a normalization-style function like softmax.
  • 'logits' is sometimes used to refer to the numbers that that layer spits out (verify)

Without the detailed history of why this name was adopted for this specific use, this is actually a confusing abuse of the original term.

https://stackoverflow.com/questions/41455101/what-is-the-meaning-of-the-word-logits-in-tensorflow/52111173#52111173
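
For illustration, a minimal sketch of that use: made-up raw scores ('logits') from a final layer, normalized into probabilities with softmax:

    import numpy as np

    logits = np.array([2.0, 1.0, 0.1])    # hypothetical raw outputs of a final layer

    # softmax: exponentiate and normalize (subtracting the max for numerical stability)
    e = np.exp(logits - logits.max())
    probabilities = e / e.sum()

    print(probabilities, probabilities.sum())   # the probabilities sum to 1.0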


https://en.wikipedia.org/wiki/Logit

On sigmoid functions

Sigmoid refers to a class of curves that are S-shaped.


In various contexts (e.g. activation functions in modeled neurons, population growth modeling) it often refers to the logistic function, or sometimes tanh, but it can also be various other functions.


On activation functions