Various transfer functions
Quick and dirty
While coding, I regularly want a quick weighing or distortion of data by roughly the right amount - without spending hours on what it really should be in theory.
on fractions in (0,1)
roots and powers - starting from a straight line, we squeeze 0..1 toward one side or the other
Weighing low/high values (hard)
Weighing center values (hard)
Weighing center values (softer)
- sqrt(1-abs(1-2x))
- sin(pi*x)
- sin(pi*x) raised to a power (lower or higher than 1)
- many others
Smoothstep (~= sigmoids in 0..1)
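As a quick sketch, here is what these could look like in Python (the function names are mine, just for illustration):

    import math

    def root_or_power(x, p=0.5):
        # p < 1 squeezes 0..1 toward the high side, p > 1 toward the low side
        return x ** p

    def weigh_center_soft(x):
        # 1.0 at x=0.5, falling to 0 at both ends
        return math.sqrt(1 - abs(1 - 2*x))

    def weigh_center_sine(x, p=1.0):
        # similar shape; p > 1 sharpens the peak, p < 1 flattens it
        return math.sin(math.pi * x) ** p

    def smoothstep(x):
        # s-shaped in 0..1, flat near both ends
        return 3*x**2 - 2*x**3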
on things in (0..large number)
(these are somewhat )
- sqrt(x) - or sqrt(x/turning_point), scaling the input so that the output is 1.0 at the turning point (see the sketch after this list)
- where the turning point is some interesting threshold - if one makes sense in the given scale; perhaps based on an average if the scale is somewhat arbitrary
- logs (vaguely like sqrt, but larger values have even less effect):
- since log(0) = -inf, and log is negative for inputs up to 1, we want to add something to the input:
- log(base+x), e.g. log_e(e+x) or log_10(10+x)
- at x=0 the output is 1, with a slow logarithmic curve above that (the steepest initial bit is hidden in x<0; you could play with the offsets to get more of it)
- log(base+x) - 1
- the same curve as the previous one, but at x=0 the output is 0
- you could also rescale it
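A minimal sketch of both in Python (turning_point is just a parameter you pick, and the log version is the shifted variant that starts at 0):

    import math

    def sqrt_scaled(x, turning_point):
        # 0 at x=0, exactly 1.0 at the turning point, growing slowly beyond it
        return math.sqrt(x / turning_point)

    def log_scaled(x, base=math.e):
        # 0 at x=0, then a slow logarithmic curve; a larger base flattens it further
        return math.log(base + x, base) - 1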
On things around 0 (mostly but not necessarily in -1..1)
Sigmoids are a wide family, anything S-shaped, including:
Logistic function
tanh (the hyperbolic tangent)
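These two relate directly - tanh is just a scaled and shifted logistic curve:

    import math

    def logistic(x):
        # basic logistic sigmoid, output in (0, 1)
        return 1 / (1 + math.exp(-x))

    def tanh_via_logistic(x):
        # tanh(x) == 2*logistic(2*x) - 1, output in (-1, 1)
        return 2 * logistic(2 * x) - 1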
Some notes
Logistic function
The logistic function, a.k.a. logistic curve, is defined by:
f(x) = L / (1 + e^(-k*(x - x0)))
where
- x0 is the x position of the midpoint
- L is the magnitude (often 1)
- k is the logistic growth rate, i.e. the steepness of the curve
(in specific applications these may have more intuitively meaningful names)
Note that in various uses it may simplify, e.g. in neural net use often to 1/(1+e^(-x)),
and is then often called a sigmoid, apparently both to point at the shape and to signal that it's not quite a full logistic function.
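As code, the general form (parameter names as above):

    import math

    def logistic(x, L=1.0, k=1.0, x0=0.0):
        # magnitude L, steepness k, midpoint x0;
        # the defaults give the plain neural-net sigmoid, 1/(1+e^(-x))
        return L / (1 + math.exp(-k * (x - x0)))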
Logit
In math, logit is a function that maps probabilities in 0..1 to real numbers in -inf..inf.
Around statistics and probability it is sometimes called log-odds, because it is equivalent to log(p/(1-p)), where p/(1-p) is the odds in probability.
It is a transfer function largely useful in statistics and machine learning.
In machine learning, it's used
- sometimes just as this general idea,
- sometimes to refer to the inverse function of a (any) logistic sigmoid function (verify)
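A minimal sketch of that inverse relationship:

    import math

    def logit(p):
        # log-odds: maps probabilities in (0,1) to (-inf, inf)
        return math.log(p / (1 - p))

    def sigmoid(x):
        # its inverse, mapping (-inf, inf) back to (0,1)
        return 1 / (1 + math.exp(-x))

    # round-trips: sigmoid(logit(0.9)) ~= 0.9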
Around neural networks (and in particular in TensorFlow),
- 'logit' is also used to refer to a layer / tensor that holds raw prediction outputs (probably from a much denser input) before passing them to a normalization-style function like softmax.
- 'logits' is sometimes used to refer to the numbers that that layer spits out (verify)
Without the detailed history of why this name was adopted for this specific use, this is actually a confusing abuse of the original term.
https://en.wikipedia.org/wiki/Logit
On sigmoid functions
Sigmoid refers to a class of curves that are S-shaped.
In various contexts (e.g. activation functions in modeled neurons, population growth modeling) it often refers to the logistic function,
or sometimes tanh, but can be various others.