Various transfer functions
Latest revision as of 23:13, 21 April 2024
Quick and dirty
While coding I regularly want a quick weighting or distortion of data that is roughly the right shape - without spending hours on what it really should be in theory.
On fractions in (0,1)
Roots and powers - starting from a straight line, we squeeze 0..1 towards one side or the other
Weighting low/high values (hard)
Weighting center values (hard)
Weighting center values (softer)
- sqrt(1-abs(1-2x))
- sin(pi*x)
- sin(pi*x) raised to a power (lower or higher than 1), e.g. sin(pi*x)**3 or sqrt( sin(pi*x) )
- many others
Smoothstep (~= sigmoids in 0..1)
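The (0,1) shapers listed above can be sketched as plain functions - a minimal sketch in Python; the function names are my own, not established terms:

```python
import math

def weight_high(x, p=2.0):
    # powers > 1 bend the straight line toward 0, weighting high values hard
    return x ** p

def weight_low(x, p=0.5):
    # powers < 1 (roots) bend the line toward 1, weighting low values hard
    return x ** p

def weight_center_hard(x):
    # sqrt(1 - abs(1 - 2x)): peaks sharply at x = 0.5, zero at both edges
    return math.sqrt(1 - abs(1 - 2 * x))

def weight_center_soft(x):
    # sin(pi*x): a softer bump, also peaking at x = 0.5
    return math.sin(math.pi * x)

def smoothstep(x):
    # the classic 3x^2 - 2x^3 smoothstep, roughly a sigmoid confined to 0..1
    return x * x * (3 - 2 * x)
```

All of these take x in (0,1) and return values in [0,1]; raising the center bumps to a power (as in the list above) sharpens or softens them further.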
On things in (0..large number)
(these are somewhat )
- sqrt(x) (...or sqrt(x/turning_point))
- turning_point: scale the input down so that this value maps to 1.0
- where the turning point is an interesting threshold (if one makes sense on the given scale, perhaps based on an average if the scale is somewhat arbitrary)
- logs (vaguely like sqrt, but larger values have less effect):
- since log(0) = -inf and log is negative up to x=1, we want to add something to that input:
- log(base+x) (e.g. log_e(e+x), log_10(10+x))
- at x=0, output is 1, with a slow, logarithmic curve above that (the steepest initial part is hidden at x<0; you could play with offsets to get more of it)
- log(base+x)-1
- same curve as the previous one, but at x=0, output is 0
- you could rescale it
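The (0..large) squashers above, as a minimal sketch; turning_point and the function names are my own, and natural log is the default base:

```python
import math

def sqrt_scaled(x, turning_point=1.0):
    # sqrt of the input, scaled so that x == turning_point maps to 1.0
    return math.sqrt(x / turning_point)

def log_squash(x, base=math.e):
    # log(base + x): outputs 1.0 at x = 0, then climbs slowly/logarithmically
    return math.log(base + x, base)

def log_squash_zero(x, base=math.e):
    # log(base + x) - 1: the same curve, shifted so the output is 0 at x = 0
    return math.log(base + x, base) - 1
```

Compared to sqrt, the log variants flatten out faster, so large inputs matter progressively less.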
On things around 0 (mostly but not necessarily in -1..1)
Sigmoids are a wide family, anything s-shaped
Logistic function
tanh (the hyperbolic tangent)
Some notes
Logistic function
The logistic function, a.k.a. logistic curve, is defined by:
f(x) = L / ( 1 + e^(-k*(x-x0)) )
where
- x0 is the x position of the midpoint
- L = the magnitude (often 1)
- k = logistic growth rate - the steepness of the curve
(in specific applications these may have more intuitively meaningful names)
Note that in various uses it may simplify, e.g. in neural net use often to 1/(1+e^-x),
and is then often called a sigmoid, apparently both to point at the shape and to signal that it's not quite the full logistic function.
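The general form and its common simplification can be written directly - a minimal sketch, with parameter names following the formula above:

```python
import math

def logistic(x, L=1.0, k=1.0, x0=0.0):
    # general logistic curve: L / (1 + e^(-k*(x - x0)))
    # L = magnitude, k = growth rate (steepness), x0 = x position of the midpoint
    return L / (1 + math.exp(-k * (x - x0)))

def sigmoid(x):
    # the common neural-net simplification: L = 1, k = 1, x0 = 0
    return 1 / (1 + math.exp(-x))
```

At x = x0 the output is exactly L/2, and the curve flattens toward 0 and L at the extremes.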
Logit
In math, logit is a function that maps probabilities in 0..1 to real numbers in -inf..inf.
Around statistics and probability it is sometimes called log-odds, because it is equivalent to log(p/(1-p)), where p/(1-p) is the odds in probability.
It is a transfer function largely useful in statistics and machine learning.
In machine learning, the term is used:
- sometimes just as a general idea,
- sometimes to refer to the inverse function of a (any) logistic sigmoid function(verify)
Around neural networks (and in particular in tensorflow)
- 'logit' is also used to refer to a layer / tensor that is a raw prediction output (probably from a much denser input) before passing it to a normalization-style function like softmax.
- 'logits' is sometimes used to refer to the numbers that that layer spits out(verify)
Without the detailed history of why this name was adopted for this specific use, this is actually a confusing abuse of the original term.
https://stackoverflow.com/questions/41455101/what-is-the-meaning-of-the-word-logits-in-tensorflow/52111173#52111173
https://en.wikipedia.org/wiki/Logit
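In its mathematical sense, logit is just the log-odds, and it is the inverse of the standard logistic sigmoid - a minimal sketch:

```python
import math

def logit(p):
    # log-odds: maps a probability in (0,1) to a real number in (-inf, inf)
    return math.log(p / (1 - p))

def sigmoid(x):
    # standard logistic sigmoid, the inverse of logit
    return 1 / (1 + math.exp(-x))
```

Note the symmetry around p = 0.5: logit(0.5) is 0, and logit(1-p) = -logit(p).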
On sigmoid functions
Sigmoid refers to a class of curves that are S-shaped.
In various contexts (e.g. activation functions in modeled neurons, population growth modeling) it often refers to the logistic function,
or sometimes tanh, but can be various others.