Softmax

<!--
softmax (a.k.a. softargmax, normalized exponential function)

* takes a vector of numbers
* provides a vector of probabilities
:: all in 0..1
:: and sum to 1.0
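
For concreteness, the usual definition is softmax(z)_i = exp(z_i) / sum_j exp(z_j). A minimal sketch in plain Python (subtracting the max is only the standard numerical-stability trick; it doesn't change the result):

<syntaxhighlight lang="python">
import math

def softmax(values):
    # shifting every input by a constant (here the max) does not change the output,
    # but keeps exp() from overflowing on large inputs
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))   # roughly [0.66, 0.24, 0.10] - all in 0..1, summing to 1.0
</syntaxhighlight>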




The point is often to take values on any scale and present them on a 0..1 scale.
Many references you'll find ''now'' are about its use in neural nets, where it takes activations of any scale and puts them onto a 0..1 scale sensibly,
in what is often a final layer (of a functional block, or of the net overall).


Crudely speaking, it is a normalization, but a sigmoid-style one, because of its use of exponents.


While the exponent makes it look like some choices of sigmoid function, it isn't directly comparable to transfer functions, and you can't get an easy graph of it, exactly ''because'' it takes multiple inputs.
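
There is a close relation, though: for just two inputs, the first softmax output reduces to the logistic sigmoid of their difference, since e^a / (e^a + e^b) = 1 / (1 + e^(b-a)). A quick check of that identity:

<syntaxhighlight lang="python">
import math

a, b = 2.0, 0.5
softmax_first   = math.exp(a) / (math.exp(a) + math.exp(b))   # first output of softmax([a, b])
sigmoid_of_diff = 1.0 / (1.0 + math.exp(-(a - b)))             # logistic sigmoid of a - b
print(softmax_first, sigmoid_of_diff)   # both ~0.8176
</syntaxhighlight>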




Softmax is often presented as an activation function, but it's also a more general mathematical tool, even if it's mostly seen in machine learning.
 
 
 
 
It is ''not'' just normalization.
 
Nor is it just a way to bring out the strongest answer.
Both its exponent internals and the "will sum to 1.0" part mean things shift around, even if you feed it probabilities already in 0..1, e.g. softmax([1.0, 0.5, 0.1]) ~= [0.5, 0.3, 0.2], and even if they already sum to 1.0, e.g. softmax([0.5, 0.3, 0.2]) ~= [0.39, 0.32, 0.29].
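
Those figures are easy to check directly (values rounded to two decimals):

<syntaxhighlight lang="python">
import math

for vec in ([1.0, 0.5, 0.1], [0.5, 0.3, 0.2]):
    exps = [math.exp(v) for v in vec]
    print([round(e / sum(exps), 2) for e in exps])
# [0.5, 0.3, 0.2]
# [0.39, 0.32, 0.29]
</syntaxhighlight>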
 
 
When using nets as multiclass classifiers, you need something like softmax to be able to respond over all the labels at once, and in a way that looks like probabilities.
In part it's just a choice of what you want to show (you could output classification margin scores instead),
in part it's a choice that
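
As a toy illustration of that classifier use (the label names and scores here are made up), taking raw per-label scores to something that reads as probabilities:

<syntaxhighlight lang="python">
import math

labels = ["cat", "dog", "bird"]    # hypothetical labels
scores = [2.3, 0.7, -1.1]          # hypothetical raw per-label scores from a net's last layer

exps = [math.exp(s) for s in scores]
probs = [e / sum(exps) for e in exps]
for label, p in zip(labels, probs):
    print(f"{label}: {p:.2f}")     # cat: 0.81, dog: 0.16, bird: 0.03
</syntaxhighlight>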






https://en.wikipedia.org/wiki/Softmax_function
https://datascience.stackexchange.com/questions/57005/why-there-is-no-exact-picture-of-softmax-activation-function


-->