Softmax
Revision as of 14:49, 13 February 2024
✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.
softmax (sometimes called softargmax, normalized exponential function, and other things)
- takes a vector of numbers (any scale)
- returns a same-length vector of probabilities
- all in 0 .. 1
- that sum to 1.0
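For completeness, the usual definition behind those bullets (the standard formula, not stated in the note above):

\[ \mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}} \]

The exponentials are all positive, so each output is in 0..1, and dividing by their sum makes the outputs sum to 1.0.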
Note that it is not just normalization.
Nor is it only a way to bring out the strongest answer.
The exponentiation in its internals, combined with the "sums to 1.0" constraint, means values shift around non-linearly, so even relative probabilities that already lie in 0..1 and sum to 1.0 will change, e.g.
- softmax([1.0, 0.5, 0.1]) ~= 0.50, 0.30, 0.20
- softmax([0.5, 0.3, 0.2]) ~= 0.39, 0.32, 0.29
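A minimal sketch that reproduces the two examples above (standard softmax, plain Python; the max-subtraction is a common numerical-stability trick, not something the note mentions):

```python
import math

def softmax(xs):
    # subtract the max before exponentiating; this doesn't change the
    # result (it cancels in the division) but avoids overflow for large inputs
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print([round(p, 2) for p in softmax([1.0, 0.5, 0.1])])  # → [0.5, 0.3, 0.2]
print([round(p, 2) for p in softmax([0.5, 0.3, 0.2])])  # → [0.39, 0.32, 0.29]
```

Running this shows the non-linear shift: an input that already looks like a probability distribution comes out changed.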