Softmax


Revision as of 14:49, 13 February 2024

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

softmax (sometimes called softargmax, normalized exponential function, and other things)

  • takes a vector of numbers (any scale)
  • returns a same-length vector of probabilities
      all in 0 .. 1
      that sum to 1.0
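The description above can be sketched in a few lines of plain Python (an illustrative implementation, not any particular library's; the max-subtraction is a common trick to avoid overflow in exp() and does not change the result):

```python
import math

def softmax(xs):
    # subtract the max for numerical stability; exp() of a large
    # number would overflow, and shifting all inputs by a constant
    # leaves the output unchanged
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
# probs has the same length as the input, each value is in 0..1,
# and the values sum to 1.0
```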


Note that it is not just normalization.

Nor is it only a way to bring out the strongest answer.

The exponent in its internals, plus the "will sum to 1.0" part, means things shift around in a non-linear way, so even inputs that are already relative probabilities (in 0..1 and summing to 1.0) will change, e.g.

softmax([1.0, 0.5, 0.1]) ~= 0.50, 0.30, 0.20
softmax([0.5, 0.3, 0.2]) ~= 0.39, 0.32, 0.29
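Those numbers, and the contrast with plain normalization, are easy to check; a self-contained sketch (re-defining softmax here so the snippet runs on its own):

```python
import math

def softmax(xs):
    m = max(xs)  # for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

first = softmax([1.0, 0.5, 0.1])
# first ~ [0.497, 0.301, 0.202]

second = softmax([0.5, 0.3, 0.2])
# second ~ [0.391, 0.320, 0.289]
# the input already sums to 1.0, yet the output differs

plain = [x / sum([1.0, 0.5, 0.1]) for x in [1.0, 0.5, 0.1]]
# plain normalization gives [0.625, 0.3125, 0.0625] -- not the
# same as softmax, illustrating that softmax is not just dividing
# by the sum
```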



https://en.wikipedia.org/wiki/Softmax_function