Pointwise Mutual Information

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


Pointwise Mutual Information (PMI) quantifies how strongly two events coincide, by comparing their joint probability with the probability they would co-occur if they were independent (the product of their individual probabilities).

A little more concretely, for word co-occurrence, PMI is the log of the probability of the two words appearing together, divided by the product of the probabilities of each appearing at all:

PMI(word1, word2) = log( P(word1, word2) / ( P(word1) * P(word2) ) )
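A minimal sketch of estimating that formula from a token list — here unigram probabilities come from token counts and the joint probability from adjacent-bigram counts, which is one common (assumed, not the only) choice of co-occurrence window:

```python
import math
from collections import Counter

def pmi(word1, word2, tokens):
    """PMI of word2 directly following word1, estimated from a token list.

    Illustrative sketch: P(word) from unigram counts,
    P(word1, word2) from adjacent-bigram counts.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    p1 = unigrams[word1] / len(tokens)
    p2 = unigrams[word2] / len(tokens)
    p12 = bigrams[(word1, word2)] / (len(tokens) - 1)
    return math.log(p12 / (p1 * p2))

tokens = "new york is a big city and new york is busy".split()
print(pmi("new", "york", tokens))  # positive: co-occur more than chance
```

Words that appear together more often than independence would predict get a positive PMI; words that avoid each other get a negative one (and a zero bigram count makes the log blow up, which is why smoothing or thresholds are common in practice).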


Notes:

  • the base of the log isn't important, since PMI is a unitless quantity typically used only for relative ranking
  • 'Pointwise' mostly just points out that we are looking at one specific pair of outcomes at a time.
Mutual Information without the 'pointwise' is defined on whole variables/distributions: it is the expected value of PMI over all possible events,
MI(X;Y) = Σ P(x,y) * PMI(x,y)
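That relationship between the two can be shown directly — below, MI is computed as the probability-weighted average of the per-event PMI values, using a small hypothetical joint distribution over two binary variables:

```python
import math

# Hypothetical joint distribution over two binary variables X and Y
# (chosen for illustration; marginals work out to 0.5 each).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {0: 0.5, 1: 0.5}
py = {0: 0.5, 1: 0.5}

def pmi(x, y):
    # PMI of one specific pair of outcomes
    return math.log(joint[(x, y)] / (px[x] * py[y]))

# MI is the expectation of PMI under the joint distribution
mi = sum(p * pmi(x, y) for (x, y), p in joint.items())
print(mi)  # positive, since X and Y are correlated here
```

Individual PMI values can be negative (for pairs rarer than chance), but their expectation, the MI, is always non-negative.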



See also:

  • W. B. Croft, D. Metzler, T. Strohman (2010), "Search Engines: Information Retrieval in Practice"
  • F Role, M Nadif (2011), "Handling the Impact of Low Frequency Events on Co-occurrence based Measures of Word Similarity - A Case Study of Pointwise Mutual Information"