*2vec


Consider a 2D plot where you place your music artists on, say, a happy/sad axis and a well known/indie axis. It would show groups, and it would show how similar artists are.


Making a vector of that just means taking the location in that plot, which means you can make computers do that similarity math more easily.
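For example, a minimal sketch of that similarity math, with hand-placed (and entirely made-up) 2D artist positions, using cosine similarity as the comparison:

    import math

    # Hand-placed 2D positions (happy<->sad, well-known<->indie); values are made up.
    artists = {
        "ArtistA": (0.9, 0.8),   # happy, well known
        "ArtistB": (0.8, 0.2),   # happy, indie
        "ArtistC": (-0.7, 0.1),  # sad, indie
    }

    def cosine_similarity(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        return dot / norms

    print(cosine_similarity(artists["ArtistA"], artists["ArtistB"]))  # ~0.89, similar
    print(cosine_similarity(artists["ArtistA"], artists["ArtistC"]))  # ~-0.65, dissimilar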


word2vec and other 2vec things (and many things not called something2vec) are like that, except

  • there are too many artists, so we tend to automatically generate this somehow
  • it turns out you need more than two dimensions to capture everything you want to compare

That automation means that you lose sight of what exactly each axis means, but you may not care, as long as the similarity and grouping keeps working well enough for you to build something with.


word2vec

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

word2vec is one of many ways to put vectors to words.

T. Mikolov et al. (2013), "Efficient Estimation of Word Representations in Vector Space" is probably the paper that kicked off this particular dense-vector idea, and due to those origins, word2vec now typically refers to using either continuous bag-of-words (CBOW) or continuous skip-gram processing for a specific learner.

Word2vec could be intuited as building a classifier that

  • predicts what word appears in the middle of any particular context of other words,

and/or

  • predicts what context appears around a word,

both of which happen to do a decent job of classifying that word. This is one of many approaches that come from the distributional hypothesis idea, and it ends up encoding some degree of semantic relations.
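To make those two framings concrete, here is a sketch of how training pairs could be generated from a token list (window size and the example sentence are arbitrary illustration choices; real implementations add things like subsampling and negative sampling):

    # The two framings as (input, predict) training pairs.
    tokens = ["the", "cat", "sat", "on", "the", "mat"]
    window = 2

    cbow_pairs = []       # (context words, center word): predict the middle word
    skipgram_pairs = []   # (center word, context word): predict the surroundings

    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        cbow_pairs.append((context, center))
        for c in context:
            skipgram_pairs.append((center, c))

    print(cbow_pairs[2])      # (['the', 'cat', 'on', 'the'], 'sat')
    print(skipgram_pairs[:3]) # [('the', 'cat'), ('the', 'sat'), ('cat', 'the')]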


Notes from that paper:

  • It uses skip-grams as a concept/building block. Some people refer to this technique as just 'skip-gram' without the 'continuous', but that may come from not really reading the paper they are copy-pasting the image from.

  • The skip-gram variant seems to be better at less-common words, but is slower to train.

  • (a neural net over a vocabulary implies one-hot input coding, so the input layer is not small, but it turns out to be moderately efficient(verify))
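In practice this is usually done via an existing implementation rather than from scratch; a sketch using gensim's, where the parameter values are arbitrary and the two-sentence corpus is far too small to learn anything meaningful:

    from gensim.models import Word2Vec

    sentences = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog", "sat", "on", "the", "rug"],
    ]

    # sg=0 selects CBOW, sg=1 selects skip-gram
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

    vec = model.wv["cat"]                # the learned vector for 'cat'
    print(model.wv.most_similar("cat"))  # nearest words by cosine similarity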



doc2vec

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


💤 For context:

If word2vec lets you represent words, how about representing larger units like sentences, paragraphs, or entire documents?

You would now feed in things of differing length, so at a mechanical level we now have variable-length input and output. How do we deal with that?

Do you add them together, or average them? That does a basic useful thing, yes, and seems to work better than trying to solve it with n-grams.
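A sketch of that averaging, which collapses any number of word vectors into one fixed-size vector (the tiny 2D vectors here are made up; real ones would come from a trained model):

    import numpy as np

    # Made-up word vectors standing in for a trained model's lookup.
    wv = {
        "the": np.array([0.1, 0.0]),
        "cat": np.array([0.9, 0.3]),
        "sat": np.array([0.2, 0.8]),
    }

    def average_vector(tokens, lookup):
        vectors = [lookup[t] for t in tokens if t in lookup]
        return np.mean(vectors, axis=0)  # fixed-size output, whatever the input length

    print(average_vector(["the", "cat", "sat"], wv))  # -> [0.4, 0.3667]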

But if your goal is encoding the combination of these words, then there are ways to do a little better with not too much extra complexity.



doc2vec can be seen as an adaptation of word2vec that deals with larger units like sentences, paragraphs, or entire documents, roughly by considering more context.

Roughly speaking, feed word2vec into a network again(verify).

Compared to word2vec:

  • the Distributed Memory Model of Paragraph Vectors (PV-DM) is analogous to word2vec's CBOW
  • Paragraph Vector - Distributed Bag of Words (PV-DBOW) is analogous to word2vec's skip-gram

...with the largest differences lying in the way that they consider further context.
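As a usage sketch, gensim's Doc2Vec implements both variants (dm=1 selects PV-DM, dm=0 selects PV-DBOW); the corpus and parameter values here are only there to show the API shape:

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    docs = [
        TaggedDocument(words=["the", "cat", "sat", "on", "the", "mat"], tags=["doc1"]),
        TaggedDocument(words=["the", "dog", "sat", "on", "the", "rug"], tags=["doc2"]),
    ]

    model = Doc2Vec(docs, vector_size=50, window=2, min_count=1, dm=1, epochs=20)

    print(model.dv["doc1"])                                    # vector for a training document
    print(model.infer_vector(["a", "cat", "on", "a", "mat"]))  # vector for unseen text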


The fuzzy way that a word's meaning varies with context also ends up being captured, to some degree, in these vectors.


top2vec

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

top2vec is a topic modelling method, so it has its own idea about context.


It can be considered an extension to both word2vec and doc2vec ideas, in that it uses both word and document vectors to estimate the distribution of topics in a set of documents.

This is one way to do topic modelling, among others.
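A usage sketch with the top2vec package, on a stock corpus, since it needs a reasonably large one for the underlying clustering to find topics:

    from sklearn.datasets import fetch_20newsgroups
    from top2vec import Top2Vec

    # A fairly large standard corpus; a handful of documents would not cluster.
    newsgroups = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))

    model = Top2Vec(documents=newsgroups.data)

    print(model.get_num_topics())                              # how many topics it found
    topic_words, word_scores, topic_nums = model.get_topics()  # words describing each topic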

tok2vec

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.


sense2vec

Also

Wikipedia2Vec

https://wikipedia2vec.github.io/wikipedia2vec/

https://wikipedia2vec.github.io/demo/

RDF2vec