GMU:The Hidden Layer:Topics: Difference between revisions

Revision as of 11:35, 8 May 2017

General Information on word embeddings

For a general explanation, see [1].

Word2vec

Developed at Google. It uses a neural network and performs well on semantic tasks.

Installation + getting started:

Included in the gensim package.

To install, just type

pip install gensim

into a command window.

Here are some of the things you can do with the model: [2]
Here is some background information and an explanation of how to train your own models: [3].

fastText

Made by Facebook, based on word2vec. Better at capturing syntactic relations (such as apparent ---> apparently); see [4].

Pretrained model files are HUGE; loading them will be a problem on computers with less than 16 GB of memory.

Installation + getting started:

Included in the gensim package.

To install, just type

pip install gensim

into a command window.

Documentation is here: [5]

GloVe

Developed by the Natural Language Processing Group at Stanford [6]. Uses more conventional math instead of neural network "black magic" [7]. Seems to perform just slightly less well than word2vec and fastText.
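The pretrained GloVe downloads are plain text files with one word per line followed by its vector components, so they can be read without any special library. The sketch below uses three made-up 3-dimensional vectors in place of a real file; the helper names load_vectors and cosine are hypothetical and only for illustration.

```python
import io
import math

# Hypothetical 3-dimensional vectors in the GloVe text layout:
# a word followed by space-separated floats, with no header line.
sample = io.StringIO(
    "king 0.5 0.7 0.1\n"
    "queen 0.5 0.8 0.1\n"
    "banana -0.6 0.1 0.9\n"
)


def load_vectors(handle):
    """Parse 'word v1 v2 ...' lines into a dict of float lists."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


vectors = load_vectors(sample)
print(cosine(vectors["king"], vectors["queen"]))
print(cosine(vectors["king"], vectors["banana"]))
```

For a real file you would pass an open file handle instead of the StringIO object. Keep in mind that a dict of Python lists is memory hungry for large vocabularies.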

[6] https://nlp.stanford.edu/projects/glove/
[7] https://www.quora.com/How-is-GloVe-different-from-word2vec