Revision as of 11:28, 8 May 2017
General Information on word embeddings
For a general explanation, see [1].
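As a rough illustration of the basic idea: an embedding maps each word to a vector, and related words end up with vectors pointing in similar directions. The sketch below compares words by cosine similarity. The three-dimensional vectors are made up for this example (real models use a few hundred dimensions and learn the values from text), and the helper name `cosine_similarity` is only illustrative.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

sim_king_queen = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_king_apple = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# With these toy values, "king" is closer to "queen" than to "apple".
print(sim_king_queen > sim_king_apple)
```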
Word2vec
Developed at Google. Uses a neural network and performs well on semantic relations.
Installation + getting started:
Included in the gensim package.
To install, just type
pip install gensim
into a command window.
Here are some of the things you can do with the model: [2]
Here is a bit of background information and an explanation of how to train your own models: [3].
FastText
Developed at Facebook, based on word2vec. Better at capturing syntactic relations (like apparent ---> apparently); see [4].
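One likely reason for the better handling of word forms is that FastText builds a word's vector from its character n-grams, so related forms share many building blocks. The sketch below only illustrates that overlap; it is not FastText's implementation. The function name `char_ngrams` is made up, and the n-gram range of 3 to 6 and the `<` / `>` word-boundary markers are assumptions taken from the FastText paper's defaults.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the set of character n-grams of a word wrapped in < and >."""
    token = f"<{word}>"
    grams = set()
    for n in range(min_n, max_n + 1):
        for start in range(len(token) - n + 1):
            grams.add(token[start:start + n])
    return grams


grams_a = char_ngrams("apparent")
grams_b = char_ngrams("apparently")

# N-grams that both word forms have in common, e.g. "<app" or "pare".
shared = grams_a & grams_b

# Fraction of all n-grams that the two words share (Jaccard overlap).
overlap = len(shared) / len(grams_a | grams_b)

print(sorted(shared)[:5], round(overlap, 2))
```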
Pretrained model files are very large. Loading them will be a problem on computers with less than 16 GB of memory.
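To get a feel for the sizes involved, here is a back-of-the-envelope estimate of the memory needed for the word vectors alone. The vocabulary size and dimension below are hypothetical round numbers, not the figures of any particular download, and the estimate assumes 32-bit floats. Real models need more, since FastText can also store vectors for character n-grams and Python adds its own overhead.

```python
def embedding_memory_gib(vocab_size, dimensions, bytes_per_value=4):
    """Estimate the size of a dense embedding matrix in GiB."""
    return vocab_size * dimensions * bytes_per_value / 2**30


# Hypothetical model: 2 million words, 300 dimensions, 32-bit floats.
example_gib = embedding_memory_gib(2_000_000, 300)

print(f"{example_gib:.2f} GiB for the raw word vectors alone")
```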
Installation + getting started:
Included in the gensim package.
To install, just type
pip install gensim
into a command window.
Documentation is here: [5]
GloVe
Developed by the Natural Language Processing Group at Stanford. [6] Uses more conventional math instead of neural-network "black magic". Seems to perform very slightly worse than Word2vec and FastText.
Pretrained models
- https://github.com/Kyubyong/wordvectors: Word2Vec and FastText, multiple languages (no English), trained on Wikipedia
- https://github.com/3Top/word2vec-api: mostly GloVe, some Word2Vec, English, trained on news, Wikipedia and Twitter
- https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md: FastText, a very wide range of languages, trained on Wikipedia
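Many of these downloads come as plain text files. The following is a minimal sketch of reading such a file, assuming the word2vec text layout (a header line with word count and dimension, then one word followed by its values per line). GloVe files normally lack that header line, so they would need a small adjustment. The function name `load_text_vectors` and its `limit` parameter are made up for this example; `limit` keeps only the first few words, which helps on machines with little memory.

```python
def load_text_vectors(lines, limit=None):
    """Parse word vectors from an iterable of lines in word2vec text format.

    Returns (dimension, {word: [float, ...]}). Assumes the first line is a
    header of the form "<word count> <dimension>".
    """
    iterator = iter(lines)
    header = next(iterator).split()
    dimension = int(header[1])

    vectors = {}
    for line in iterator:
        if limit is not None and len(vectors) >= limit:
            break
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dimension:
            continue  # skip malformed lines instead of failing
        vectors[word] = [float(value) for value in values]
    return dimension, vectors


# Tiny in-memory stand-in for a real file; the values are invented.
sample_lines = [
    "3 2\n",
    "the 0.1 0.2 \n",
    "cat 0.3 0.4 \n",
    "dog 0.5 0.6 \n",
]

dimension, vectors = load_text_vectors(sample_lines, limit=2)
print(dimension, sorted(vectors))
```

With a real download, the lines would come from an opened file, for example `open(path, encoding="utf-8")`, where `path` is wherever the file was saved.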