== General Information on word embeddings ==
For a general explanation, look here:
[https://blog.acolyer.org/2016/04/21/the-amazing-power-of-word-vectors/]
== Word2vec ==
Made by Google; uses a neural net; performs well on semantic tasks.

=== Installation + getting started ===
 pip install gensim
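
As a quick check that the install worked, here is a minimal sketch of training a Word2vec model with gensim. The toy sentences and parameter values are placeholders, and gensim releases before 4.0 call the dimension parameter size instead of vector_size:

 from gensim.models import Word2Vec
 
 # gensim expects an iterable of tokenized sentences
 sentences = [
     ["the", "cat", "sat", "on", "the", "mat"],
     ["the", "dog", "sat", "on", "the", "log"],
 ]
 
 # vector_size is called "size" in gensim versions before 4.0
 model = Word2Vec(sentences, vector_size=50, window=3, min_count=1)
 
 print(model.wv["cat"])               # the learned 50-dimensional vector
 print(model.wv.most_similar("cat"))  # nearest neighbours by cosine similarity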
== fastText ==
Made by Facebook, based on word2vec. Better at capturing syntactic relations (like apparent ---> apparently); see here:
[https://rare-technologies.com/fasttext-and-gensim-word-embeddings/]
Pretrained model files are HUGE.
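
The subword trick behind those syntactic relations is easy to see with gensim's own FastText implementation. A minimal sketch with a placeholder corpus and parameters (class and signatures per recent gensim versions):

 from gensim.models import FastText
 
 sentences = [
     ["apparent", "problems", "appear", "apparently"],
     ["the", "solution", "appears", "apparent"],
 ]
 
 model = FastText(sentences, vector_size=50, window=3, min_count=1)
 
 # fastText builds word vectors out of character n-grams, so it can
 # produce a vector even for a word never seen during training
 print(model.wv["apparentness"])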
== GloVe ==
Made by Stanford; learns vectors from global word co-occurrence counts rather than with a neural net.
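
GloVe ships its vectors as plain text without the header line that gensim's word2vec loader expects. A sketch of one common workaround using gensim's glove2word2vec conversion script; the file names below are just the standard GloVe download names, used as an example:

 from gensim.scripts.glove2word2vec import glove2word2vec
 from gensim.models import KeyedVectors
 
 # prepend the "<vocab_size> <dimensions>" header gensim expects
 glove2word2vec("glove.6B.100d.txt", "glove.6B.100d.w2v.txt")
 
 kv = KeyedVectors.load_word2vec_format("glove.6B.100d.w2v.txt")
 print(kv.most_similar("frog"))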
== Pre-trained models ==
* [https://github.com/Kyubyong/wordvectors https://github.com/Kyubyong/wordvectors]: Word2Vec and FastText, multiple languages, no English, trained on Wikipedia
* [https://github.com/3Top/word2vec-api#where-to-get-a-pretrained-models https://github.com/3Top/word2vec-api]: mostly GloVe, some word2vec, English, trained on news, Wikipedia, Twitter
* [https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md]: fastText, all imaginable languages, trained on Wikipedia
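
Whichever source you pick, gensim can usually load the download directly via KeyedVectors. A sketch, assuming the well-known GoogleNews word2vec binary and a fastText .vec text file; substitute the file names of whatever you actually downloaded:

 from gensim.models import KeyedVectors
 
 # word2vec binary format; these files are multiple GB, as noted above
 w2v = KeyedVectors.load_word2vec_format(
     "GoogleNews-vectors-negative300.bin", binary=True)
 
 # fastText's .vec files are plain-text word2vec format
 ft = KeyedVectors.load_word2vec_format("wiki.en.vec")
 
 print(w2v.most_similar(positive=["king", "woman"], negative=["man"]))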