NLPL word embeddings repository

brought to you by LTG Oslo (work in progress)

All models

ID | Download link | Vector size | Window | Corpus | Vocabulary size | Algorithm | Lemmatization
0 | Download | 300 | 10 | British National Corpus | 163473 | Continuous Skipgram | True
1 | Download | 300 | None | Google News 2013 | 2883863 | Continuous Skipgram | True
2 | Download | 300 | 5 | Norsk Aviskorpus/NoWaC | 306943 | Continuous Skipgram | True
3 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 296630 | Continuous Skipgram | True
4 | Download | 300 | 2 | Gigaword 5th Edition | 314815 | Continuous Skipgram | True
5 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 273992 | Continuous Skipgram | True
6 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 302866 | Continuous Skipgram | False
7 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 273930 | Global Vectors | True
8 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 302815 | Global Vectors | False
9 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 273930 | fastText Skipgram | True
10 | Download | 300 | 5 | English Wikipedia Dump of February 2017 | 302815 | fastText Skipgram | False
11 | Download | 300 | 5 | Gigaword 5th Edition | 261794 | Continuous Skipgram | True
12 | Download | 300 | 5 | Gigaword 5th Edition | 292479 | Continuous Skipgram | False
13 | Download | 300 | 5 | Gigaword 5th Edition | 262269 | Global Vectors | True
14 | Download | 300 | 5 | Gigaword 5th Edition | 292967 | Global Vectors | False
15 | Download | 300 | 5 | Gigaword 5th Edition | 262269 | fastText Skipgram | True
16 | Download | 300 | 5 | Gigaword 5th Edition | 292967 | fastText Skipgram | False
17 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 259882 | Continuous Skipgram | True
18 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 291186 | Continuous Skipgram | False
19 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 260073 | Global Vectors | True
20 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 291392 | Global Vectors | False
21 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 260073 | fastText Skipgram | True
22 | Download | 300 | 5 | English Wikipedia Dump of February 2017; Gigaword 5th Edition | 291392 | fastText Skipgram | False
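
Choosing a model from the table above usually means narrowing it down by corpus, algorithm, and lemmatization. The following is only a minimal sketch of that selection in Python, assuming a handful of rows have been transcribed by hand into records. The `Model` class, the `MODELS` list, and the `select` function are hypothetical names introduced for illustration and are not part of the repository. Only five of the listed models are included.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    """One row of the model table (hypothetical record layout)."""
    id: int
    vector_size: int
    window: Optional[int]  # the table lists "None" for model 1
    corpus: str
    vocab_size: int
    algorithm: str
    lemmatized: bool


# A few rows copied from the table; the full list has 23 entries.
MODELS = [
    Model(0, 300, 10, "British National Corpus", 163473, "Continuous Skipgram", True),
    Model(7, 300, 5, "English Wikipedia Dump of February 2017", 273930, "Global Vectors", True),
    Model(8, 300, 5, "English Wikipedia Dump of February 2017", 302815, "Global Vectors", False),
    Model(12, 300, 5, "Gigaword 5th Edition", 292479, "Continuous Skipgram", False),
    Model(13, 300, 5, "Gigaword 5th Edition", 262269, "Global Vectors", True),
]


def select(models: List[Model],
           corpus: Optional[str] = None,
           algorithm: Optional[str] = None,
           lemmatized: Optional[bool] = None) -> List[Model]:
    """Return the models matching every criterion that is not None."""
    return [
        m for m in models
        if (corpus is None or m.corpus == corpus)
        and (algorithm is None or m.algorithm == algorithm)
        and (lemmatized is None or m.lemmatized == lemmatized)
    ]


if __name__ == "__main__":
    # List the lemmatized Global Vectors models among the sample rows.
    for m in select(MODELS, algorithm="Global Vectors", lemmatized=True):
        print(m.id, m.corpus, m.vocab_size)
```

Leaving a criterion as `None` means "do not filter on this column", so `select(MODELS)` returns every sample row unchanged.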

This page accompanies the following paper:

Fares, Murhaf; Kutuzov, Andrei; Oepen, Stephan & Velldal, Erik (2017). Word vectors, reuse, and replicability: Towards a community repository of large-text resources. In Jörg Tiedemann (ed.), Proceedings of the 21st Nordic Conference on Computational Linguistics, NoDaLiDa, 22-24 May 2017. Linköping University Electronic Press. ISBN 978-91-7685-601-7.