Two variants of the dataset are available as separate files:
Name | Last modified | Size | Description
---|---|---|---
pl-embeddings-cbow.txt | 2016-01-04 23:22 | 855M | generated using continuous-bag-of-words method
pl-embeddings-skip.txt | 2016-01-04 23:24 | 854M | generated using skip-gram method
The first line of each file contains two numbers: the number of words in the dictionary and the number of dimensions of the word embeddings.
Each following line contains a word followed by a space-separated list of numbers that form its embedding.
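
The layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names `parse_embeddings` and `load_embeddings` are hypothetical, UTF-8 encoding is an assumption, and the sample data is made up for illustration rather than taken from the real files.

```python
import io
from typing import Dict, Iterable, List, Tuple


def parse_embeddings(lines: Iterable[str]) -> Tuple[int, int, Dict[str, List[float]]]:
    """Parse the text layout: a header line, then one word and its vector per line."""
    it = iter(lines)
    # Header: number of words in the dictionary, then number of dimensions.
    header = next(it).split()
    vocab_size, dim = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for line in it:
        parts = line.split()
        if not parts:
            continue  # skip blank lines, e.g. at the end of the file
        if len(parts) < dim + 1:
            raise ValueError(f"expected a word and {dim} numbers, got: {line!r}")
        # The last `dim` fields are the vector; whatever precedes them is the word.
        word = " ".join(parts[:-dim])
        vectors[word] = [float(x) for x in parts[-dim:]]
    return vocab_size, dim, vectors


def load_embeddings(path: str, encoding: str = "utf-8"):
    """Read a file such as pl-embeddings-cbow.txt (encoding is an assumption)."""
    with open(path, encoding=encoding) as f:
        return parse_embeddings(f)


# Tiny made-up sample in the same layout (not real data from the files).
sample = io.StringIO("2 3\nkot 0.1 0.2 0.3\npies -0.5 0.0 1.5\n")
vocab_size, dim, vectors = parse_embeddings(sample)
print(vocab_size, dim, vectors["pies"])
```

Since each file is over 850 MB, a dictionary of Python lists will likely use several times that much memory; for the full files, storing the vectors in a compact numeric array would probably be a better fit.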