Two variants of the dataset are available in separate files:
| Name | Last modified | Size | Description |
|---|---|---|---|
| pl-embeddings-skip.txt | 2016-01-04 23:24 | 854M | generated using the skip-gram method |
| pl-embeddings-cbow.txt | 2016-01-04 23:22 | 855M | generated using the continuous-bag-of-words method |
The first line of each file contains two numbers: the number of words in the dictionary and the number of dimensions of the word embeddings.
Each following line contains a word followed by a space-separated list of numbers that form its embedding.
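
The layout described above can be read with a few lines of Python. The sketch below is only an illustration under some assumptions that the description does not confirm: that the files are UTF-8 encoded, that the words themselves contain no whitespace, and that every vector has exactly the number of dimensions given in the header. The function name `read_embeddings` and its `limit` parameter are hypothetical, and the sample data is made up for demonstration.

```python
import io
from typing import Dict, List, Optional, TextIO, Tuple


def read_embeddings(
    stream: TextIO, limit: Optional[int] = None
) -> Tuple[int, int, Dict[str, List[float]]]:
    """Parse embeddings from a text stream in the layout described above.

    Hypothetical helper: `limit` stops after that many words, which is
    useful because the full files are several hundred megabytes.
    """
    # First line: number of words and number of dimensions.
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for line in stream:
        if limit is not None and len(vectors) >= limit:
            break
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        word, values = tokens[0], tokens[1:]
        # Assumption: words contain no whitespace, so everything after
        # the first token belongs to the vector.
        if len(values) != dim:
            raise ValueError(f"expected {dim} values for {word!r}, got {len(values)}")
        vectors[word] = [float(v) for v in values]
    return vocab_size, dim, vectors


# Made-up sample in the same layout (not taken from the real files).
sample = io.StringIO("2 3\nkot 0.1 0.2 0.3\npies -0.4 0.5 0.6\n")
vocab_size, dim, vectors = read_embeddings(sample)
```

To try this on one of the real files, the same function could be called on `open("pl-embeddings-skip.txt", encoding="utf-8")`, with a small `limit` to avoid loading the whole file into memory. The encoding is an assumption and may need adjusting.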