source-archive
Category: Windows programming
Development tools: C/C++
File size: 218KB
Downloads: 0
Upload date: 2018-10-24 00:10:48
Uploader: Feryandi
Description: Word2vec for natural language processing
(Word2vec for natural language processing: builds a model from the input data and produces a vector for each word)
File list:
word2vec (0, 2016-03-18)
word2vec\.svn (0, 2016-03-18)
word2vec\.svn\all-wcprops (57, 2016-03-18)
word2vec\.svn\tmp (0, 2016-03-18)
word2vec\.svn\tmp\prop-base (0, 2016-03-18)
word2vec\.svn\tmp\props (0, 2016-03-18)
word2vec\.svn\tmp\text-base (0, 2016-03-18)
word2vec\.svn\prop-base (0, 2016-03-18)
word2vec\.svn\props (0, 2016-03-18)
word2vec\.svn\text-base (0, 2016-03-18)
word2vec\.svn\entries (189, 2016-03-18)
word2vec\trunk (0, 2016-03-18)
word2vec\trunk\demo-phrases.sh (853, 2016-03-18)
word2vec\trunk\demo-word.sh (272, 2016-03-18)
word2vec\trunk\questions-phrases.txt (168209, 2016-03-18)
word2vec\trunk\demo-word-accuracy.sh (414, 2016-03-18)
word2vec\trunk\makefile (718, 2016-03-18)
word2vec\trunk\word2phrase.c (9386, 2016-03-18)
word2vec\trunk\compute-accuracy.c (5241, 2016-03-18)
word2vec\trunk\.svn (0, 2016-03-18)
word2vec\trunk\.svn\all-wcprops (1680, 2016-03-18)
word2vec\trunk\.svn\tmp (0, 2016-03-18)
word2vec\trunk\.svn\tmp\prop-base (0, 2016-03-18)
word2vec\trunk\.svn\tmp\props (0, 2016-03-18)
word2vec\trunk\.svn\tmp\text-base (0, 2016-03-18)
word2vec\trunk\.svn\prop-base (0, 2016-03-18)
word2vec\trunk\.svn\prop-base\demo-train-big-model-v1.sh.svn-base (30, 2016-03-18)
word2vec\trunk\.svn\props (0, 2016-03-18)
word2vec\trunk\.svn\text-base (0, 2016-03-18)
word2vec\trunk\.svn\text-base\word2phrase.c.svn-base (9386, 2016-03-18)
word2vec\trunk\.svn\text-base\questions-words.txt.svn-base (603955, 2016-03-18)
word2vec\trunk\.svn\text-base\demo-classes.sh.svn-base (358, 2016-03-18)
word2vec\trunk\.svn\text-base\makefile.svn-base (718, 2016-03-18)
word2vec\trunk\.svn\text-base\compute-accuracy.c.svn-base (5241, 2016-03-18)
word2vec\trunk\.svn\text-base\distance.c.svn-base (4557, 2016-03-18)
word2vec\trunk\.svn\text-base\demo-analogy.sh.svn-base (631, 2016-03-18)
word2vec\trunk\.svn\text-base\word2vec.c.svn-base (26184, 2016-03-18)
word2vec\trunk\.svn\text-base\demo-phrases.sh.svn-base (853, 2016-03-18)
... ...
Tools for computing distributed representations of words
--------------------------------------------------------
We provide an implementation of the Continuous Bag-of-Words (CBOW) and Skip-gram (SG) models, as well as several demo scripts.
Given a text corpus, the word2vec tool learns a vector for every word in the vocabulary using the Continuous
Bag-of-Words or the Skip-gram neural network architecture. The user should specify the following:
- desired vector dimensionality
- the size of the context window for either the Skip-gram or the Continuous Bag-of-Words model
- training algorithm: hierarchical softmax and/or negative sampling
- threshold for downsampling frequent words
- number of threads to use
- the format of the output word vector file (text or binary)
Usually, the other hyper-parameters such as the learning rate do not need to be tuned for different training sets.
The script demo-word.sh downloads a small (100MB) text corpus from the web and trains a small word vector model. Once training
has finished, the user can interactively explore the similarity of the words.
More information about the scripts is provided at https://code.google.com/p/word2vec/