lda-Chinese

Category: Java programming
Development tool: Java
File size: 61KB
Downloads: 51
Upload date: 2015-04-06 10:02:33
Uploader: fwp
Description: Code for an LDA topic model for Chinese text, provided as a reference for learning topic models.

File list:
lda Chinese\lda\bin\edu\tongji\lab\lda\Constants.class (568, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Conversion.class (868, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Dictionary.class (4184, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Document.class (1861, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Estimator.class (6115, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Inferencer.class (5698, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\LDA.class (2275, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\LDADataset.class (5708, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\LDAOption.class (1123, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Model.class (14328, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\Pair.class (1135, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\run\Inference.class (1582, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\run\test.class (401, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\lda\utils\FileUtils.class (2319, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\urn\UDictionary.class (1552, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\urn\Urn$R.class (614, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\urn\Urn.class (1297, 2014-06-16)
lda Chinese\lda\bin\edu\tongji\lab\urn\UrnReader.class (3202, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Constants.java (1386, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Conversion.java (1393, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Dictionary.java (4557, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Document.java (2473, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Estimator.java (5554, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Inferencer.java (6536, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\LDA.java (1940, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\LDADataset.java (8128, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\LDAOption.java (1924, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Model.java (21595, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\Pair.java (1564, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\run\Inference.java (2604, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\run\test.java (106, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\lda\utils\FileUtils.java (1210, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\urn\UDictionary.java (655, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\urn\Urn.java (661, 2014-06-16)
lda Chinese\lda\src\edu\tongji\lab\urn\UrnReader.java (1583, 2014-06-16)
lda Chinese\retriever\com\consensus\retriever\common\utils\FPGenerator.java (18960, 2014-06-16)
lda Chinese\retriever\com\consensus\retriever\HTMLRetriever.java (5608, 2014-06-16)
lda Chinese\retriever\com\consensus\retriever\model\WebPageInfo.java (2034, 2014-06-16)
... ...

Model name: identifies an LDA model by the Gibbs sampling iteration at which it was saved to disk. For example, the model saved at iteration 400 is named model-00400, and the model saved at iteration 1200 is named model-01200. The model saved at the last Gibbs sampling iteration is named model-final.

.others: This file contains some parameters of the LDA model, such as:
alpha=?
beta=?
ntopics=? # i.e., number of topics
ndocs=? # i.e., number of documents
nwords=? # i.e., the vocabulary size
liter=? # i.e., the Gibbs sampling iteration at which the model was saved

.phi: This file contains the word-topic distributions, i.e., p(word_w | topic_t). Each line is a topic and each column is a word in the vocabulary.

.theta: This file contains the topic-document distributions, i.e., p(topic_t | document_m). Each line is a document and each column is a topic.

.tassign: This file contains the topic assignments for the words in the training data. Each line is a document, written as a list of colon-separated pairs that each give a word and its assigned topic.

.twords: This file contains the twords most likely words of each topic. The value of twords is specified on the command line.
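The output files described above can be read with a few lines of Java. The following is a minimal sketch, not code from this package: the class and method names (LdaOutputSketch, modelName, parseOthers, parseRow, topWords) are hypothetical, and it assumes that .others holds one key=value pair per line and that each line of .phi or .theta is a whitespace-separated list of probabilities. The sample values in main are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; none of these names come from the lda-Chinese package.
class LdaOutputSketch {

    // Builds the checkpoint name for a Gibbs sampling iteration,
    // zero-padded to five digits as in the naming examples above.
    static String modelName(int iteration) {
        return String.format(Locale.ROOT, "model-%05d", iteration);
    }

    // Parses the lines of a .others file, ASSUMING one key=value pair per line.
    static Map<String, String> parseOthers(List<String> lines) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // skip lines that carry no key=value pair
            }
            params.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return params;
    }

    // Parses one non-empty line of a .phi or .theta file,
    // ASSUMING whitespace-separated numbers.
    static double[] parseRow(String line) {
        String[] parts = line.trim().split("\\s+");
        double[] row = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            row[i] = Double.parseDouble(parts[i]);
        }
        return row;
    }

    // Returns the vocabulary indices of the n most probable words
    // in one topic row of phi, most probable first.
    static int[] topWords(double[] phiRow, int n) {
        Integer[] idx = new Integer[phiRow.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Sort indices by descending probability.
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> -phiRow[i]));
        int k = Math.min(n, idx.length);
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = idx[i];
        }
        return top;
    }

    public static void main(String[] args) {
        System.out.println(modelName(400));   // prints "model-00400"
        System.out.println(modelName(1200));  // prints "model-01200"

        // Made-up parameter values, for illustration only.
        Map<String, String> params = parseOthers(Arrays.asList("alpha=0.5", "ntopics=100"));
        System.out.println(params.get("ntopics")); // prints "100"

        // Made-up topic row over a three-word vocabulary.
        double[] row = parseRow("0.1 0.6 0.3");
        System.out.println(Arrays.toString(topWords(row, 2))); // prints "[1, 2]"
    }
}
```

Mapping the returned indices back to words requires the vocabulary file written by the tool, whose format is not described here; the indices alone are enough to compare against the contents of the .twords file.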
