zhtokenizer

Category: Feature extraction
Development tool: C
File size: 11536 KB
Downloads: 0
Upload date: 2013-07-08 09:07:45
Uploader: sh-1993
Description: Chinese Word Tokenizer

File list:
Makefile (449, 2013-07-08)
buildht.c (4456, 2013-07-08)
config.h (209, 2013-07-08)
htable.h (95, 2013-07-08)
htfunc.c (1020, 2013-07-08)
httest.c (582, 2013-07-08)
utf8util.c (2647, 2013-07-08)
utf8util.h (340, 2013-07-08)
words.weight.txt (15777087, 2013-07-08)
words (0, 2013-07-08)
words\adrd.txt (463297, 2013-07-08)
words\ebqs.txt (458472, 2013-07-08)
words\erbi.txt (368266, 2013-07-08)
words\extr.txt (3146061, 2013-07-08)
words\wbhf.txt (609267, 2013-07-08)
words\wbjd.txt (576496, 2013-07-08)
words\wkzh.txt (10807543, 2013-07-08)
words\yong.txt (440313, 2013-07-08)
zhtokenizer.c (2404, 2013-07-08)

zhtokenizer: Chinese Word Tokenizer
=====

About
-----
This tool tokenizes Chinese sentences into words.

How to Compile?
-----
    make

How to run?
-----
    ./zhtokenizer < chinese-text-in-utf8

Where does the dictionary come from?
-----
- adrd.txt
- ebqs.txt
- erbi.txt
- extr.txt
- wbhf.txt
- wbjd.txt
- wkzh.txt: Chinese Wikipedia titles
- yong.txt
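
The README does not say which segmentation algorithm zhtokenizer uses. Purely as an illustration of how a dictionary-based tokenizer for UTF-8 text can work, here is a minimal sketch of forward maximum matching: at each position, take the longest dictionary word that starts there, and fall back to a single character if nothing matches. All names below (`utf8_len`, `in_dict`, `next_token_len`, `tokenize`, `demo_dict`, `MAX_WORD_CHARS`) are hypothetical and not taken from the project's sources, and a toy in-memory word list with linear search stands in for the real dictionary files.

```c
#include <stddef.h>
#include <string.h>

/* Longest dictionary word, in characters, that the sketch will try (assumption). */
#define MAX_WORD_CHARS 4

/* Toy dictionary for illustration only; not the project's word lists. */
static const char *const demo_dict[] = { "中文", "分词", "工具", "中文分词", NULL };

/* Byte length of a UTF-8 sequence, judged from its lead byte.
   Invalid or continuation bytes are treated as one-byte characters. */
static size_t utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Return 1 if the n bytes at s form a word in demo_dict (linear search). */
static int in_dict(const char *s, size_t n)
{
    for (size_t i = 0; demo_dict[i] != NULL; i++)
        if (strlen(demo_dict[i]) == n && memcmp(demo_dict[i], s, n) == 0)
            return 1;
    return 0;
}

/* Byte length of the next token in the non-empty string s:
   the longest dictionary match, else a single character. */
static size_t next_token_len(const char *s)
{
    size_t ends[MAX_WORD_CHARS];   /* byte offset after each of the first characters */
    size_t count = 0, pos = 0, total = strlen(s);

    while (count < MAX_WORD_CHARS && pos < total) {
        size_t l = utf8_len((unsigned char)s[pos]);
        if (pos + l > total) l = total - pos;   /* truncated sequence at end of input */
        pos += l;
        ends[count++] = pos;
    }
    for (size_t k = count; k > 1; k--)          /* try the longest candidate first */
        if (in_dict(s, ends[k - 1]))
            return ends[k - 1];
    return count ? ends[0] : 0;
}

/* Write the tokens of text into out, separated by single spaces.
   Stops early if out (cap bytes) is too small. Returns the token count. */
size_t tokenize(const char *text, char *out, size_t cap)
{
    size_t ntok = 0, used = 0;

    if (cap == 0) return 0;
    while (*text != '\0') {
        size_t n = next_token_len(text);
        size_t sep = ntok > 0 ? 1 : 0;
        if (used + sep + n + 1 > cap) break;    /* keep room for the terminating NUL */
        if (sep) out[used++] = ' ';
        memcpy(out + used, text, n);
        used += n;
        text += n;
        ntok++;
    }
    out[used] = '\0';
    return ntok;
}
```

With this toy word list, calling `tokenize("中文分词工具", buf, sizeof buf)` fills `buf` with `中文分词 工具`, because the four-character entry is preferred over the shorter `中文`. A real tool would more plausibly look words up in a hash table, as the `htable.h` and `buildht.c` file names suggest, and might use the weights in `words.weight.txt` to choose between competing segmentations; both of those points are guesses from the file names, not statements about the actual code.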
