zhtokenizer
Category: Feature extraction
Development tool: C
File size: 11536KB
Downloads: 0
Upload date: 2013-07-08 09:07:45
Uploader: sh-1993
Description: Chinese Word Tokenizer
File list:
Makefile (449, 2013-07-08)
buildht.c (4456, 2013-07-08)
config.h (209, 2013-07-08)
htable.h (95, 2013-07-08)
htfunc.c (1020, 2013-07-08)
httest.c (582, 2013-07-08)
utf8util.c (2647, 2013-07-08)
utf8util.h (340, 2013-07-08)
words.weight.txt (15777087, 2013-07-08)
words (0, 2013-07-08)
words\adrd.txt (463297, 2013-07-08)
words\ebqs.txt (458472, 2013-07-08)
words\erbi.txt (368266, 2013-07-08)
words\extr.txt (3146061, 2013-07-08)
words\wbhf.txt (609267, 2013-07-08)
words\wbjd.txt (576496, 2013-07-08)
words\wkzh.txt (10807543, 2013-07-08)
words\yong.txt (440313, 2013-07-08)
zhtokenizer.c (2404, 2013-07-08)
zhtokenizer: Chinese Word Tokenizer
=====
About
-----
This tool tokenizes Chinese sentences into words.
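
The README does not say which segmentation algorithm zhtokenizer uses. As an illustration of dictionary-based tokenization in general, here is a minimal sketch of forward maximum matching, assuming a tiny in-memory dictionary. The names `toy_dict`, `utf8_len`, `longest_match`, `next_token_len` and `tokenize` are hypothetical and are not taken from the project's sources, which appear to use a hash table and a weighted word list instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy dictionary (UTF-8 string literals). The real tool
 * ships much larger word lists under words/. */
static const char *const toy_dict[] = { "中文", "分词", "工具", "中文分词" };
static const size_t toy_dict_len = sizeof toy_dict / sizeof toy_dict[0];

/* Byte length of the UTF-8 sequence introduced by lead byte c.
 * Invalid lead bytes are treated as single-byte tokens. */
static size_t utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Byte length of the longest dictionary word that is a prefix of s,
 * or 0 if no word matches. A linear scan keeps the sketch short. */
static size_t longest_match(const char *s)
{
    size_t best = 0;
    size_t remaining = strlen(s);
    for (size_t i = 0; i < toy_dict_len; i++) {
        size_t n = strlen(toy_dict[i]);
        if (n > best && n <= remaining && memcmp(s, toy_dict[i], n) == 0)
            best = n;
    }
    return best;
}

/* Byte length of the next token at s: the longest dictionary match,
 * or a single UTF-8 character when nothing matches. */
size_t next_token_len(const char *s)
{
    assert(s != NULL && *s != '\0');
    size_t n = longest_match(s);
    if (n == 0) {
        size_t remaining = strlen(s);
        n = utf8_len((unsigned char)*s);
        if (n > remaining)
            n = remaining; /* truncated sequence at end of input */
    }
    return n;
}

/* Copy text into out with tokens separated by single spaces and
 * return the token count. out must hold 2 * strlen(text) + 1 bytes. */
size_t tokenize(const char *text, char *out)
{
    size_t count = 0;
    while (*text != '\0') {
        size_t n = next_token_len(text);
        if (count > 0)
            *out++ = ' ';
        memcpy(out, text, n);
        out += n;
        text += n;
        count++;
    }
    *out = '\0';
    return count;
}
```

With this toy dictionary, calling `tokenize` on `中文分词工具` prefers the longer entry `中文分词` over `中文`, which is the defining behaviour of maximum matching. Characters that are not in the dictionary fall through as single-character tokens.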
How to compile?
-----
make
How to run?
-----
./zhtokenizer < chinese-text-in-utf8
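
Since the input is expected to be UTF-8, text saved in another encoding (GBK is common for Chinese) will likely not tokenize correctly. The sketch below shows one way to check that a buffer is well-formed UTF-8 before passing it on. The function name `is_valid_utf8` is hypothetical and is not part of the project's `utf8util.c`.

```c
#include <stddef.h>

/* Return 1 if buf[0..len) is well-formed UTF-8, 0 otherwise.
 * Rejects truncated sequences, stray continuation bytes, overlong
 * encodings, UTF-16 surrogates and code points above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = buf[i];
        size_t need;      /* number of continuation bytes expected */
        unsigned long cp; /* decoded code point */

        if (c < 0x80) { /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07;
        } else {
            return 0; /* continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0; /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0; /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(cc & 0x3F);
        }

        /* Overlong forms: the value would fit in a shorter sequence. */
        if ((need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000))
            return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogate range is not valid in UTF-8 */
        if (cp > 0x10FFFF)
            return 0; /* beyond the Unicode range */

        i += need + 1;
    }
    return 1;
}
```

A caller could read the input file into memory, run this check, and convert the file to UTF-8 with a separate tool if the check fails.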
Where does the dictionary come from?
-----
adrd.txt
ebqs.txt
erbi.txt
extr.txt
wbhf.txt
wbjd.txt
wkzh.txt Chinese Wikipedia titles
yong.txt