groonga-tokenizer-white
Category: Feature extraction
Development tool: C
File size: 19KB
Downloads: 0
Upload date: 2017-02-20 11:47:08
Uploader: sh-1993
Description: Groonga white tokenizer (groonga-tokenizer-white)
File list:
COPYING (26530, 2017-02-20)
Makefile.am (97, 2017-02-20)
autogen.sh (37, 2017-02-20)
configure.ac (1705, 2017-02-20)
test (0, 2017-02-20)
test\Makefile.am (0, 2017-02-20)
test\run-test.sh (2039, 2017-02-20)
test\suite (0, 2017-02-20)
test\suite\config.expected (358, 2017-02-20)
test\suite\config.test (219, 2017-02-20)
test\suite\select.expected (947, 2017-02-20)
test\suite\select.test (545, 2017-02-20)
test\suite\skip.expected (230, 2017-02-20)
test\suite\skip.test (161, 2017-02-20)
test\suite\two.expected (568, 2017-02-20)
test\suite\two.test (211, 2017-02-20)
tokenizers (0, 2017-02-20)
tokenizers\Makefile.am (259, 2017-02-20)
tokenizers\white.c (7547, 2017-02-20)
# Groonga white tokenizer
* ``TokenWhite``
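A tokenizer that emits only tokens matching the keys registered in the ``white_words`` table, which must be of type ``TABLE_PAT_KEY``. Text that matches no registered key is dropped.
The table name can be changed with the ``GRN_WHITE_TABLE_NAME`` environment variable.
Alternatively, it can be changed with the ``tokenizer-white.table`` config item: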
```
config_set tokenizer-white.table white_terms
```
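The lookup of the table name described above can be pictured with the following C sketch. It is not taken from `white.c`: the function names `white_pick_table_name` and `white_table_name` are hypothetical, and the precedence (environment variable first, then config value, then the default `white_words`) is an assumption, since the README does not say which setting wins when both are present.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from white.c): choose the whitelist table name.
 * ASSUMED precedence: environment value, then config value, then fallback.
 * Empty strings are treated the same as "not set". */
const char *
white_pick_table_name(const char *env_value,
                      const char *config_value,
                      const char *fallback)
{
  /* The fallback must always be a usable, non-empty name. */
  assert(fallback != NULL && strlen(fallback) > 0);

  if (env_value != NULL && strlen(env_value) > 0) {
    return env_value;
  }
  if (config_value != NULL && strlen(config_value) > 0) {
    return config_value;
  }
  return fallback;
}

/* Hypothetical wrapper: reads GRN_WHITE_TABLE_NAME from the environment.
 * config_value stands for whatever tokenizer-white.table was set to,
 * or NULL when it was never set. */
const char *
white_table_name(const char *config_value)
{
  return white_pick_table_name(getenv("GRN_WHITE_TABLE_NAME"),
                               config_value,
                               "white_words");
}
```

Keeping the selection logic in a function that takes plain strings makes it possible to check the fallback order without touching the process environment.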
```
plugin_register tokenizers/white
[[0,0.0,0.0],true]
table_create white_words TABLE_PAT_KEY ShortText
[[0,0.0,0.0],true]
load --table white_words
[
{"_key": "装置"},
{"_key": "情報"}
]
[[0,0.0,0.0],2]
tokenize TokenWhite "情報処理装置は装置である"
[
[
0,
0.0,
0.0
],
[
{
"value": "情報",
"position": 0,
"force_prefix": false
},
{
"value": "装置",
"position": 1,
"force_prefix": false
},
{
"value": "装置",
"position": 2,
"force_prefix": false
}
]
]
```
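The behaviour shown in the `tokenize` example can be approximated by a small standalone C sketch. It does not use the Groonga API and is not the code in `white.c`: the type `white_token` and the functions `longest_white_match` and `white_tokenize` are hypothetical names, a linear scan over an array stands in for the patricia-trie lookup a `TABLE_PAT_KEY` table would provide, and picking the longest matching word at each position is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One emitted token: a slice of the input plus its token position. */
typedef struct {
  const char *start; /* points into the input text */
  size_t length;     /* length in bytes */
  int position;      /* 0-based token index */
} white_token;

/* Return the byte length of the longest whitelist word that `text`
 * starts with, or 0 when no word matches. */
size_t
longest_white_match(const char *text,
                    const char *const *words, size_t n_words)
{
  size_t best = 0;
  for (size_t i = 0; i < n_words; i++) {
    size_t len = strlen(words[i]);
    /* strncmp stops at the NUL of a shorter text, so this is safe. */
    if (len > best && strncmp(text, words[i], len) == 0) {
      best = len;
    }
  }
  return best;
}

/* Scan UTF-8 text left to right, emitting whitelisted words and
 * skipping everything else. Returns the number of tokens written. */
size_t
white_tokenize(const char *text,
               const char *const *words, size_t n_words,
               white_token *tokens, size_t max_tokens)
{
  size_t n_tokens = 0;
  assert(text != NULL && tokens != NULL);

  while (*text != '\0' && n_tokens < max_tokens) {
    size_t len = longest_white_match(text, words, n_words);
    if (len == 0) {
      /* No whitelisted word starts here: skip one byte. A UTF-8
       * continuation byte can never begin a valid UTF-8 word, so
       * byte-wise skipping does not create false matches. */
      text++;
      continue;
    }
    tokens[n_tokens].start = text;
    tokens[n_tokens].length = len;
    tokens[n_tokens].position = (int)n_tokens;
    n_tokens++;
    text += len;
  }
  return n_tokens;
}
```

Called with the two words loaded into `white_words` and the input string from the `tokenize` example, this sketch produces three tokens (情報, 装置, 装置) at positions 0, 1 and 2, which agrees with the output listed above. Everything between the matches is discarded.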
## Install
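Install libgroonga-dev.

Build this tokenizer. The file list contains `configure.ac` and `autogen.sh` but no generated `configure`, so running `./autogen.sh` first is probably needed when building from these sources.

    % ./configure
    % make
    % sudo make install
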
## Usage
Register `tokenizers/white`:

    % groonga DB
    > register tokenizers/white

## License
LGPL 2.1. See COPYING for details.