stream2es
Category: Other
Development tool: Clojure
File size: 0KB
Downloads: 0
Upload date: 2018-10-29 20:46:44
Uploader: sh-1993
Description: Stream data into ES (Wikipedia, Twitter, stdin, or other ESes)
File list:
Makefile (464, 2017-06-20)
etc/ (0, 2017-06-20)
etc/log4j.properties (448, 2017-06-20)
project.clj (1154, 2017-06-20)
src/ (0, 2017-06-20)
src/stream2es/ (0, 2017-06-20)
src/stream2es/auth.clj (1791, 2017-06-20)
src/stream2es/bootstrap.clj (2345, 2017-06-20)
src/stream2es/es.clj (5070, 2017-06-20)
src/stream2es/help.clj (729, 2017-06-20)
src/stream2es/http.clj (2636, 2017-06-20)
src/stream2es/log.clj (1198, 2017-06-20)
src/stream2es/main.clj (13730, 2017-06-20)
src/stream2es/opts.clj (3301, 2017-06-20)
src/stream2es/size.clj (1048, 2017-06-20)
src/stream2es/stream.clj (677, 2017-06-20)
src/stream2es/stream/ (0, 2017-06-20)
src/stream2es/stream/es.clj (3618, 2017-06-20)
src/stream2es/stream/generator.clj (3601, 2017-06-20)
src/stream2es/stream/stdin.clj (1666, 2017-06-20)
src/stream2es/stream/twitter.clj (6876, 2017-06-20)
src/stream2es/stream/wiki.clj (2519, 2017-06-20)
src/stream2es/util/ (0, 2017-06-20)
src/stream2es/util/data.clj (1187, 2017-06-20)
src/stream2es/util/io.clj (1052, 2017-06-20)
src/stream2es/util/rate.clj (933, 2017-06-20)
src/stream2es/util/stacktrace.clj (354, 2017-06-20)
src/stream2es/util/string.clj (362, 2017-06-20)
src/stream2es/util/time.clj (366, 2017-06-20)
src/stream2es/util/typed.clj (486, 2017-06-20)
src/stream2es/version.clj (125, 2017-06-20)
test/ (0, 2017-06-20)
test/log4j.properties (23, 2017-06-20)
test/stream2es/ (0, 2017-06-20)
test/stream2es/test/ (0, 2017-06-20)
test/stream2es/test/main.clj (2650, 2017-06-20)
test/stream2es/test/opts.clj (273, 2017-06-20)
# stream2es
Standalone utility to stream different inputs into Elasticsearch.
## Read This First
*If you've just wandered here, first check out [Logstash](http://github.com/elasticsearch/logstash). It's a much more general tool, and one of our featured products. If for some reason it doesn't do something that's important to you, create an issue there. stream2es is a dev tool that originated before the author knew much about Logstash. That said, there are some important differences that are specific to Elasticsearch. stream2es supports bulks by byte-length (`--bulk-bytes`) instead of doc count, which is crucial with docs of varying size. It also supports exporting raw bulks via `--tee-bulk` to a hashed dir on the filesystem, and you can make the incoming stream finite with `--max-docs`.*
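To see why batching by byte length behaves differently from batching by document count, here is a minimal shell sketch. It is not stream2es's implementation; the 40-byte `limit` and the sample documents are assumptions chosen for illustration, and awk's `length` counts characters, which equals bytes only for ASCII input.

```shell
# Illustration only: group documents into batches capped by total size, not by count.
limit=40   # assumed cap in bytes, far smaller than any realistic bulk size

out=$(printf '%s\n' \
  '{"f":1}' \
  '{"title":"a much longer document body"}' \
  '{"f":2}' \
  '{"f":3}' |
awk -v limit="$limit" '
  {
    n = length($0) + 1                  # document size plus its trailing newline
    if (bytes > 0 && bytes + n > limit) {
      batch++                           # current batch is full: start a new one
      bytes = 0
    }
    bytes += n
    print batch + 1, n                  # batch number, document size
  }')

echo "$out"
```

Each output line is a batch number followed by the document's size. The large document ends up in a batch of its own, while the two small documents after it share one.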
## Install
You'll need Java 8+. Run `java -version` to make sure.
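If you would rather script the check, the sketch below extracts the major version from the `java -version` output. The helper name `java_major` is hypothetical, and the parsing assumes the usual version strings such as `1.8.0_281` or `11.0.2`.

```shell
# Hypothetical helper: map a Java version string to its major version,
# e.g. "1.8.0_281" -> 8 and "11.0.2" -> 11.
java_major() {
  case $1 in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# java -version writes to stderr; the quoted token on the "version" line is the version string.
if command -v java >/dev/null 2>&1; then
  v=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$v")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $v is new enough"
  else
    echo "Java $v is too old (need 8+)"
  fi
fi
```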
### Unix
Download `stream2es` and make it executable:
```
% curl -O download.elasticsearch.org/stream2es/stream2es; chmod +x stream2es
```
### Windows
```
> curl -O download.elasticsearch.org/stream2es/stream2es
> java -jar stream2es help
```
# Usage
## stdin
By default, `stream2es` reads JSON documents from stdin.
```
% echo '{"f":1}' | stream2es
2014-10-08T12:29:56.318-0500 INFO 00:00.116 8.6d/s 0.4K/s (0.0mb) indexed 1 streamed 1 errors 0
%
```
If you want more logging, set `--log debug`. If you don't want any output, set `--log warn`.
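To stream more than one document, you can redirect a file into `stream2es`. The sketch below assumes one JSON document per line, which this README does not state explicitly, and it only invokes `stream2es` if it is found on your `PATH`.

```shell
# Write three small JSON documents, one per line (the one-per-line layout is an assumption).
docs=$(mktemp)
printf '%s\n' '{"f":1}' '{"f":2}' '{"f":3}' > "$docs"

# Sanity-check the line count before streaming.
count=$(wc -l < "$docs" | tr -d ' ')
echo "prepared $count docs in $docs"

# Stream the file with debug logging, but only if stream2es is installed.
if command -v stream2es >/dev/null 2>&1; then
  stream2es --log debug < "$docs"
fi
```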
## Wikipedia
Index the latest Wikipedia article dump.
```
% stream2es wiki --target http://localhost:9200/tmp --log debug
create index http://localhost:9200/tmp
stream wiki from http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
^Cstreamed 1158 docs 1082 bytes xfer 15906901 errors 0
```
If you're at a café or want to use a local copy of the dump, supply `--source`:
```
% ./stream2es wiki --max-docs 5 --source /d/data/enwiki-20121201-pages-articles.xml.bz2
```
Note that if you live-stream the WMF-hosted dump, it will cut off after a while. Grab a torrent and index it locally if you need more than a few thousand docs.
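When scripting against a local dump, it can help to fail early if the file is missing. The wrapper below is a sketch: `index_local_dump` is a hypothetical helper name, and it uses only the `wiki`, `--max-docs`, and `--source` arguments shown above.

```shell
# Hypothetical helper: index the first few docs of a local dump,
# refusing to run if the file or the tool is missing.
index_local_dump() {
  dump=$1
  if [ ! -r "$dump" ]; then
    echo "dump not readable: $dump" >&2
    return 1
  fi
  if ! command -v stream2es >/dev/null 2>&1; then
    echo "stream2es not found on PATH" >&2
    return 2
  fi
  stream2es wiki --max-docs 5 --source "$dump"
}
```

Call it with the path to your copy of the dump, for example `index_local_dump /d/data/enwiki-20121201-pages-articles.xml.bz2`.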
## Generator
`stream2es` can fuzz data for you. It can create blank documents, or documents with integer fields, or documents with string fields if you supply a dictionary.
Blank documents are easy:
```
stream2es generator
```
Integer fields need a size. This template gives you a single field with values between `0` and `127`, inclusive.
```
stream2es generator --fields f1:int:128
```
To add a string, we need to add a template for it, and a file of newline-separated lines of text. Given a field template of `NAME:str:N`, `stream2es` will select `N` random words from the dictionary for each field.
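One way to prepare the dictionary is to write it to a temporary file first. This is a sketch: passing a regular file path to `--dictionary` is an assumption, since the examples in this section use process substitution and `/dev/stdin`, and the command only runs if `stream2es` is on your `PATH`.

```shell
# Build a small dictionary file: one candidate word per line.
dict=$(mktemp)
printf '%s\n' foo bar baz > "$dict"

# Generate five documents with an int field and a two-word string field.
# Assumption: --dictionary accepts a regular file path.
if command -v stream2es >/dev/null 2>&1; then
  stream2es generator --fields f1:int:128,f2:str:2 --dictionary "$dict" --max-docs 5
fi
```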
```
# zsh
% stream2es generator --fields f1:int:128,f2:str:2 --dictionary <(/bin/echo -e "foo\nbar\nbaz")
#### same as:
% stream2es generator --fields f1:int:128,f2:str:2 --dictionary /dev/stdin --max-docs 5 <
```
Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.