ntto

Category: Web development
Development tool: Go
Uploaded: 2019-08-05 14:28:57
Uploader: sh-1993
Description: Small n-triples to line delimited JSON converter and prefix cutter.

File list:
.travis.yml (42, 2019-08-05)
LICENSE (1156, 2019-08-05)
Makefile (967, 2019-08-05)
RULES (800, 2019-08-05)
cmd/ (0, 2019-08-05)
cmd/ntto/ (0, 2019-08-05)
cmd/ntto/ntto.go (4381, 2019-08-05)
common.go (3763, 2019-08-05)
common_test.go (5786, 2019-08-05)
debian/ (0, 2019-08-05)
debian/ntto/ (0, 2019-08-05)
debian/ntto/DEBIAN/ (0, 2019-08-05)
debian/ntto/DEBIAN/control (221, 2019-08-05)
go.mod (37, 2019-08-05)
packaging/ (0, 2019-08-05)
packaging/buildrpm.sh (814, 2019-08-05)
packaging/ntto.spec (1612, 2019-08-05)
rules.go (10238, 2019-08-05)

ntto
====

[![Project Status: Inactive – The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.](https://www.repostatus.org/badges/latest/inactive.svg)](https://www.repostatus.org/#inactive)

Minimal n-triples toolkit. It can:

* shrink n-triples by applying namespace abbreviations (given some rules)
* convert n-triples to line delimited JSON (.ldj)

To list the abbreviation rules, run:

    $ ntto -d

To create an abbreviated NT file from an NT file, run:

    $ ntto -o OUTPUT.NT -a FILE.nt

To create an abbreviated JSON file from an NT file, run:

    $ ntto -a -j FILE.nt > OUTPUT.LDJ

To create an abbreviated JSON file from an NT file while ignoring conversion errors, run:

    $ ntto -a -j -i FILE.nt > OUTPUT.LDJ

To create an abbreviated JSON file from an NT file while ignoring conversion errors and using a custom RULES file, run:

    $ ntto -r RULES -a -j -i FILE.nt > OUTPUT.LDJ

Installation
------------

RPM and DEB packages can be found under [releases](https://github.com/miku/ntto/releases).

With a proper Go setup, a

    $ go get github.com/miku/ntto/cmd/ntto

should work as well.

Usage
-----

    $ ntto
    Usage: ntto [OPTIONS] FILE
      -a    abbreviate n-triples using rules
      -c    dump constructed sed command and exit
      -cpuprofile string
            write cpu profile to file
      -d    dump rules and exit
      -i    ignore conversion errors
      -j    convert nt to json
      -n string
            string to indicate empty string replacement (default "")
      -o string
            output file to write result to
      -r string
            path to rules file, use built-in if none given
      -v    prints current version and exits
      -w int
            parallelism measure (default 4)

Mode of operation
-----------------

`ntto` takes a RULES file (alternatively uses some [hardwired](https://github.com/miku/ntto/blob/master/rules.go) rules) to abbreviate common prefixes in an n-triple file. `ntto` does not do the replacements itself, but outsources them to external programs, like `replace` or `perl`.
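
To illustrate what abbreviation does to a single line, here is a minimal in-process sketch in Go. It is not how `ntto` works internally (as noted, the replacement is delegated to external programs); the `Rule` type, the `Abbreviate` function and the `shortcut:` output form are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Rule maps a shortcut (e.g. "dbp") to the URI prefix it stands for.
// The type is hypothetical and only serves this illustration.
type Rule struct {
	Shortcut string
	Prefix   string
}

// Abbreviate replaces every known prefix in line with "shortcut:".
// Rules are ordered longest prefix first, so a more specific prefix
// wins over a shorter one that it starts with.
func Abbreviate(line string, rules []Rule) string {
	sorted := make([]Rule, len(rules))
	copy(sorted, rules)
	sort.SliceStable(sorted, func(i, j int) bool {
		return len(sorted[i].Prefix) > len(sorted[j].Prefix)
	})
	pairs := make([]string, 0, 2*len(sorted))
	for _, r := range sorted {
		pairs = append(pairs, r.Prefix, r.Shortcut+":")
	}
	// strings.NewReplacer compares old strings in argument order.
	return strings.NewReplacer(pairs...).Replace(line)
}

func main() {
	rules := []Rule{
		{Shortcut: "dbp", Prefix: "http://dbpedia.org/resource/"},
		{Shortcut: "rdfs", Prefix: "http://www.w3.org/2000/01/rdf-schema#"},
	}
	line := `<http://dbpedia.org/resource/Leipzig> <http://www.w3.org/2000/01/rdf-schema#label> "Leipzig" .`
	fmt.Println(Abbreviate(line, rules))
}
```

Sorting by prefix length matters for rule sets where one prefix extends another, such as `http://d-nb.info/standards/vocab/gnd/` and `http://d-nb.info/standards/vocab/gnd/geographic-area-code#`.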

With the help of `replace` ntto can shorten up to 3M lines per second. The resulting file size can be up to 50% of the size of the original file.

Example rules file
------------------

    $ cat RULES
    # example rules file
    dbp     http://dbpedia.org/resource/
    gnd     http://d-nb.info/gnd/
    dnbes   http://d-nb.info/standards/elementset/gnd#
    dnbac   http://d-nb.info/standards/vocab/gnd/geographic-area-code#
    dnbv    http://d-nb.info/standards/vocab/gnd/
    viaf    http://viaf.org/viaf/
    frbr    http://rdvocab.info/uri/schema/FRBRentitiesRDA/
    rdgr    http://rdvocab.info/ElementsGr2/

    # empty lines are ignored, as are comments
    foaf    http://xmlns.com/foaf/0.1/
    rdf     http://www.w3.org/1999/02/22-rdf-syntax-ns#
    rdfs    http://www.w3.org/2000/01/rdf-schema#
    schema  http://schema.org/
    dc      http://purl.org/dc/elements/1.1/
    dcterms http://purl.org/dc/terms/

Performance data point
----------------------

    $ wc -l file.nt
    114171541

    $ time ntto -o output.nt -a file.nt

    real    1m51.202s
    user    1m3.626s
    sys     0m13.602s

    $ time ntto -a -j file.nt > output.ldj

    real    15m47.872s
    user    16m19.516s
    sys     2m3.013s

Sometimes, less is more, but YMMV:

    $ time ntto -w 2 -a -j file.nt > output.ldj

    real    12m3.619s
    user    15m17.422s
    sys     2m14.430s
