aerobio

Category: Biomedical technology
Development tool: Clojure
File size: 0KB
Downloads: 0
Upload date: 2023-08-21 18:19:20
Uploader: sh-1993
Description: Extensible full DAG streaming computation server with services and jobs for RNA-Seq, Tn-Seq, WG-Seq and Term-Seq.

File list:
Jobs/ (0, 2023-11-06)
Jobs/make-bam-bai-job-template.clj (143, 2023-11-06)
Jobs/make-bam-bai-paired-job-template.clj (150, 2023-11-06)
Jobs/rnaseq-compare-job-template.clj (144, 2023-11-06)
Jobs/rnaseq-phase-0-job-template.clj (144, 2023-11-06)
Jobs/rnaseq-phase-0b-job-template.clj (146, 2023-11-06)
Jobs/rnaseq-phase-0c-job-template.clj (146, 2023-11-06)
Jobs/rnaseq-phase-0d-job-template.clj (146, 2023-11-06)
Jobs/rnaseq-phase-1-job-template.clj (144, 2023-11-06)
Jobs/rnaseq-phase-2-job-template.clj (144, 2023-11-06)
Jobs/rnaseq-phase-2noasm-job-template.clj (155, 2023-11-06)
Jobs/rnaseq-star-phase-1-job-template.clj (149, 2023-11-06)
Jobs/rnaseq-xcompare-job-template.clj (146, 2023-11-06)
Jobs/termseq-phase-0-job-template.clj (145, 2023-11-06)
Jobs/termseq-phase-0b-job-template.clj (147, 2023-11-06)
Jobs/termseq-phase-0c-job-template.clj (147, 2023-11-06)
Jobs/termseq-phase-1-job-template.clj (145, 2023-11-06)
Jobs/termseq-phase-2-rnaseq-job-template.clj (144, 2023-11-06)
Jobs/tnseq-aggregate-job-template.clj (152, 2023-11-06)
Jobs/tnseq-bt1-phase-1-job-template.clj (151, 2023-11-06)
Jobs/tnseq-bt2-phase-1-job-template.clj (143, 2023-11-06)
Jobs/tnseq-compare-job-template.clj (149, 2023-11-06)
Jobs/tnseq-phase-0-job-template.clj (143, 2023-11-06)
Jobs/tnseq-phase-0b-job-template.clj (145, 2023-11-06)
Jobs/tnseq-phase-0c-job-template.clj (145, 2023-11-06)
Jobs/tnseq-phase-0d-job-template.clj (145, 2023-11-06)
Jobs/tnseq-phase-1-job-template.clj (151, 2023-11-06)
Jobs/tnseq-phase-2-job-template.clj (143, 2023-11-06)
Jobs/tnseq-phase-2b-job-template.clj (144, 2023-11-06)
Jobs/wgseq-phase-0-job-template.clj (143, 2023-11-06)
Jobs/wgseq-phase-0b-job-template.clj (145, 2023-11-06)
Jobs/wgseq-phase-0c-job-template.clj (145, 2023-11-06)
Jobs/wgseq-phase-2-job-template.clj (143, 2023-11-06)
LICENSE (1067, 2023-11-06)
Scripts/ (0, 2023-11-06)
Scripts/aggregate.pl (9698, 2023-11-06)
Scripts/calc-loess-fitness.py (47499, 2023-11-06)
Scripts/calc_fitness.py (43828, 2023-11-06)
Scripts/cummerbund.r (1173, 2023-11-06)
... ...

# Aerobio

An extensible full DAG streaming computation server with services and jobs for RNA-Seq, Tn-Seq, WG-Seq and Term-Seq.

**Aerobio** is a system for creating _services_ and connecting them together to form _jobs_. A _service_ defines a piece of computation and may be _implemented_ by external tooling or by code specific to the computation involved. A _job_ defines a directed acyclic graph (DAG) composed of service nodes and data stream edges. Nodes may have multiple inputs and multiple outputs. Both services and jobs are specified entirely by data (EDN or JSON). Jobs are _realized_ by instantiating the node processing as (OS level) threads of execution and the streaming connections as asynchronous channels. A realized job is roughly equivalent to the notion of a 'pipeline'.

While the system is general enough to apply to a range of domains, its intended target is high throughput sequencing (HTS) analysis of RNA-Seq, Tn-Seq, WG-Seq, and Term-Seq data sets. To that end, experiment data processing is specified by sets of spreadsheets. These spreadsheets are simple in structure and use the terminology of biologists. They are used to automatically construct jobs that perform all necessary analysis for all samples and all comparisons. This includes:

* Conversion of sequencer output to fastq data sets
* Quality filtering of the data sets
* Splitting of the data sets according to samples
* Alignment of all sample reads, creating sorted and indexed BAM and MAP files
* Gene counts for all samples, creating tables for further processing
* Differential Gene Expression analysis for RNA-Seq
* Fitness and aggregation analysis for Tn-Seq
* SNP analysis for WG-Seq
* Plots and charts for all analyses

All output is placed in configurable output locations according to canonical and consistent naming conventions. Further, all data is directly accessible via supplied web interfaces.
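
To make the service/job model more concrete, here is a minimal sketch in Python of how a job described purely as data might be realized as threads connected by queues. It is an illustration of the general idea only, not Aerobio's implementation or spec format: the `nodes`/`edges` layout, the `source` and `sink` pseudo-nodes, the service names, and the `realize` function are all hypothetical, and Aerobio itself is written in Clojure.

```python
import json
import queue
import threading

DONE = object()  # sentinel marking the end of a stream

# Hypothetical service registry: service name -> function applied per item.
# Returning None means "drop this item".
SERVICES = {
    "drop_n": lambda read: None if "N" in read else read,
    "length": len,
    "lowercase": str.lower,
}

# Hypothetical job spec expressed as plain data (JSON here).
# "source" and "sink" are pseudo-nodes standing for job input and output.
JOB_SPEC = json.loads("""
{
  "nodes": {"filter": "drop_n", "lengths": "length", "lower": "lowercase"},
  "edges": [["source", "filter"],
            ["filter", "lengths"], ["filter", "lower"],
            ["lengths", "sink"], ["lower", "sink"]]
}
""")


def realize(spec, services, source_items):
    """Run the job: one thread per node, one unbounded queue per edge."""
    inputs, outputs = {}, {}
    for src, dst in spec["edges"]:
        q = queue.Queue()
        outputs.setdefault(src, []).append(q)
        inputs.setdefault(dst, []).append(q)

    def run_node(name):
        fn = services[spec["nodes"][name]]
        # Drain each input stream in edge order, fanning results out
        # to every outgoing edge.
        for q in inputs.get(name, []):
            while (item := q.get()) is not DONE:
                result = fn(item)
                if result is not None:
                    for out in outputs.get(name, []):
                        out.put(result)
        for out in outputs.get(name, []):
            out.put(DONE)

    threads = [threading.Thread(target=run_node, args=(n,))
               for n in spec["nodes"]]
    for t in threads:
        t.start()

    # Feed the job input, then close the stream.
    for q in outputs["source"]:
        for item in source_items:
            q.put(item)
        q.put(DONE)

    # Collect everything that reaches the sink, one input stream at a time.
    results = []
    for q in inputs["sink"]:
        while (item := q.get()) is not DONE:
            results.append(item)
    for t in threads:
        t.join()
    return results
```

Calling `realize(JOB_SPEC, SERVICES, ["ACGT", "ANNT", "GG"])` returns `[4, 2, "acgt", "gg"]`: the filter node drops the read containing `N`, and its output fans out to two downstream nodes whose streams are collected in edge order. The sketch does not check that the graph is acyclic and uses unbounded queues to avoid back-pressure deadlocks; a real streaming server would need to handle both.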
