pasta: PASTA (Practical Alignment using SATé and TrAnsitivity)

This is an implementation of the PASTA (Practical Alignment using Saté and TrAnsitivity) algorithm published in [RECOMB-2014](http://link.springer.com/chapter/10.1007%2F978-3-319-05269-4_15#) and JCB:

* Mirarab S, Nguyen N, Warnow T. PASTA: ultra-large multiple sequence alignment. Sharan R, ed. Res Comput Mol Biol. 2014:177-191.
* Mirarab S, Nguyen N, Guo S, Wang L-S, Kim J, Warnow T. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences. J Comput Biol. 2015;22(5):377-386. [doi:10.1089/cmb.2014.0156](http://online.liebertpub.com/doi/abs/10.1089/cmb.2014.0156).

The latest version includes a new decomposition technique described here:

* Balaban, Metin, Niema Moshiri, Uyen Mai, and Siavash Mirarab. “TreeCluster: Clustering Biological Sequences Using Phylogenetic Trees.” BioRxiv, 2019, 591388. doi:10.1101/591388.

#### Contact:

All questions and inquiries should be addressed to our user email group: `pasta-users@googlegroups.com`. Please check our [Tutorial](pasta-doc/pasta-tutorial.md) and [previous posts](https://groups.google.com/forum/#!forum/pasta-users) before sending new requests.

#### Developers

* The code and the algorithm are developed by Siavash Mirarab and Tandy Warnow, with help from Nam Nguyen. The latest version of the code includes a new code decomposition designed and implemented by [Uyen Mai](https://github.com/uym2).
* The current PASTA code is heavily based on the [SATé code](http://phylo.bio.ku.edu/software/sate/sate.html) developed by Mark Holder's group at KU. Refer to the sate-doc directory for documentation of the SATé code, including the list of authors, license, etc.
* [Niema Moshiri](https://github.com/niemasd) has contributed to the port to dendropy 4 and python 3 and to the Docker image.

**Documentation**: In addition to this README file, you can consult our [Tutorial](pasta-doc/pasta-tutorial.md).

INSTALLATION
===

You have four options for installing PASTA.
- **Windows:** If you have a Windows machine, using the Docker image or the Virtual Machine (VM) image we provide is currently your only option. Of the two, Docker is the preferred method.
- **Linux:** If you have Linux (or another \*nix system), you can still use Docker or the VM, but we recommend downloading the code from GitHub and installing it.
- **MAC:** We have four options for MAC: VM, Docker, installing from the code, and downloading the .dmg file. If you mostly use the GUI, then the MAC .dmg file is a good option (although it can sometimes lag behind the latest code). Otherwise, we recommend either Docker or the code.

### 1. From pre-built MAC image file

1. Download the MAC application `.dmg` file from [here](https://sites.google.com/eng.ucsd.edu/datasets/alignment/pastaupp). Use the latest version available.
2. Open the .dmg file and copy its content to your preferred destination (do not run PASTA from the image itself).
3. Simply run the PASTA app from where you copied it.

If the app does not work, let us know and we will try to fix the issue. You can also try first installing PASTA from the source code (see below) and then running `./make-app.sh` from the pasta directory. This will create an app under the `dist` directory, which you should be able to run and copy to any other location.

### 2. From Source Code

The current version of PASTA has been developed and tested entirely on Linux and MAC. Windows is not currently supported (future versions may or may not support Windows).

You need to have:

- [Python](https://www.python.org) (version 2.7 or later, including python 3)
- [Dendropy](http://packages.python.org/DendroPy/) (but the setup script should automatically install dendropy for you if you don't have it)
- [Java](https://www.java.com) (only required for using OPAL)
- [wxPython](http://www.wxpython.org/) (only required if you want to use the GUI; the setup script does not automatically install it)

**Installation steps**:

1. Open a terminal, create a directory where you want to keep PASTA, and go to this directory. For example:

   ```bash
   mkdir ~/pasta-code
   cd ~/pasta-code
   ```

2. Clone the PASTA code repository from our [github repository](https://github.com/smirarab/pasta). For example, you can use:

   ```bash
   git clone https://github.com/smirarab/pasta.git
   ```

   If you don't have git, you can directly download a [zip file from the repository](https://github.com/smirarab/pasta/archive/master.zip) and decompress it into your desired directory.

3. A. Clone the relevant "tools" directory (these are also forked from the SATé project). There are different repositories for [linux](https://github.com/smirarab/sate-tools-linux) and [MAC](https://github.com/smirarab/sate-tools-mac). You can use

   ```bash
   git clone https://github.com/smirarab/sate-tools-linux.git  # for Linux
   ```

   or

   ```bash
   git clone https://github.com/smirarab/sate-tools-mac.git  # for MAC
   ```

   Or you can directly download these as zip files for [Linux](https://github.com/smirarab/sate-tools-linux/archive/master.zip) or [MAC](https://github.com/smirarab/sate-tools-mac/archive/master.zip) and decompress them in your target directory (e.g. `pasta-code`).

   * Note that the tools directory and the PASTA code directory should be under the same parent directory.
   * When you use the zip files instead of `git`, decompressing the zip file may give you a directory called `sate-tools-mac-master` or `sate-tools-linux-master` instead of `sate-tools-mac` or `sate-tools-linux`. You need to rename these directories to remove the `-master` part.
   * Those with 32-bit Linux machines need to be aware that the master branch has 64-bit binaries. 32-bit binaries are provided in the `32bit` branch of the `sate-tools-linux` git project (so download [this zip file](https://github.com/smirarab/sate-tools-linux/archive/32bit.zip) instead).

   B. (Optional) Only if you want to use MAFFT-Homologs within PASTA:

   * Run `cd sate-tools-linux` or `cd sate-tools-mac`.
   * Use `git clone https://github.com/koditaraszka/pasta-databases` or download directly at `https://github.com/koditaraszka/pasta-databases.git`.
   * Be sure to leave this directory (`cd ..`) before starting the next step.

4. `cd pasta` (or `cd pasta-master` if you used the zip file instead of cloning the git repository).

5. Then run:

   ```bash
   sudo python setup.py develop
   ```

   If you don't have root access, use:

   ```bash
   python setup.py develop --user
   ```

   **Common Problems:**

   * `Could not find SATé tools bundle directory`: this means you don't have the right tools directory at the right location. Maybe you downloaded MAC instead of Linux? Or maybe you didn't put the directory in the parent directory of where the pasta code is? Most likely, you used the zip files and forgot to remove the `-master` from the directory name. Run `mv sate-tools-mac-master sate-tools-mac` on MAC or `mv sate-tools-linux-master sate-tools-linux` on Linux to fix this issue.
   * The `setup.py` script is supposed to install setuptools for you if you don't have it. This sometimes works and sometimes doesn't. If you get an error with a message like `invalid command 'develop'`, it means that setuptools is not installed. To solve this issue, you can manually install [setup tools](https://pypi.python.org/pypi/setuptools#installation-instructions). For example, on Linux, you can run `curl https://bootstrap.pypa.io/ez_setup.py -o - | sudo python` (but note there are other ways of installing setuptools as well).

6. PASTA now includes additional aligners for Linux and MAC users: mafft-ginsi, mafft-homologs, contralign (version 1), and probcons. In order to use mafft-homologs and contralign, you must set the environment variable `CONTRALIGN_DIR=/dir/to/sate-tools-linux`. You can run `export CONTRALIGN_DIR=/dir/to/sate-tools-linux` in your shell or add that same `export` line to `~/.bashrc`.
* To use these aligners, add the following to you
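
The layout requirement from step 3 (a `sate-tools-linux` or `sate-tools-mac` directory next to the PASTA code directory, with no `-master` suffix) can be checked before running `setup.py`. The following is a minimal sketch and is not part of PASTA: the function name `check_layout` and its return values are hypothetical, and it only looks at directory names rather than reproducing whatever check `setup.py` itself performs.

```python
import os

# Directory names the installation steps expect next to the PASTA code.
TOOLS_NAMES = ("sate-tools-linux", "sate-tools-mac")


def check_layout(pasta_dir):
    """Inspect the parent directory of pasta_dir for a tools directory.

    Returns a (status, name) tuple (hypothetical convention):
      ("ok", name)      a correctly named tools directory exists
      ("rename", name)  only a "-master" directory from a zip download exists
      ("missing", None) no tools directory was found
    """
    parent = os.path.dirname(os.path.abspath(pasta_dir))
    # A correctly named tools directory takes priority.
    for name in TOOLS_NAMES:
        if os.path.isdir(os.path.join(parent, name)):
            return ("ok", name)
    # Zip downloads decompress to "<name>-master" and must be renamed.
    for name in TOOLS_NAMES:
        candidate = name + "-master"
        if os.path.isdir(os.path.join(parent, candidate)):
            return ("rename", candidate)
    return ("missing", None)
```

For example, `check_layout(os.path.expanduser("~/pasta-code/pasta"))` returning a `"rename"` status corresponds to the `Could not find SATé tools bundle directory` problem described above, which the `mv` command resolves.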