OMPU

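Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: C/C++
File size: 666KB
Downloads: 1
Upload date: 2018-03-14 09:22:22
Uploader: LLIVsjv_541
Description: Source code for multiple sequence alignment, i.e. the Clustal algorithm, which is built on the idea of pairwise alignment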

File list:
clustalv.doc (77555, 2000-06-07)
clustalw.doc (34148, 2000-06-07)
clustalx.html (73184, 2000-06-07)
clustalw.h (7838, 2000-06-05)
dayhoff.h (2699, 2000-06-05)
general.h (1125, 2000-06-05)
matrices.h (40730, 2000-06-05)
ncbirc.h (93, 1997-11-07)
param.h (10769, 2000-06-05)
xmenu.h (7674, 2000-06-05)
calcprf2.c (1785, 2000-06-05)
calctree.c (22698, 2000-06-05)
ccalcgapcoeff.c (14651, 2000-06-05)
clustalw.c (2515, 2000-06-05)
clustalx.c (2934, 2000-06-05)
gcgcheck.c (320, 2000-06-05)
hcalcprf1.c (3085, 2000-06-05)
interface.c (109811, 2000-06-05)
malign.c (17781, 2000-06-05)
pairalign.c (15450, 2000-06-05)
prfalign.c (32370, 2000-06-05)
random.c (1727, 2000-06-05)
readmat.c (10990, 2000-06-05)
readtree.c (20367, 2000-06-05)
sequence.c (35475, 2000-06-05)
showpair.c (10344, 2000-06-05)
tamenu.c (40118, 2000-06-05)
trees.c (44920, 2000-06-05)
util.c (8269, 2000-06-05)
Xalnscore.c (2405, 2000-06-05)
xcolor.c (30522, 2000-06-05)
xdisplay.c (55950, 2000-06-05)
xmenu.c (122249, 2000-06-05)
xscore.c (25759, 2000-06-05)
xutils.c (26449, 2000-06-05)
clustalx.hlp (64103, 2000-06-07)
clustalx.exe (478720, 2005-02-21)
NJPLOT.EXE (220842, 2005-02-21)
njplotWIN95.exe (220634, 2005-02-21)
... ...

******************************************************************************

               CLUSTAL X Multiple Sequence Alignment Program
                        (version 1.81, March 2000)

******************************************************************************

This README contains notes on version CHANGES and help with INSTALLATION.

Clustal X provides a new window-based user interface to the Clustal W
multiple alignment program. It uses the Vibrant multi-platform user interface
development library, developed by the National Center for Biotechnology
Information (Bldg 38A, NIH, 8600 Rockville Pike, Bethesda, MD 20894) as part
of their NCBI SOFTWARE DEVELOPMENT TOOLKIT. The toolkit is available by
anonymous ftp from ncbi.nlm.nih.gov

Please e-mail bug reports/complaints/suggestions (polite if possible) to
Julie Thompson at julie@igbmc.u-strasbg.fr or Toby Gibson at
gibson@embl-heidelberg.de

******************************************************************************

          POLICY ON COMMERCIAL DISTRIBUTION OF CLUSTAL W and X

Clustal W and X are freely available to the user community. However,
Clustal W is increasingly being distributed as part of commercial sequence
analysis packages. To help us safeguard future maintenance and development,
commercial distributors of Clustal X must take out a non-exclusive licence.
Anyone wishing to commercially distribute version 1.81 of Clustal X should
contact the authors unless they have previously taken out a licence.

******************************************************************************

Changes since CLUSTAL X Version 1.8
-----------------------------------

1. ClustalX now returns error codes for some common errors when exiting.
   This may be useful for people who run clustalx automatically from within
   a script. Error codes are:

        1   bad command line option
        2   cannot open sequence file
        3   wrong format in sequence file
        4   sequence file contains only 1 sequence (for multiple alignments)

2. Alignments can now be saved in Nexus format, for compatibility with PAUP,
   MacClade etc. For a description of the Nexus format, see:
   Maddison, D. R., D. L. Swofford and W. P. Maddison. 1997. NEXUS: an
   extensible file format for systematic information.
   Systematic Biology 46:590-621.

3. Phylogenetic trees can also be saved in Nexus format.

4. A bug causing ClustalX to crash during cut-and-paste operations has been
   fixed.

5. A bug on PC systems, causing an error message when writing to files with
   space characters in the filename, has been fixed.

6. The Quality Curve is now displayed as a bar chart, instead of a line plot.
   (Thanks to Michele Clamp, michele@ebi.ac.uk, who used this format in the
   JalView editor.)

7. A bug in the 'Save Profile' option, causing the default profile filename
   to be lost, has been fixed.

8. A ClustalX icon has been designed for MAC and PC systems.

Changes since CLUSTAL X Version 1.65b
-------------------------------------

1. Some work has been done to automatically select the optimal parameters
   depending on the set of sequences to be aligned. The Gonnet series of
   residue comparison matrices are now used by default. The Blosum series
   remains as an option. The default gap extension penalty for proteins has
   been changed to 0.2 (was 0.05). The 'delay divergent sequences' option has
   been changed to 30% residue identity (was 40%).

2. The default parameters used when the 'Negative matrix' option is selected
   have been optimised. This option may help when the sequences to be aligned
   are not superposable over their whole lengths (e.g. in the presence of
   N/C terminal extensions).

3. An option has been added to save the quality scores displayed underneath
   the sequence window to a text file.

4. The 'Hide Low-scoring segments' option has been moved from the Low-scoring
   parameter window to the Quality menu, and has been changed to
   'Show Low-scoring segments'.

5. An option has been added to allow the user to search for a string in the
   sequences.

6. An option has been added to the postscript output to print on US Letter
   size paper.

7. A bug in the display of the message at the bottom of the window, causing
   the text to disappear when the window was resized, has been fixed.

8. The font for the Help window has been changed to Courier.

9. A bug in the calculation of phylogenetic trees for 2 sequences has been
   fixed.

10. A command line option has been added to turn off the sequence weighting
    calculation.

11. The phylogenetic tree calculation now ignores any ambiguity codes in the
    sequences.

12. A bug in the memory access during the calculation of profiles has been
    fixed. (Thanks to Haruna Cofer at SGI).

13. A bug has been fixed in the 'transition weight' option for nucleic acid
    sequences. (Thanks to Chanan Rubin at Compugen).

14. An option has been added to allow the user to read in a series of residue
    comparison matrices from a file.

15. The MSF output file format has been changed. The sequence weights
    calculated by ClustalX are now included in the header.

16. Two bugs in the FAST/APPROXIMATE pairwise alignments have been fixed. One
    involved the alignment of new sequences to an existing profile using the
    fast pairwise alignment option; the second was caused by changing the
    default options for the fast pairwise alignments.

17. A bug in the alignment of a small number of sequences has been fixed.
    Previously a Guide Tree was not calculated for fewer than 4 sequences.

18. Several bugs affecting use of secondary structure masks in Clustal X (but
    not in Clustal W) have been fixed.

Changes since Version 1.5b
--------------------------

1. The window displayed under MS Windows has previously been a fixed size.
   The window can now be resized by dragging the window frame.

2. An option has been added to read in a series of comparison matrices from
   a file. This option is only applicable for protein sequences. For details
   of the file format, see the on-line documentation.

3. A new DNA comparison matrix has been added. This is the default scoring
   matrix used by BESTFIT for the comparison of nucleic acid sequences. X's
   and N's are treated as matches to any IUB ambiguity symbol. All matches
   score 1.9; all mismatches for IUB symbols score 0. The previous system
   used by ClustalW, in which matches score 1.0 and mismatches score 0,
   remains as an option. All matches for IUB symbols will also score 0.

4. You can now read a comparison matrix for DNA sequences from a file. The
   matrix file should be in the same format as for the Blast program.

5. The 'Reset gaps before alignment' option has been changed to 'Reset new
   gaps before alignment'. A new option 'Reset ALL gaps before alignment' has
   been added.

   RESET NEW GAPS BEFORE ALIGNMENT will remove any new gaps introduced into
   the sequences during multiple alignment if you wish to change the
   parameters and try again.

   RESET ALL GAPS BEFORE ALIGNMENT will remove all gaps in the sequences,
   including gaps which were read in from the sequence input file.

6. The 'Realign Residue Range' option has been changed. By default, gap
   opening and extension penalties are now applied to the ends of the
   alignment range in order to penalise terminal gaps. If the REALIGN SEGMENT
   END GAP PENALTIES option is switched off, gaps can be introduced at the
   ends of the residue range at no cost.

7. The MSF output file format has been changed. The sequence weights
   calculated by ClustalX are now included in the header.

8. Two bugs in the FAST/APPROXIMATE pairwise alignments have been fixed. One
   involved the alignment of new sequences to an existing profile using the
   fast pairwise alignment option; the second was caused by changing the
   default options for the fast pairwise alignments.

9. A bug in the postscript output file has been fixed. The residue numbers
   printed at the right hand side of the alignment were not always correct.

10. A bug in the alignment of a small number of sequences has been fixed.
    Previously a Guide Tree was not calculated for fewer than 4 sequences.

11. A bug which occurred after frequent cut-and-paste operations has been
    fixed.

12. A new file called clustalx.html contains an html'ised version of the
    on-line help. The file can be viewed using a World Wide Web viewer, such
    as Netscape.

New Features since ClustalW
---------------------------

1. A subset of sequences in an alignment may be selected and realigned to a
   profile made from the unselected sequences. This may be useful when trying
   to align very divergent sequences which have been badly aligned in the
   initial full multiple alignment.

2. A range of the sequence alignment can be selected for realignment. A new
   phylogenetic guide tree is built based only on the residue range selected.
   The selected residues are then aligned, and pasted back into the full
   sequence alignment. This may be useful for aligning small sections of the
   alignment which have been badly aligned in the full sequence alignment, or
   which have a very different guide tree structure from the tree built using
   the full sequences.

3. Clustal X provides a versatile coloring scheme for the sequence alignment
   display. The sequences (or profiles) are colored automatically when they
   are loaded. Sequences can be colored either by assigning a color to
   specific residues, or on the basis of an alignment consensus. In the
   latter case, the alignment consensus is calculated automatically, and the
   residues in each column are colored according to the consensus character
   assigned to the column. In this way, for example, conserved hydrophilic or
   hydrophobic positions can be highlighted.

4. An 'Alignment Quality Score' is plotted below the alignment. This is an
   estimate of the conservation of each column in the alignment. Highly
   conserved columns will have a high quality score; less conserved positions
   will be marked by a low score.

5. 'Exceptional' residues in the alignment that cause the low quality scores
   described above can be highlighted. These can be expected to occur at a
   moderate frequency in all the sequences because of their steady divergence
   due to the natural processes of evolution. However, clustering of
   highlighted residues is a strong indication of misalignment. Occasionally,
   highlighted residues may also point to regions of some biological
   significance.

6. Low-scoring segments in the alignment can be highlighted. The segments are
   defined as those regions which score negatively in a forward and backward
   summation of the alignment profile scores. See the online help for more
   details.

7. The new GCG9 MSF and RSF formats are now recognised as input formats for
   clustalx. The alignments cannot be written out in these formats, however.

The code has been tested on UNIX (SGI, SUN, DIGITAL) and Macintosh. Compiled
executables are provided for these systems. If you wish to recompile the
source files, you will first need to install the NCBI toolkit on your
machine. Then, to compile the program on UNIX, edit the makefile to point to
your NCBI include and library files, and type:

        make -f makefile.sun
or      make -f makefile.sgi
or      make -f makefile.osf

To run the program, type clustalx. A window is displayed with a pull-down
menu bar from which all functions can be selected and all alignment
parameters modified, if desired.

Documentation for ClustalW (clustalw.doc) is included in the directory.
Online help is also available for most options of Clustal X by selecting HELP
from the menu bar.

Help is also available on the WWW at

        www-igbmc.u-strasbg.fr/BioInfo/ClustalX/
        www-igbmc.u-strasbg.fr/BioInfo/ClustalW/
        www.U.arizona.edu/~schluter/ClustalW/index.html

INSTALLATION (for Unix, PC and MAC)
------------

UNIX
----

Executables are provided in the appropriate archives for Digital UNIX 4.0 on
Alphas, Sun OS 5.6, Silicon Graphics IRIX 6.2 and LINUX (libc6 must be
installed). If you wish to run on another platform, you will need to
recompile Clustal X for yourself.

The executable file clustalx should be copied to one of the directories
specified in your PATH environment variable. The files called *.par and
clustalx_help should also be copied to the same directory.

Recompiling ClustalX:

First of all, you need the NCBI Vibrant toolkit installed on your machine. If
this is not already done, you can get the toolkit by anonymous ftp to
ncbi.nlm.nih.gov.

You should then copy one of the makefiles supplied in the unix archives to
'makefile' and edit it, changing the NCBI_INC and NCBI_LIB paths for your
system. You make the program with:

        make -f makefile

This produces the executable file clustalx. You can then proceed with the
installation as described above.

MS WINDOWS
----------

We supply an executable file (clustalx.exe) which will run under MS Windows
(32 bit). The directory containing the executable (plus the files named *.par,
and clustalx.hlp) should be added to your path defined in the autoexec.bat
file.

Recompiling ClustalX:

First of all, you need the NCBI Vibrant toolkit installed on your machine. If
this is not already done, you can get the toolkit by anonymous ftp to
ncbi.nlm.nih.gov.

A makefile is supplied which can be used as a guide for recompiling the
ClustalX source code. You will need to edit it for your system. In particular,
the NCBI_INC and NCBI_LIB paths should point to your installation.

MAC
---

An executable program called clustalx is supplied for Power Macintoshes. For
68K machines, you will need to recompile the code yourself. The program may
need up to 10 MB of memory to run, depending on the number and length of your
sequences. The memory allocation can be adjusted with the Get Info (%I)
command from the Finder if you have problems. Just double click the executable
file name or icon and off you go (we hope).

The files *.par and clustalx_help should be stored in the same directory as
the clustalx program.

Recompiling ClustalX:

First of all, you need the NCBI Vibrant toolkit installed on your machine. If
this is not already done, you can get the toolkit by anonymous ftp to
ncbi.nlm.nih.gov.

We used the Metrowerks CodeWarrior C compiler to compile the ClustalX files,
but another ANSI C compiler should work. You need to compile all the *.c files
supplied in the archive, then link them together with the NCBI Toolkit
libraries 'ncbi' and 'vibrant'.

CLUSTAL REFERENCES
------------------

Details of algorithms, implementation and useful tips on usage of Clustal
programs can be found in the following publications:

Jeanmougin,F., Thompson,J.D., Gouy,M., Higgins,D.G. and Gibson,T.J. (19***)
Multiple sequence alignment with Clustal X. Trends Biochem Sci, 23, 403-5.

Thompson,J.D., Gibson,T.J., Plewniak,F., Jeanmougin,F. and Higgins,D.G. (1997)
The ClustalX windows interface: flexible strategies for multiple sequence
alignment aided by quality analysis tools.
Nucleic Acids Research, 24:4876-4882.

Higgins,D.G., Thompson,J.D. and Gibson,T.J. (1996)
Using CLUSTAL for multiple sequence alignments.
Methods Enzymol., 266, 383-402.

Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994)
CLUSTAL W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap penalties and
weight matrix choice. Nucleic Acids Research, 22:4673-4680.

Higgins,D.G., Bleasby,A.J. and Fuchs,R. (1992)
CLUSTAL V: improved software for multiple sequence alignment.
CABIOS 8,189-191.

Higgins,D.G. and Sharp,P.M. (1***9)
Fast and sensitive multiple sequence alignments on a microcomputer.
CABIOS 5,151-153.

Higgins,D.G. and Sharp,P.M. (1***8)
CLUSTAL: a package for performing multiple sequence alignment on a
microcomputer. Gene 73,237-244.
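
The README notes (Changes since CLUSTAL X Version 1.8, item 1) that ClustalX returns error codes on exit, which is meant to help people who run clustalx from a script. Below is a minimal sketch of how a wrapper script might interpret those codes. The four code meanings are taken from the README; the function names `describe_exit_code` and `run_and_report`, the treatment of 0 as success, and the handling of any other value as "unknown" are assumptions made for illustration. No clustalx command-line options are shown, because the README does not list them; the caller supplies the full command line.

```python
import subprocess
from typing import Sequence

# Exit codes documented in the README for ClustalX 1.81.
CLUSTALX_ERRORS = {
    1: "bad command line option",
    2: "cannot open sequence file",
    3: "wrong format in sequence file",
    4: "sequence file contains only 1 sequence (for multiple alignments)",
}


def describe_exit_code(code: int) -> str:
    """Translate a process exit code into a readable message."""
    if code == 0:
        # Assumption: 0 means success, as is conventional for command-line tools.
        return "success"
    # Codes outside the documented list are reported as unknown (assumption).
    return CLUSTALX_ERRORS.get(code, f"unknown error (exit code {code})")


def run_and_report(argv: Sequence[str]) -> str:
    """Run the caller-supplied command line and describe its exit status.

    argv is the complete command, e.g. the path to the clustalx executable
    followed by whatever options your installation accepts (hypothetical here).
    """
    result = subprocess.run(list(argv))
    return describe_exit_code(result.returncode)
```

A script could call `run_and_report` with its own command line and log the returned message, or branch on the raw exit code and use `describe_exit_code` only for reporting.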
