One of the main problems researchers face during their studies is converting among the various formats required by different programs.
Nowadays there are around 20 different Multiple Sequence Alignment (MSA) formats and subformats.
Here we present readAl, a tool for alignment format conversion among the most representative formats.
readAl has been implemented in the C++ programming language.
This program is part of the trimAl package and is used internally by trimAl to convert among different formats.
The simplest way to compile this program is:
1.- Move to the project folder
2.- Configure the project:
    > cmake .
3.- Compile the project:
    > make
    > make readal (if you only want readal to be compiled)
4.- Move or copy the binaries in the './bin/' folder to '/usr/local/bin' or '/usr/bin'
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, the last available version.
readal -in <inputfiles> -out <pattern> -formats [formats] [options]

    -h                  Show this information.
    -in <inputfiles>    Input files in several formats. Separated by spaces.
    -out <pattern>      Output file name (default STDOUT).
                        It will replace the tags
                            [in]        -> Original filename without extension.
                            [format]    -> Output's format name
                            [extension] -> Output's extension
    -formats            Formats you want the output to be converted to.
                        Available formats are CLUSTAL, FASTA, PIR, PHYLIP32, PHYLIP40, PHYLIP_PAML, NEXUS, MEGAI, MEGAS, HTML.
                        HTML is not an alignment format itself, but a colored report of the alignment files.
    -format             Print information about input file format and if sequences are aligned or not.
    -type               Print information about biological sequences datatype (e.g. nucleotides:dna, nucleotides:rna, aminoacids, etc)
    -info               Print information about sequences number, average sequence length, max & min sequence length
    -reverse            Output the reverse of sequences in input file.
    -shortNames         Shortens the names so they fit on certain formats
    -keepHeaders        Keeps the headers of the original format if it had any
Keep in mind that the following arguments may be discontinued at any time.
    -onlyseqs           Generate output with only residues from input file
    -html               Output residues colored according to their physicochemical properties. HTML file.
    -nbrf               Output file in NBRF/PIR format
    -mega               Output file in MEGA format
    -nexus              Output file in NEXUS format
    -clustal            Output file in CLUSTAL format
    -fasta              Output file in FASTA format
    -fasta_m10          Output file in FASTA format. Sequences name up to 10 characters.
    -phylip             Output file in PHYLIP/PHYLIP4 format
    -phylip_m10         Output file in PHYLIP/PHYLIP4 format. Sequences name up to 10 characters.
    -phylip_paml        Output file in PHYLIP format compatible with PAML
    -phylip_paml_m10    Output file in PHYLIP format compatible with PAML. Sequences name up to 10 characters.
    -phylip3.2          Output file in PHYLIP3.2 format
    -phylip3.2_m10      Output file in PHYLIP3.2 format. Sequences name up to 10 characters.
readal -in ./dataset/AA1.fas -out ./dataset/[in].output.[extension] -formats clustal
    -> Will produce ./dataset/AA1.output.clw

readal -in ./dataset/example1.clw -out ./dataset/[in].[format].[extension] -formats fasta phylip32 phylip40
    -> Will produce
       ./dataset/example1.FASTA.fasta
       ./dataset/example1.PHYLIP32.phy
       ./dataset/example1.PHYLIP40.phy

readal -in ./dataset/example1.clw -out ./dataset/[in]/[format].[extension] -formats fasta phylip32 phylip40
    -> Will produce
       ./dataset/example1/FASTA.fasta
       ./dataset/example1/PHYLIP32.phy
       ./dataset/example1/PHYLIP40.phy
       ONLY if ./dataset/example1/ already exists.

readal -in ./dataset/AA1.fas ./dataset/AA2.fas -out ./dataset/[in].output.[extension] -formats clustal pir
    -> Will produce
       ./dataset/AA1.output.clw
       ./dataset/AA2.output.clw
       ./dataset/AA1.output.pir
       ./dataset/AA2.output.pir

readal -in ./dataset/AA1.fas -format -type -info
    -> Will produce terminal output giving information about the AA1.fas alignment file

readal -in ./dataset/AA1.fas ./dataset/AA2.fas -out ./dataset/[in].output.[extension] -formats html
    -> Will produce
       ./dataset/AA1.output.html
       ./dataset/AA2.output.html
       These files are not reformatted versions of the original alignments, but colored HTML reports of the alignment files.
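
To preview which file names a given -out pattern would yield, the tag substitution shown in the examples above can be imitated in the shell. This is only a sketch: expand_pattern is a hypothetical helper, not part of readAl, and it assumes that [in] is the input file name with its directory and last extension removed, as the examples suggest. The format name and extension are passed in by hand rather than derived, since the mapping is internal to readAl.

```shell
#!/bin/sh
# Hypothetical helper (not part of readAl): imitates the documented
# replacement of the [in], [format] and [extension] tags in an -out pattern.
# Arguments: <pattern> <input file> <format name> <extension>
# Assumption: names do not contain '|', '&' or '\', which sed would treat specially.
expand_pattern() {
    pattern=$1
    base=$(basename "$2")   # drop the directory part of the input path
    stem=${base%.*}         # drop the last extension: AA1.fas becomes AA1
    printf '%s\n' "$pattern" | sed \
        -e "s|\[in\]|$stem|g" \
        -e "s|\[format\]|$3|g" \
        -e "s|\[extension\]|$4|g"
}

# Mirrors the first example above; prints ./dataset/AA1.output.clw
expand_pattern './dataset/[in].output.[extension]' ./dataset/AA1.fas CLUSTAL clw
```

One possible use is to create the target folder before calling readal with a pattern such as ./dataset/[in]/[format].[extension], because the output is only written if that folder already exists, for instance with mkdir -p "$(dirname "$(expand_pattern ...)")".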