What can I do with trimAl?

Trimming Methods

Meta Automated Methods

Meta Automated methods analyze your alignment to decide between automated statistical methods.
Currently, only Automated1 is available; it analyzes your multiple sequence alignment (MSA) and selects between Strict and Gappyout.

Method | Description | Example of use
Automated1 | Chooses between Strict and Gappyout. Optimized for Maximum Likelihood phylogenetic tree reconstruction. | bin/trimal -in dataset/example.007.AA.fasta -automated1

Automated Statistical Methods

Automated Statistical methods search for optimal statistic thresholds to clean your alignment, based on the distribution of each statistic.

Method | Description | Example of use
Strict | Uses Gaps and Similarity statistics to clean the alignment. | bin/trimal -in dataset/example.007.AA.fasta -strict
Strictplus | Uses Gaps and Similarity statistics to clean the alignment. Optimized for Neighbour Joining phylogenetic tree reconstruction. | bin/trimal -in dataset/example.007.AA.fasta -strictplus
Gappyout | Uses Gaps statistic to clean the alignment. | bin/trimal -in dataset/example.007.AA.fasta -gappyout

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 147 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 147 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 174 / 185. Track: Gaps Values.]

Manual Threshold Methods

Manual Threshold methods allow the user to provide thresholds and windows for each statistic available.
They also let the user obtain the raw values for each statistic, both per column and accumulated.

Statistic Name | Threshold Argument | Window Argument | Statistic Per Column Argument | Statistic Accumulative Argument | Example of use
Gaps | -gt <n> [0 - 1] | -gw <n> [0 - 1/4*N] | -sgc | -sgt | bin/trimal -in dataset/example.007.AA.fasta -gt 0.5 -gw 2
Similarity | -st <n> [0 - 1] | -sw <n> [0 - 1/4*N] | -ssc | -sst | bin/trimal -in dataset/example.007.AA.fasta -st 0.5 -sw 2
Consistency | -ct <n> [0 - 1] | -cw <n> [0 - 1/4*N] | -sfc | -sft | bin/trimal -compareset data -ct 0.5 -cw 2

N = Number of residues in the input alignment
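
To make the roles of a threshold and a window concrete, here is a minimal Python sketch of gap-based column filtering. It is not trimAl's implementation. It assumes that the gap score of a column is the fraction of sequences without a gap, that the window value is a half-window over which neighbouring scores are averaged, and that a column is kept when its (averaged) score reaches the threshold. The function names and the toy alignment are hypothetical.

```python
from typing import List


def gap_scores(alignment: List[str]) -> List[float]:
    """Fraction of sequences WITHOUT a gap in each column (1.0 = gap-free)."""
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    return [
        sum(seq[col] != "-" for seq in alignment) / n_seqs
        for col in range(n_cols)
    ]


def window_average(scores: List[float], half_window: int) -> List[float]:
    """Average each score with up to `half_window` neighbours on each side."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half_window)
        hi = min(len(scores), i + half_window + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def columns_to_keep(alignment: List[str], threshold: float,
                    half_window: int = 0) -> List[int]:
    """Indices of columns whose (windowed) gap score is >= threshold."""
    scores = window_average(gap_scores(alignment), half_window)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical toy alignment: 4 sequences, 5 columns.
toy = ["AC-GT", "A--GT", "ACTG-", "A-TGT"]
print(columns_to_keep(toy, threshold=0.6))                 # [0, 3, 4]
print(columns_to_keep(toy, threshold=0.6, half_window=1))  # [0, 1, 2, 3, 4]
```

In this toy case the window smooths the scores of the two gappy columns upwards, so they survive a threshold they would fail on their own.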

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 176 / 185. Track: Gaps Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 68 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 49 / 185. Tracks: Gaps Values, Sim Values.]

Manual Overlap Threshold Methods

Manual overlap lets the user remove sequences based on two values: Sequence Overlap and Residue Overlap.

  • Sequence overlap is the minimum percentage of residues in a sequence that must be kept for the sequence itself to be kept.
  • Residue overlap is the minimum overlap, as a fraction between 0 and 1, that a residue must have with the rest of the sequences at the same position (same residue in the same column) to be kept.
Overlap Level | Command line argument
Sequence | -seqoverlap <n> [0-100]
Residue | -resoverlap <n> [0-1]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 8 / 9, Selected Residues 183 / 185.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 6 / 9, Selected Residues 182 / 185.]

Manual Methods

Manual methods allow the user to provide a pattern to remove residues or sequences.

Method | Command line argument | Example of use
Sequences | -selectseqs { pattern } | bin/trimal -in dataset/example.007.AA.fasta -selectseqs { 0,4 }
Residues | -selectcols { pattern } | bin/trimal -in dataset/example.007.AA.fasta -selectcols { 1-4 }

The user must provide a pattern after the argument.
This pattern is a set of comma separated values:
 
 bin/trimal -in dataset/example.010.AA.fasta -selectseqs { 0,1,2 } 
 
This would delete sequences 0, 1, and 2
Instead of providing raw position IDs, the user can also provide ranges, by separating start and end with a hyphen:
 
 bin/trimal -in dataset/example.010.AA.fasta -selectseqs { 0-2 }
 
This would delete sequences 0, 1, and 2
[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 6 / 9, Selected Residues 183 / 185.]
Both separators can be combined in the same pattern:
 
 bin/trimal -in dataset/example.010.AA.fasta -selectcols { 0,4-6,12,16-18 }
 
This would delete columns 0, 4, 5, 6, 12, 16, 17 and 18.
The call above is therefore equivalent to the following:
 
 bin/trimal -in dataset/example.010.AA.fasta -selectcols { 0,4,5,6,12,16,17,18 }
 
This would delete columns 0, 4, 5, 6, 12, 16, 17 and 18
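
The equivalence between the two patterns can be checked with a short Python sketch that expands a pattern into the individual indices it denotes. This only illustrates the pattern syntax described above and is not trimAl's own parser; `expand_pattern` is a hypothetical helper name.

```python
from typing import List


def expand_pattern(pattern: str) -> List[int]:
    """Expand a pattern such as '0,4-6,12' into [0, 4, 5, 6, 12].

    Items are separated by commas; an item of the form 'start-end'
    denotes an inclusive range.
    """
    indices: List[int] = []
    for item in pattern.split(","):
        item = item.strip()
        if "-" in item:
            start, end = (int(part) for part in item.split("-"))
            indices.extend(range(start, end + 1))  # end is inclusive
        else:
            indices.append(int(item))
    return indices


print(expand_pattern("0,4-6,12,16-18"))
# [0, 4, 5, 6, 12, 16, 17, 18]
print(expand_pattern("0,4-6,12,16-18") == expand_pattern("0,4,5,6,12,16,17,18"))
# True
```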
[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 177 / 185.]
Sequence and column selections can also be combined in a single call:
[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 6 / 9, Selected Residues 175 / 185.]

Automated Methods

Automated methods remove columns based on raw gap values.

Method | Description | Example of use
NoGaps | Removes any column containing at least one gap | bin/trimal -in dataset/example.007.AA.fasta -nogaps
NoAllGaps | Removes columns containing only gaps | bin/trimal -in dataset/example.007.AA.fasta -noallgaps
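
The difference between the two rules is easy to see on a toy alignment. The Python sketch below applies each rule as described in the table; it is an illustration rather than trimAl's code, and the helper names and the toy alignment are hypothetical.

```python
from typing import Callable, List


def drop_columns(alignment: List[str],
                 should_drop: Callable[[List[str]], bool]) -> List[str]:
    """Return the alignment without the columns for which should_drop is True."""
    n_cols = len(alignment[0])
    keep = [
        col for col in range(n_cols)
        if not should_drop([seq[col] for seq in alignment])
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


def has_any_gap(column: List[str]) -> bool:
    """NoGaps rule: the column contains at least one gap."""
    return "-" in column


def is_all_gaps(column: List[str]) -> bool:
    """NoAllGaps rule: the column contains only gaps."""
    return all(ch == "-" for ch in column)


toy = ["A-C-T", "AGC-T", "A-C--"]
print(drop_columns(toy, has_any_gap))  # ['AC', 'AC', 'AC']
print(drop_columns(toy, is_all_gaps))  # ['A-CT', 'AGCT', 'A-C-']
```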

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected Sequences 9 / 9, Selected Residues 139 / 185. Track: Gaps Values.]

[Figure: alignment view of ../dataset/example.010b.AA.fasta. Selected Sequences 8 / 8, Selected Residues 183 / 185. Track: Gaps Values.]

Trimming Tweaks

Custom Matrix

Similarity values are obtained using a reference matrix.
While trimAl contains matrices for each sequence type, the user can provide their own matrices.

Examples of these matrices can be found in the dataset folder:

matrix.BLOSUM62  
A R N D C Q E G H I L K M F P S T W Y V
A +4 -1 -2 -2 +0 -1 -1 +0 -2 -1 -1 -1 -1 -2 -1 +1 +0 -3 -2 +0
R -1 +5 +0 -2 -3 +1 +0 -2 +0 -3 -2 +2 -1 -3 -2 -1 -1 -3 -2 -3
N -2 +0 +6 +1 -3 +0 +0 +0 +1 -3 -3 +0 -2 -3 -2 +1 +0 -4 -2 -3
D -2 -2 +1 +6 -3 +0 +2 -1 -1 -3 -4 -1 -3 -3 -1 +0 -1 -4 -3 -3
C +0 -3 -3 -3 +9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
Q -1 +1 +0 +0 -3 +5 +2 -2 +0 -3 -2 +1 +0 -3 -1 +0 -1 -2 -1 -2
E -1 +0 +0 +2 -4 +2 +5 -2 +0 -3 -3 +1 -2 -3 -1 +0 -1 -3 -2 -2
G +0 -2 +0 -1 -3 -2 -2 +6 -2 -4 -4 -2 -3 -3 -2 +0 -2 -2 -3 -3
H -2 +0 +1 -1 -3 +0 +0 -2 +8 -3 -3 -1 -2 -1 -2 -1 -2 -2 +2 -3
I -1 -3 -3 -3 -1 -3 -3 -4 -3 +4 +2 -3 +1 +0 -3 -2 -1 -3 -1 +3
L -1 -2 -3 -4 -1 -2 -3 -4 -3 +2 +4 -2 +2 +0 -3 -2 -1 -2 -1 +1
K -1 +2 +0 -1 -3 +1 +1 -2 -1 -3 -2 +5 -1 -3 -1 +0 -1 -3 -2 -2
M -1 -1 -2 -3 -1 +0 -2 -3 -2 +1 +2 -1 +5 +0 -2 -1 -1 -1 -1 +1
F -2 -3 -3 -3 -2 -3 -3 -3 -1 +0 +0 -3 +0 +6 -4 -2 -2 +1 +3 -1
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4 +7 -1 -1 -4 -3 -2
S +1 -1 +1 +0 -1 +0 +0 +0 -1 -2 -2 +0 -1 -2 -1 +4 +1 -3 -2 -2
T +0 -1 +0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1 +1 +5 -2 -2 +0
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1 +1 -4 -3 -2 +11 +2 -3
Y -2 -2 -2 -3 -2 -1 -2 -3 +2 -1 -1 -2 -1 +3 -3 -2 -2 +2 +7 -1
V +0 -3 -3 -3 -1 -2 -2 -3 -3 +3 +1 -2 +1 -1 -2 -2 +0 -3 -1 +4

matrix.Degenerated_DNA  
A C D G K M S R T W Y
A 1 0 0 0 0 0 0 0 0 0 0
C 0 1 0 0 0 0 0 0 0 0 0
D 0 0 1 0 0 0 0 0 0 0 0
G 0 0 0 1 0 0 0 0 0 0 0
K 0 0 0 0 1 0 0 0 0 0 0
M 0 0 0 0 0 1 0 0 0 0 0
S 0 0 0 0 0 0 1 0 0 0 0
R 0 0 0 0 0 0 0 1 0 0 0
T 0 0 0 0 0 0 0 0 1 0 0
W 0 0 0 0 0 0 0 0 0 1 0
Y 0 0 0 0 0 0 0 0 0 0 1
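
A custom matrix is essentially a table of pairwise scores. The Python sketch below parses text laid out like the listings above (a header row of symbols, then one labelled row per symbol) into a lookup dictionary. The function name `parse_matrix` is hypothetical, and the layout is assumed from the listings shown here; the actual files in the dataset folder may contain additional lines, so inspect one before relying on this.

```python
def parse_matrix(text):
    """Parse a score table: a header row of symbols, then one labelled row each."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    symbols = [s.upper() for s in rows[0]]
    scores = {}
    for row in rows[1:]:
        label, values = row[0].upper(), row[1:]
        if len(values) != len(symbols):
            raise ValueError(
                f"row {label!r} has {len(values)} values, expected {len(symbols)}"
            )
        for symbol, value in zip(symbols, values):
            scores[(label, symbol)] = int(value)  # int() accepts '+4' and '-1'
    return scores


# Small identity matrix in the same layout as matrix.Degenerated_DNA above.
example = """
A C G T
A 1 0 0 0
C 0 1 0 0
G 0 0 1 0
T 0 0 0 1
"""
matrix = parse_matrix(example)
print(matrix[("A", "A")], matrix[("A", "C")])
```

Keying the dictionary on symbol pairs keeps the lookup independent of the column order used in the file.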

Complementary

To extend the capabilities of trimAl, it is possible to obtain the complementary alignment.
This inverts the selection: residues that would have been kept are rejected, and residues that would have been rejected are kept.
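
The idea can be illustrated with a per-column keep mask. The following Python sketch is an illustration only, not trimAl's code; the helper names `complementary` and `apply_mask` are hypothetical, and the "keep gap-free columns" rule is just a stand-in for whichever trimming method produced the mask.

```python
def complementary(keep):
    """Invert a per-column keep mask: kept columns become rejected and vice versa."""
    return [not k for k in keep]


def apply_mask(sequences, keep):
    """Return each sequence restricted to the columns flagged True in keep."""
    return ["".join(c for c, k in zip(seq, keep) if k) for seq in sequences]


alignment = ["--MSA", "M-MSA", "MTMS-"]

# Stand-in trimming rule: keep only the columns without gaps.
keep = [all(seq[i] != "-" for seq in alignment) for i in range(len(alignment[0]))]

trimmed = apply_mask(alignment, keep)
flipped = apply_mask(alignment, complementary(keep))
print(trimmed)  # ['MS', 'MS', 'MS']
print(flipped)  # ['--A', 'M-A', 'MT-']
```

Every column ends up in exactly one of the two outputs, so the kept-column counts of an alignment and its complement add up to the original length.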

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 176 / 185. Tracks: Gaps Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 9 / 185. Tracks: Gaps Values.]

Terminal Only

The Terminal Only option restricts trimming to the regions outside the central block.
The central block is defined as the block that contains
the center of the alignment and has no gaps inside it.
Because it was built without gaps, the central block is considered
the best-aligned section of the alignment,
so the alignment can be trimmed while leaving this block untouched.
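
As a rough illustration of the definition above, the Python sketch below finds the gap-free run of columns that contains the middle column and then protects it from trimming. It follows the wording of this section only; the function names are hypothetical, and trimAl itself may determine the block boundaries differently.

```python
def central_block(sequences):
    """Return (start, end), inclusive, of the gap-free run holding the middle column.

    Returns None when the middle column itself contains a gap.
    """
    length = len(sequences[0])
    gap_free = [all(seq[i] != "-" for seq in sequences) for i in range(length)]
    center = length // 2
    if not gap_free[center]:
        return None
    start = end = center
    while start > 0 and gap_free[start - 1]:
        start -= 1
    while end < length - 1 and gap_free[end + 1]:
        end += 1
    return start, end


def terminal_only(keep, block):
    """Force every column inside block to be kept; leave the other flags as they are."""
    if block is None:
        return list(keep)
    start, end = block
    return [True if start <= i <= end else k for i, k in enumerate(keep)]


alignment = ["-MKVLA-", "AMKVLAG", "AMRVL-G"]
block = central_block(alignment)

# Mask produced by some trimming method (made up for this example).
method_keep = [False, True, False, True, True, False, False]
print(block)                              # (1, 4)
print(terminal_only(method_keep, block))
```

Column 2 was rejected by the made-up mask but lies inside the central block, so it is restored; columns outside the block keep whatever the trimming method decided.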

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 49 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 146 / 185. Tracks: Gaps Values, Sim Values.]

Block

The Block argument first cleans the alignment with the selected method,
and then keeps only the residues that fall inside a run of kept residues
at least as long as the specified block size.
This removes isolated columns that obtained good scores by chance.
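
A minimal Python sketch of this post-filter, assuming the trimming method has already produced a per-column keep mask. The function name `filter_blocks` is hypothetical and this is not trimAl's implementation.

```python
def filter_blocks(keep, block_size):
    """Keep only columns that sit in a run of at least block_size kept columns."""
    result = [False] * len(keep)
    run_start = None
    # The trailing False is a sentinel that closes a run ending at the last column.
    for i, flag in enumerate(list(keep) + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= block_size:
                result[run_start:i] = [True] * (i - run_start)
            run_start = None
    return result


keep = [True, False, True, True, True, False, True, True]
print(filter_blocks(keep, 3))
# [False, False, True, True, True, False, False, False]
```

With a block size of 3, the isolated column 0 and the two-column run at the end are dropped, while the three-column run in the middle survives.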

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 150 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 147 / 185. Tracks: Gaps Values, Sim Values.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 9 / 9. Selected residues: 147 / 185. Tracks: Gaps Values, Sim Values.]

Clusters

It is possible to cluster the sequences
and keep only those that best represent the variability
of the whole alignment.

The output is an alignment containing the most representative sequence
of each cluster.

The user must provide the number of clusters to create, which is also the
number of sequences present in the final alignment.

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 2 / 9. Selected residues: 185 / 185.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 4 / 9. Selected residues: 185 / 185.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 6 / 9. Selected residues: 185 / 185.]

Max Identity

Max Identity removes sequences whose identity to another sequence
is higher than a user-provided threshold.

This removes the sequences that are too similar to one another.
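
The Python sketch below shows one way such a filter could work: a greedy pass that keeps a sequence only if it is not too similar to any sequence already kept. Both the identity definition (matches over columns where neither sequence has a gap) and the greedy order are assumptions made for illustration; trimAl may compute identity and choose the surviving sequences differently.

```python
def identity(a, b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def filter_by_max_identity(sequences, threshold):
    """Greedily keep sequences whose identity to every kept sequence is <= threshold."""
    kept = {}
    for name, seq in sequences.items():
        if all(identity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return kept


# Toy aligned sequences (hypothetical data).
sequences = {
    "s1": "MKVLA",
    "s2": "MKVLG",  # identity 0.8 with s1
    "s3": "MRTIA",  # identity 0.4 with s1
    "s4": "-KVLA",  # identity 1.0 with s1 over the shared columns
}
print(list(filter_by_max_identity(sequences, 0.75)))  # ['s1', 's3']
print(list(filter_by_max_identity(sequences, 0.90)))  # ['s1', 's2', 's3']
```

Lowering the threshold removes more sequences, which matches the figures in this section where a stricter setting leaves fewer selected sequences.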

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 1 / 9. Selected residues: 183 / 185.]

[Figure: alignment view of ../dataset/example.010.AA.fasta. Selected sequences: 6 / 9. Selected residues: 185 / 185.]

Compareset

Compareset allows the user to provide multiple MSAs and obtain consistency scores for them.
These scores can be used to select the most consistent alignment in the set provided,
or as a trimming statistic to remove columns that are not consistent enough
across the set.

By default, Compareset selects the most consistent alignment, but it can be forced to
select a specific alignment with the -forceselect argument.

When an output file is provided, trimAl also reports a summary of the compareset results.

> cat alignments_comparison  
dataset/example.010.a.AA.fasta
dataset/example.010.b.AA.fasta
dataset/example.010.c.AA.fasta

Summary chart  
File:           dataset/example.010.a.AA.fasta
Values:         Sequences: 8    Residues: 187   Pond. Hits:  133.881    %Consistency: 0.715941

File:           dataset/example.010.b.AA.fasta
Values:         Sequences: 8    Residues: 188   Pond. Hits:   120.27    %Consistency: 0.639735

File:           dataset/example.010.c.AA.fasta
Values:         Sequences: 8    Residues: 188   Pond. Hits:  122.744    %Consistency: 0.652893
                                        --------------

File Selected:  dataset/example.010.a.AA.fasta
Value:          0.715941
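
For scripted pipelines it can be handy to read this summary programmatically. The Python sketch below parses text in the layout shown above and picks the file with the highest %Consistency. The function name `parse_summary` is hypothetical, and the layout is assumed from this one example, so it may need adjusting for other trimAl versions.

```python
import re


def parse_summary(text):
    """Map each file in a compareset summary to its %Consistency value."""
    scores = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("File:"):  # does not match the "File Selected:" line
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Values:") and current is not None:
            match = re.search(r"%Consistency:\s*([0-9.]+)", line)
            if match:
                scores[current] = float(match.group(1))
            current = None
    return scores


summary = """
File:           dataset/example.010.a.AA.fasta
Values:         Sequences: 8    Residues: 187   Pond. Hits:  133.881    %Consistency: 0.715941

File:           dataset/example.010.b.AA.fasta
Values:         Sequences: 8    Residues: 188   Pond. Hits:   120.27    %Consistency: 0.639735

File:           dataset/example.010.c.AA.fasta
Values:         Sequences: 8    Residues: 188   Pond. Hits:  122.744    %Consistency: 0.652893

File Selected:  dataset/example.010.a.AA.fasta
Value:          0.715941
"""
scores = parse_summary(summary)
best = max(scores, key=scores.get)
print(best, scores[best])
```

The file chosen this way agrees with the "File Selected" line of the summary, since no alignment was forced in this run.
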
[Figure: alignment view of ../dataset/example.010.c.AA.fasta. Selected sequences: 8 / 8. Selected residues: 81 / 188. Tracks: Cons Values.]

Minimum Alignment Conservation Threshold

Depending on the MSA provided and the trimming methods selected,
trimming can be excessively aggressive.

This may leave an MSA with few or no residues.

One proposed solution is to conserve
a minimum percentage of the alignment.

To do so, recovery starts at the center of the alignment,
which is assumed to be better aligned than the terminal ends, and
restores residues in both directions until the desired
Minimum Alignment Conservation Threshold has been reached.
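
The recovery step described above can be illustrated with a minimal Python sketch. It is a simplification of the idea only, not trimAl's actual implementation: the function name recover_from_center, the boolean list kept (one flag per column), and the min_percent parameter are all hypothetical, and the sketch ignores column scores entirely.

```python
import math


def recover_from_center(kept, min_percent):
    """Restore removed columns outward from the center of the alignment.

    kept        -- list of booleans, True where a column survived trimming
                   (hypothetical representation, for illustration only)
    min_percent -- minimum percentage of columns to conserve, 0 to 100

    Returns a new list in which removed columns have been restored,
    nearest to the center first, until the minimum is met.
    """
    n = len(kept)
    target = math.ceil(n * min_percent / 100)
    result = list(kept)
    count = sum(result)
    center = n // 2

    # Visit columns by increasing distance from the center:
    # center, center - 1, center + 1, center - 2, center + 2, ...
    for dist in range(n):
        for col in (center - dist, center + dist):
            if count >= target:
                return result
            if 0 <= col < n and not result[col]:
                result[col] = True
                count += 1
    return result


if __name__ == "__main__":
    # Ten columns, all removed by trimming; ask to conserve at least 40%.
    trimmed = [False] * 10
    print(recover_from_center(trimmed, 40))
```

In this toy case the four columns closest to the center are restored and the terminal columns stay removed, which mirrors the assumption that the ends are the least reliably aligned.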

Figure: ../dataset/example.010.AA.fasta; Selected Sequences 9 / 9; Selected Residues 68 / 185. Alignment view colored by residue class, with Gaps Values and Sim Values tracks.

Figure: ../dataset/example.010.AA.fasta; Selected Sequences 9 / 9; Selected Residues 74 / 185. Alignment view colored by residue class, with Gaps Values and Sim Values tracks.

Figure: ../dataset/example.010.AA.fasta; Selected Sequences 9 / 9; Selected Residues 111 / 185. Alignment view colored by residue class, with Gaps Values and Sim Values tracks.

Figure: ../dataset/example.010.AA.fasta; Selected Sequences 9 / 9; Selected Residues 148 / 185. Alignment view colored by residue class, with Gaps Values and Sim Values tracks.

Other Tweaks

Column Numbering

Outputs the mapping between positions in the input alignment and the output alignment.
For each input position, it prints -1 if that position was removed, and the original position otherwise.

Figure: ../dataset/example.010.AA.fasta; Selected Sequences 9 / 9; Selected Residues 147 / 185. Alignment view colored by residue class, with Gaps Values and Sim Values tracks.
  #ColumnsMap	-1, -1, -1, -1, -1, -1, 6, 7, 8, 9, 10, 11, 12, 13, -1, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, -1, -1, -1, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, -1, -1, -1, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, -1, -1, -1, -1, -1, -1, -1, -1, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, -1, -1, -1, -1, -1, -1, -1
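
A map like the one above can be post-processed with a few lines of Python. This is a minimal sketch that assumes the output is a single line beginning with #ColumnsMap followed by comma-separated integers, as in the example shown here; the helper names parse_columns_map and kept_columns are hypothetical and not part of trimAl.

```python
def parse_columns_map(line):
    """Parse a '#ColumnsMap' line into a list of integers.

    Assumes the format shown in this document: the '#ColumnsMap' tag,
    then comma-separated values, with -1 marking a removed position.
    Returns an empty list if the tag is not present.
    """
    _, _, values = line.strip().partition("#ColumnsMap")
    return [int(v) for v in values.split(",") if v.strip()]


def kept_columns(column_map):
    """Return the original positions that survived trimming."""
    return [c for c in column_map if c != -1]


if __name__ == "__main__":
    # Shortened, made-up line in the same format as the real output.
    example = "  #ColumnsMap\t-1, -1, 6, 7, -1"
    column_map = parse_columns_map(example)
    print(len(column_map), "input positions")
    print(kept_columns(column_map), "kept")
```

Applied to the full map above, the number of entries other than -1 should equal the Selected Residues count reported for the alignment (147 of 185).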