Class to calculate the consistency between several MSAs that contain the same sequences, aligned differently.
Using these statistics, the class can select the most consistent alignment among all the alignments provided.
It is also possible to force the selection of a specific alignment while still calculating the statistics for later use.
After an alignment has been selected (either the most consistent one or a manually chosen one), these statistics can be used to trim it, removing columns that are not consistent enough with the other alignments.
#include <Consistency.h>
Public Member Functions | |
bool | perform (char *comparesetFilePath, FormatHandling::FormatManager &formatManager, trimAlManager &manager, char *forceFile) |
Method to compare a set of MSAs, all containing the same sequences and residues. The number of residues must be the same across alignments, but gaps are not taken into account, because aligning the same sequence in different ways changes its gap pattern. | |
bool | applyWindow (int halfW) |
Applies a new window to the alignment. | |
Consistency (Alignment *pAlignment, Consistency *pConsistency) | |
Copy constructor. | |
Consistency () | |
Default constructor. | |
~Consistency () | |
Default destructor. | |
float * | getValues () |
Getter for the consistency values. | |
Static Public Member Functions | |
static void | printStatisticsFileColumns (Alignment &alig, float *compareVect) |
Print the consistency value for each column from the selected alignment. | |
static void | printStatisticsFileAcl (Alignment &alig, float *compareVect) |
Print the accumulated consistency value from the selected alignment. | |
Private Member Functions | |
bool | isWindowDefined () |
Method to check whether or not a window has been applied. | |
Static Private Member Functions | |
static int | compareAndChoose (Alignment **vectAlignments, char **fileNames, float *columnsValue, int numAlignments, bool verbosity) |
Method to compare a set of alignments and select the one most consistent with respect to the others. The consistency values are computed from the proportion of residue pairs per column that is shared among the alignments being compared. | |
static bool | forceComparison (Alignment **vectAlignments, int numAlignments, Alignment *selected, float *columnsValue) |
Method to obtain the vector of consistency values for a given alignment against a set of alignments with the same sequences. | |
Private Attributes | |
Alignment * | alig = nullptr |
Original alignment for which the stat was calculated. | |
Alignment ** | compareAlignmentsArray = nullptr |
Array of alignments to compare. | |
float * | values = nullptr |
Raw consistency values. | |
float * | values_windowed = nullptr |
Windowed consistency values. | |
int | numFiles = 0 |
Number of files to compare. | |
int | i = 0 |
Temporary variable used in loops. | |
int | maxResidues = 0 |
Maximum number of residues in the whole dataset. | |
int | halfWindow = -1 |
Half size of the last window applied. Initialized to -1 (no window applied yet). | |
int | residues = -1 |
Number of residues of the selected alignment. | |
int * | refCounter |
Counter of how many statisticsConsistency share the same MDK values. | |
bool | appearErrors = false |
Intermediate variable to keep track of the progress status. | |
Definition at line 58 of file Consistency.h.
statistics::Consistency::Consistency | ( | Alignment * | pAlignment, |
Consistency * | pConsistency | ||
) |
Copy constructor.
Definition at line 829 of file Consistency.cpp.
References alig, refCounter, values, and values_windowed.
Referenced by statistics::Manager::Manager().
statistics::Consistency::Consistency | ( | ) |
Default constructor.
Definition at line 839 of file Consistency.cpp.
References refCounter.
Referenced by trimAlManager::performCompareset().
statistics::Consistency::~Consistency | ( | ) |
Default destructor.
Definition at line 820 of file Consistency.cpp.
References alig, refCounter, values, and values_windowed.
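The copy constructor and the destructor both reference refCounter, values, and values_windowed, which suggests that copies share the value arrays and that the last surviving owner releases them. A minimal sketch of that ownership pattern follows. The class name `SharedValues` and its members `data()` and `owners()` are hypothetical, and the real class may manage its buffers differently.

```cpp
#include <cstddef>

// Hypothetical illustration of a reference-counted buffer; not trimAl code.
class SharedValues {
public:
    explicit SharedValues(std::size_t n)
        : values(new float[n]()), refCounter(new int(1)) {}

    // A copy shares the other object's buffer instead of duplicating it.
    SharedValues(const SharedValues& other)
        : values(other.values), refCounter(other.refCounter) {
        ++*refCounter;
    }

    // Assignment is left out to keep the sketch short.
    SharedValues& operator=(const SharedValues&) = delete;

    ~SharedValues() {
        if (--*refCounter == 0) {  // the last owner frees the shared storage
            delete[] values;
            delete refCounter;
        }
    }

    float* data() const { return values; }
    int owners() const { return *refCounter; }

private:
    float* values;    // shared array of per-column values
    int* refCounter;  // number of objects currently sharing `values`
};
```

With this layout a copy costs only a pointer copy and a counter increment, at the price of every owner seeing writes made through any other owner.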
bool statistics::Consistency::applyWindow | ( | int | halfW | ) |
Applies a new window to the alignment.
halfW | Half size of window to apply. |
Definition at line 525 of file Consistency.cpp.
References ConsistencyWindowTooBig, debug, halfWindow, reporting::reportManager::report(), residues, values, and values_windowed.
Referenced by getValues(), and perform().
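As an illustration of what applying a window of half size halfW can mean, the sketch below smooths a vector of per-column values with a symmetric moving average over 2 * halfW + 1 columns, truncated at the alignment edges. The helper `windowedAverage` is hypothetical, and both the averaging rule and the edge handling are assumptions; the page does not state how applyWindow() combines the values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not part of trimAl: averages each value with its
// neighbours up to halfW columns away on each side.
std::vector<float> windowedAverage(const std::vector<float>& values, int halfW) {
    assert(halfW >= 0);
    const int n = static_cast<int>(values.size());
    std::vector<float> out(values.size(), 0.0f);
    for (int col = 0; col < n; ++col) {
        const int lo = std::max(0, col - halfW);      // window clipped at the left edge
        const int hi = std::min(n - 1, col + halfW);  // window clipped at the right edge
        float sum = 0.0f;
        for (int j = lo; j <= hi; ++j) sum += values[j];
        out[col] = sum / static_cast<float>(hi - lo + 1);
    }
    return out;
}
```

With halfW equal to 0 the output equals the input, which matches the idea of keeping separate raw and windowed value arrays.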
static int statistics::Consistency::compareAndChoose | ( | Alignment ** | vectAlignments, |
char ** | fileNames, | ||
float * | columnsValue, | ||
int | numAlignments, | ||
bool | verbosity | ||
) |
static private |
Method to compare a set of alignments and select the one most consistent with respect to the others.
The consistency values are computed from the proportion of residue pairs per column that is shared among the alignments being compared.
vectAlignments | Vector of alignments to compare, from which the most consistent one is selected. | |
fileNames | Vector containing all the filenames. Only used if verbosity is true. | |
[out] | columnsValue | Consistency values of the selected alignment. |
numAlignments | Number of alignments to compare. | |
verbosity | Whether or not to report by printing some results. |
Definition at line 241 of file Consistency.cpp.
References debug, DifferentNumberOfSequencesInCompareset, DifferentSeqsNamesInCompareset, Alignment::sequencesMatrix::getColumn(), Alignment::getNumAminos(), Alignment::getNumSpecies(), Alignment::getSequenceNameOrder(), Alignment::getSequences(), utils::initlVect(), reporting::reportManager::report(), Alignment::SequencesMatrix, and Alignment::sequencesMatrix::setOrder().
Referenced by perform().
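The residue-pair idea can be sketched as follows: for each column of a reference alignment, take every pair of residues in that column and count in how many of the other alignments the same two residues also share a column. Everything here is illustrative. The `Msa` alias and the functions `residueColumns`, `columnConsistency`, and `mostConsistent` are hypothetical names, and ranking alignments by their mean column score is an assumption, since this page does not say how compareAndChoose() aggregates the values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Msa = std::vector<std::string>;  // one gapped row per sequence, '-' is a gap

// For each sequence, the column that holds its k-th residue (gaps skipped).
static std::vector<std::vector<int>> residueColumns(const Msa& msa) {
    std::vector<std::vector<int>> cols(msa.size());
    for (std::size_t s = 0; s < msa.size(); ++s)
        for (std::size_t c = 0; c < msa[s].size(); ++c)
            if (msa[s][c] != '-') cols[s].push_back(static_cast<int>(c));
    return cols;
}

// Fraction of residue pairs in each column of `ref` that the alignments in
// `others` also place in a shared column. Assumes every alignment holds the
// same ungapped sequences in the same order.
std::vector<float> columnConsistency(const Msa& ref, const std::vector<Msa>& others) {
    assert(!ref.empty());
    std::vector<std::vector<std::vector<int>>> otherCols;
    for (const Msa& o : others) {
        assert(o.size() == ref.size());
        otherCols.push_back(residueColumns(o));
    }
    const std::size_t numCols = ref[0].size();
    std::vector<std::size_t> seen(ref.size(), 0);  // residues already passed, per sequence
    std::vector<float> scores(numCols, 0.0f);
    for (std::size_t c = 0; c < numCols; ++c) {
        long pairs = 0, hits = 0;
        for (std::size_t s = 0; s < ref.size(); ++s) {
            if (ref[s][c] == '-') continue;
            for (std::size_t t = s + 1; t < ref.size(); ++t) {
                if (ref[t][c] == '-') continue;
                for (const auto& cols : otherCols) {
                    ++pairs;
                    if (cols[s][seen[s]] == cols[t][seen[t]]) ++hits;
                }
            }
        }
        if (pairs > 0) scores[c] = static_cast<float>(hits) / static_cast<float>(pairs);
        for (std::size_t s = 0; s < ref.size(); ++s)
            if (ref[s][c] != '-') ++seen[s];
    }
    return scores;
}

// Index of the alignment with the highest mean column score against the rest.
std::size_t mostConsistent(const std::vector<Msa>& all) {
    assert(!all.empty());
    std::size_t best = 0;
    float bestMean = -1.0f;
    for (std::size_t i = 0; i < all.size(); ++i) {
        std::vector<Msa> others;
        for (std::size_t j = 0; j < all.size(); ++j)
            if (j != i) others.push_back(all[j]);
        float sum = 0.0f;
        for (float v : columnConsistency(all[i], others)) sum += v;
        const float mean = sum / static_cast<float>(all[i][0].size());
        if (mean > bestMean) { bestMean = mean; best = i; }
    }
    return best;
}
```

Columns with fewer than two residues have no pairs to check and score 0 in this sketch; a real implementation may treat them differently.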
static bool statistics::Consistency::forceComparison | ( | Alignment ** | vectAlignments, |
int | numAlignments, | ||
Alignment * | selected, | ||
float * | columnsValue | ||
) |
static private |
Method to obtain the vector of consistency values for a given alignment against a set of alignments with the same sequences.
vectAlignments | Vector of alignments to compare against the selected alignment. | |
numAlignments | Number of alignments to compare. | |
selected | Alignment to compare against the set of alignments. | |
[out] | columnsValue | Vector to fill with the consistency values. |
Definition at line 416 of file Consistency.cpp.
References debug, DifferentNumberOfSequencesInCompareset, DifferentSeqsNamesInCompareset, Alignment::sequencesMatrix::getColumn(), Alignment::getNumAminos(), Alignment::getNumSpecies(), Alignment::getSequenceNameOrder(), Alignment::getSequences(), utils::initlVect(), reporting::reportManager::report(), Alignment::SequencesMatrix, Alignment::sequencesMatrix::sequencesMatrix(), and Alignment::sequencesMatrix::setOrder().
Referenced by perform().
float * statistics::Consistency::getValues | ( | ) |
Getter for the consistency values.
Definition at line 588 of file Consistency.cpp.
References applyWindow(), halfWindow, isWindowDefined(), values, and values_windowed.
Referenced by Alignment::alignmentSummaryHTML(), Alignment::alignmentSummarySVG(), trimAlManager::CleanResiduesNonAuto(), trimAlManager::print_statistics(), and Alignment::statSVG().
bool statistics::Consistency::isWindowDefined | ( | ) |
private |
Method to check whether or not a window has been applied.
Definition at line 580 of file Consistency.cpp.
References halfWindow.
Referenced by getValues().
bool statistics::Consistency::perform | ( | char * | comparesetFilePath, |
FormatHandling::FormatManager & | formatManager, | ||
trimAlManager & | manager, | ||
char * | forceFile | ||
) |
Method to compare a set of MSAs, all containing the same sequences and residues.
The number of residues must be the same across alignments, but gaps are not taken into account.
This is because aligning the same sequence in different ways changes its gap pattern.
comparesetFilePath | Path to the file containing the paths of the alignments to compare, one per line. |
formatManager | Format manager, used to load and save the alignments. |
manager | trimAl manager, used to store the chosen alignment in trimAlManager::origAlig. |
forceFile | Path to the file to forcefully select. If nullptr, the most consistent alignment will be selected. |
Definition at line 45 of file Consistency.cpp.
References alig, Alignment::Alignment(), AlignmentTypesNotMatching, trimAlManager::appearErrors, appearErrors, applyWindow(), compareAlignmentsArray, compareAndChoose(), ComparesetFailedAlignmentMissing, statistics::Manager::consistency, trimAlManager::consistencyWindow, trimAlManager::CS, debug, forceComparison(), Alignment::getAlignmentType(), FormatHandling::FormatManager::getFileFormatName(), Alignment::getNumAminos(), i, Alignment::isFileAligned(), FormatHandling::FormatManager::loadAlignment(), maxResidues, NotAligned, NotDefined, numFiles, trimAlManager::oformats, trimAlManager::origAlig, Alignment::originalNumberOfResidues, trimAlManager::outfile, reporting::reportManager::report(), residues, Alignment::SequencesMatrix, Alignment::sequencesMatrix::sequencesMatrix(), Alignment::Statistics, trimAlManager::stats, values, and trimAlManager::windowSize.
Referenced by trimAlManager::performCompareset().
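Two of the inputs described above lend themselves to a short sketch: the compareset file, which lists one alignment path per line, and the rule that sequences are compared without their gaps. The helpers `readComparesetPaths`, `ungapped`, and `sameSequence` are hypothetical names, not trimAl functions, and skipping empty lines and tolerating CRLF line endings are assumptions.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects one alignment path per non-empty line.
std::vector<std::string> readComparesetPaths(std::istream& in) {
    std::vector<std::string> paths;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (!line.empty()) paths.push_back(line);
    }
    return paths;
}

// Hypothetical helper: a gapped row with every '-' removed.
std::string ungapped(std::string row) {
    row.erase(std::remove(row.begin(), row.end(), '-'), row.end());
    return row;
}

// Two rows hold the same sequence if they match once gaps are ignored.
bool sameSequence(const std::string& a, const std::string& b) {
    return ungapped(a) == ungapped(b);
}
```

For example, the rows "AC-G" and "A-CG" pass `sameSequence` because both reduce to "ACG", even though their gap patterns differ.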
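static void statistics::Consistency::printStatisticsFileAcl | ( | Alignment & | alig, |
float * | compareVect | ||
) |
static |
Print the accumulated consistency value from the selected alignment.
alig | Alignment used to obtain the accumulated consistency value. |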
compareVect | Vector containing the consistency value for each column. |
Definition at line 666 of file Consistency.cpp.
References utils::copyVect(), Alignment::filename, Alignment::numberOfResidues, and utils::quicksort().
Referenced by trimAlManager::print_statistics().
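Since this function references utils::copyVect() and utils::quicksort(), it presumably works on a sorted copy of the per-column values. The sketch below shows one way to build an accumulated table from such a copy: one row per distinct value, with the number of columns that have that value and the running total. The struct `AccumulatedRow` and the function `accumulatedTable` are hypothetical, and the actual output columns of printStatisticsFileAcl() are not described on this page.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical row of an accumulated-consistency table.
struct AccumulatedRow {
    float value;             // consistency value shared by a group of columns
    std::size_t columns;     // number of columns with exactly this value
    std::size_t cumulative;  // number of columns with this value or a lower one
};

// Takes the values by copy so the caller's per-column order is preserved.
std::vector<AccumulatedRow> accumulatedTable(std::vector<float> values) {
    std::sort(values.begin(), values.end());
    std::vector<AccumulatedRow> rows;
    for (std::size_t i = 0; i < values.size(); ++i) {
        // Open a new row whenever the sorted value changes; `i` columns
        // with lower values have already been counted at that point.
        if (rows.empty() || rows.back().value != values[i])
            rows.push_back({values[i], 0, i});
        ++rows.back().columns;
        ++rows.back().cumulative;
    }
    return rows;
}
```

Dividing `cumulative` by the total number of columns would give the accumulated percentage, which is a common way to pick a trimming threshold.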
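static void statistics::Consistency::printStatisticsFileColumns | ( | Alignment & | alig, |
float * | compareVect | ||
) |
static |
Print the consistency value for each column from the selected alignment.
alig | Alignment used to obtain the per-column consistency values. |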
compareVect | Vector containing the consistency value for each column. |
Definition at line 606 of file Consistency.cpp.
References Alignment::filename, and Alignment::numberOfResidues.
Referenced by trimAlManager::print_statistics().
Alignment * statistics::Consistency::alig = nullptr |
private |
Original alignment for which the stat was calculated.
Definition at line 127 of file Consistency.h.
Referenced by Consistency(), perform(), and ~Consistency().
bool statistics::Consistency::appearErrors = false |
private |
Intermediate variable to keep track of the progress status.
Definition at line 154 of file Consistency.h.
Referenced by perform().
Alignment ** statistics::Consistency::compareAlignmentsArray = nullptr |
private |
Array of alignments to compare.
Definition at line 130 of file Consistency.h.
Referenced by perform().
int statistics::Consistency::halfWindow = -1 |
private |
Half size of the last window applied. Initialized to -1 (no window applied yet).
Definition at line 146 of file Consistency.h.
Referenced by applyWindow(), getValues(), and isWindowDefined().
int statistics::Consistency::i = 0 |
private |
Temporary variable used in loops.
Definition at line 142 of file Consistency.h.
Referenced by perform().
int statistics::Consistency::maxResidues = 0 |
private |
Maximum number of residues in the whole dataset.
Definition at line 144 of file Consistency.h.
Referenced by perform().
int statistics::Consistency::numFiles = 0 |
private |
Number of files to compare.
int * statistics::Consistency::refCounter |
private |
Counter of how many statisticsConsistency share the same MDK values.
Definition at line 151 of file Consistency.h.
Referenced by Consistency(), and ~Consistency().
int statistics::Consistency::residues = -1 |
private |
Number of residues of the selected alignment.
Definition at line 148 of file Consistency.h.
Referenced by applyWindow(), and perform().
float * statistics::Consistency::values = nullptr |
private |
Raw consistency values.
Definition at line 133 of file Consistency.h.
Referenced by applyWindow(), Consistency(), getValues(), perform(), and ~Consistency().
float * statistics::Consistency::values_windowed = nullptr |
private |
Windowed consistency values.
Definition at line 136 of file Consistency.h.
Referenced by applyWindow(), Consistency(), getValues(), and ~Consistency().