statistics::Consistency Class Reference

Class to calculate the consistency between several MSAs containing the same sequences, aligned differently.
Using this statistic, the class can select the most consistent alignment among all the alignments provided.
It is also possible to force the selection of a specific alignment while still calculating the statistic for later use.
After selecting an alignment (most consistent or manually selected), the statistic can be used to trim it, removing columns that are not consistent enough with the other alignments.

#include <Consistency.h>

Public Member Functions

bool perform (char *comparesetFilePath, FormatHandling::FormatManager &formatManager, trimAlManager &manager, char *forceFile)
 Method to compare a set of MSAs, all containing the same sequences and residues.
The number of residues must be the same, but gaps are not taken into account.
This is due to the same sequence being aligned in different ways, which changes the gap patterns.
 
bool applyWindow (int halfW)
 Applies a new window to the alignment.
 
 Consistency (Alignment *pAlignment, Consistency *pConsistency)
 Copy constructor.
 
 Consistency ()
 Default constructor.
 
 ~Consistency ()
 Default destructor.
 
float * getValues ()
 Stat getter.
 

Static Public Member Functions

static void printStatisticsFileColumns (Alignment &alig, float *compareVect)
 Print the consistency value for each column from the selected alignment.
 
static void printStatisticsFileAcl (Alignment &alig, float *compareVect)
 Print the accumulated consistency value from the selected alignment.
 

Private Member Functions

bool isWindowDefined ()
 Method to check whether or not a window has been applied.
 

Static Private Member Functions

static int compareAndChoose (Alignment **vectAlignments, char **fileNames, float *columnsValue, int numAlignments, bool verbosity)
 Method to compare a set of alignments and select the one most consistent with respect to the others.
To compute the consistency values, we use the proportion of residue pairs per column in the alignments being compared.
 
static bool forceComparison (Alignment **vectAlignments, int numAlignments, Alignment *selected, float *columnsValue)
 Method to obtain the consistency values vector for a given alignment against a set of alignments with the same sequences.
 

Private Attributes

Alignment * alig = nullptr
 Original alignment for which the stat was calculated.
 
Alignment ** compareAlignmentsArray = nullptr
 Array of alignments to compare.
 
float * values = nullptr
 Raw consistency values.
 
float * values_windowed = nullptr
 Windowed consistency values.
 
int numFiles = 0
 Number of files to compare.
 
int i = 0
 Temporary variable used in loops.
 
int maxResidues = 0
 Maximum number of residues in the whole dataset.
 
int halfWindow = -1
 Half size of the window applied to the consistency values; -1 when no window has been set.
 
int residues = -1
 Number of residues of the selected alignment.
 
int * refCounter
 Counter of how many statisticsConsistency share the same MDK values.
 
bool appearErrors = false
 Intermediate variable to keep track of the progress status.
 

Detailed Description

Class to calculate the consistency between several MSAs containing the same sequences, aligned differently.
Using this statistic, the class can select the most consistent alignment among all the alignments provided.
It is also possible to force the selection of a specific alignment while still calculating the statistic for later use.
After selecting an alignment (most consistent or manually selected), the statistic can be used to trim it, removing columns that are not consistent enough with the other alignments.

Definition at line 58 of file Consistency.h.

Constructor & Destructor Documentation

◆ Consistency() [1/2]

statistics::Consistency::Consistency ( Alignment *  pAlignment,
Consistency *  pConsistency 
)

Copy constructor.

Definition at line 829 of file Consistency.cpp.

References alig, refCounter, values, and values_windowed.

Referenced by statistics::Manager::Manager().

◆ Consistency() [2/2]

statistics::Consistency::Consistency ( )

Default constructor.

Definition at line 839 of file Consistency.cpp.

References refCounter.

Referenced by trimAlManager::performCompareset().

◆ ~Consistency()

statistics::Consistency::~Consistency ( )

Default Destructor.

Definition at line 820 of file Consistency.cpp.

References alig, refCounter, values, and values_windowed.

Member Function Documentation

◆ applyWindow()

bool statistics::Consistency::applyWindow ( int  halfW)

Applies a new window to the alignment.

Parameters
halfW  Half size of the window to apply.
Returns
True if the window was applied correctly; the program exits otherwise.

Definition at line 525 of file Consistency.cpp.

References ConsistencyWindowTooBig, debug, halfWindow, reporting::reportManager::report(), residues, values, and values_windowed.

Referenced by getValues(), and perform().

◆ compareAndChoose()

int statistics::Consistency::compareAndChoose ( Alignment **  vectAlignments,
char **  fileNames,
float *  columnsValue,
int  numAlignments,
bool  verbosity 
)
static private

Method to compare a set of alignments and select the one most consistent with respect to the others.
To compute the consistency values, we use the proportion of residue pairs per column in the alignments being compared.

Parameters
vectAlignments  Alignment vector to compare and select the most consistent from.
fileNames  Vector containing all the filenames. Useful only if verbosity==True.
[out] columnsValue  Consistency values of the selected alignment.
numAlignments  Number of alignments to compare.
verbosity  Whether or not to report by printing some results.
Returns
-1 if there was any error.
Index of the selected alignment otherwise.
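
The description above only says that consistency is based on the proportion of residue pairs per column. One possible reading, sketched below, is a sum-of-pairs column score: for each column of a reference alignment, the fraction of its aligned residue pairs that are also aligned in another alignment. This is an illustration under that assumption, not the trimAl implementation; `Msa`, `residueIndex` and `columnConsistency` are hypothetical names, `'-'` is assumed to be the only gap symbol, and both alignments are assumed to hold the same sequences in the same order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation: one row per sequence, '-' marks a gap.
// All rows of one alignment are assumed to have the same length.
using Msa = std::vector<std::string>;

// For every sequence, map each column to the index of the residue it holds,
// or -1 where the column is a gap.
static std::vector<std::vector<int>> residueIndex(const Msa &msa) {
    std::vector<std::vector<int>> idx(msa.size());
    for (std::size_t s = 0; s < msa.size(); ++s) {
        int next = 0;
        for (char c : msa[s]) idx[s].push_back(c == '-' ? -1 : next++);
    }
    return idx;
}

// Fraction of the residue pairs aligned in each column of `ref` that are
// also aligned with each other in `other`. Columns without any residue
// pair get a score of 0.
std::vector<float> columnConsistency(const Msa &ref, const Msa &other) {
    const auto refIdx = residueIndex(ref);
    const auto othIdx = residueIndex(other);

    // colOf[s][r]: column of `other` that holds residue r of sequence s.
    std::vector<std::vector<int>> colOf(other.size());
    for (std::size_t s = 0; s < other.size(); ++s)
        for (std::size_t c = 0; c < othIdx[s].size(); ++c)
            if (othIdx[s][c] >= 0) colOf[s].push_back(static_cast<int>(c));

    const std::size_t cols = ref.empty() ? 0 : ref[0].size();
    std::vector<float> score(cols, 0.0f);
    for (std::size_t c = 0; c < cols; ++c) {
        int pairs = 0, hits = 0;
        for (std::size_t s = 0; s < ref.size(); ++s) {
            if (refIdx[s][c] < 0) continue;  // gap in sequence s
            for (std::size_t t = s + 1; t < ref.size(); ++t) {
                if (refIdx[t][c] < 0) continue;  // gap in sequence t
                ++pairs;
                // Where does `other` place this residue of s, and does it
                // put the same residue of t in that column?
                const int colS = colOf[s][refIdx[s][c]];
                if (othIdx[t][colS] == refIdx[t][c]) ++hits;
            }
        }
        if (pairs > 0) score[c] = static_cast<float>(hits) / pairs;
    }
    return score;
}
```

Under this reading, an alignment could be ranked by averaging its column scores against every other alignment in the set and picking the highest mean; the page does not state which aggregate is actually used.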

Definition at line 241 of file Consistency.cpp.

References debug, DifferentNumberOfSequencesInCompareset, DifferentSeqsNamesInCompareset, Alignment::sequencesMatrix::getColumn(), Alignment::getNumAminos(), Alignment::getNumSpecies(), Alignment::getSequenceNameOrder(), Alignment::getSequences(), utils::initlVect(), reporting::reportManager::report(), Alignment::SequencesMatrix, and Alignment::sequencesMatrix::setOrder().

Referenced by perform().

◆ forceComparison()

bool statistics::Consistency::forceComparison ( Alignment **  vectAlignments,
int  numAlignments,
Alignment *  selected,
float *  columnsValue 
)
static private

Method to obtain the consistency values vector for a given alignment against a set of alignments with the same sequences.

Parameters
vectAlignments  Alignment vector to compare against the selected alignment.
numAlignments  Number of alignments to compare.
selected  Alignment to compare against the set of alignments.
[out] columnsValue  Vector to fill with the consistency values.
Returns
Whether or not the method completed successfully.

Definition at line 416 of file Consistency.cpp.

References debug, DifferentNumberOfSequencesInCompareset, DifferentSeqsNamesInCompareset, Alignment::sequencesMatrix::getColumn(), Alignment::getNumAminos(), Alignment::getNumSpecies(), Alignment::getSequenceNameOrder(), Alignment::getSequences(), utils::initlVect(), reporting::reportManager::report(), Alignment::SequencesMatrix, Alignment::sequencesMatrix::sequencesMatrix(), and Alignment::sequencesMatrix::setOrder().

Referenced by perform().

◆ getValues()

float * statistics::Consistency::getValues ( )

Stat getter.

Returns
If a window has been set and applied, returns the windowed values. If a window has been set but not applied, applies it and returns the windowed values. If no window has been set, returns the raw values.

Definition at line 588 of file Consistency.cpp.

References applyWindow(), halfWindow, isWindowDefined(), values, and values_windowed.

Referenced by Alignment::alignmentSummaryHTML(), Alignment::alignmentSummarySVG(), trimAlManager::CleanResiduesNonAuto(), trimAlManager::print_statistics(), and Alignment::statSVG().

◆ isWindowDefined()

bool statistics::Consistency::isWindowDefined ( )
private

Method to check whether or not a window has been applied.

Returns
True if halfWindow is different from -1.

Definition at line 580 of file Consistency.cpp.

References halfWindow.

Referenced by getValues().

◆ perform()

bool statistics::Consistency::perform ( char *  comparesetFilePath,
FormatHandling::FormatManager &  formatManager,
trimAlManager &  manager,
char *  forceFile 
)

Method to compare a set of MSA, all containing the same sequences and residues.
The number of residues must be the same, but gaps are not taken into account.
This is due to the same sequence being aligned in different ways, which changes the gap patterns.

Parameters
comparesetFilePath  Path to the file containing the paths of the alignments to compare, one per line.
formatManager  Format manager, to load and save the alignments.
manager  trimAl manager, to store the chosen alignment in trimAlManager::origAlig.
forceFile  Path to the file to forcefully select. If nullptr, the most consistent alignment will be selected.
Note
The selected alignment is not returned; it is stored in trimAlManager::origAlig. The method also stores the current instance of statisticsConsistency in the selected alignment, which allows this information to be used for trimming or shown in the HTML/SVG reports.
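
The requirement that all alignments hold the same number of residues once gaps are ignored can be checked as in the sketch below. It is an illustration only: `Msa`, `countResidues` and `sameResidueCount` are hypothetical names, and `'-'` is assumed to be the only gap symbol.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation: one row per sequence, '-' marks a gap.
using Msa = std::vector<std::string>;

// Total number of non-gap characters in an alignment.
std::size_t countResidues(const Msa &msa) {
    std::size_t total = 0;
    for (const std::string &row : msa) {
        const auto gaps = std::count(row.begin(), row.end(), '-');
        total += row.size() - static_cast<std::size_t>(gaps);
    }
    return total;
}

// True when every alignment has as many residues as the first one.
bool sameResidueCount(const std::vector<Msa> &alignments) {
    for (std::size_t i = 1; i < alignments.size(); ++i)
        if (countResidues(alignments[i]) != countResidues(alignments[0]))
            return false;
    return true;
}
```

Two alignments of the same sequences pass this check even when their column counts differ, since only the gap placement changes between them.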

Definition at line 45 of file Consistency.cpp.

References alig, Alignment::Alignment(), AlignmentTypesNotMatching, trimAlManager::appearErrors, appearErrors, applyWindow(), compareAlignmentsArray, compareAndChoose(), ComparesetFailedAlignmentMissing, statistics::Manager::consistency, trimAlManager::consistencyWindow, trimAlManager::CS, debug, forceComparison(), Alignment::getAlignmentType(), FormatHandling::FormatManager::getFileFormatName(), Alignment::getNumAminos(), i, Alignment::isFileAligned(), FormatHandling::FormatManager::loadAlignment(), maxResidues, NotAligned, NotDefined, numFiles, trimAlManager::oformats, trimAlManager::origAlig, Alignment::originalNumberOfResidues, trimAlManager::outfile, reporting::reportManager::report(), residues, Alignment::SequencesMatrix, Alignment::sequencesMatrix::sequencesMatrix(), Alignment::Statistics, trimAlManager::stats, values, and trimAlManager::windowSize.

Referenced by trimAlManager::performCompareset().

◆ printStatisticsFileAcl()

void statistics::Consistency::printStatisticsFileAcl ( Alignment &  alig,
float *  compareVect 
)
static

Print the accumulated consistency value from the selected alignment.

Parameters
alig  Alignment used to obtain the accumulated consistency value.
compareVect  Vector containing the consistency value for each column.

Definition at line 666 of file Consistency.cpp.

References utils::copyVect(), Alignment::filename, Alignment::numberOfResidues, and utils::quicksort().

Referenced by trimAlManager::print_statistics().

◆ printStatisticsFileColumns()

void statistics::Consistency::printStatisticsFileColumns ( Alignment &  alig,
float *  compareVect 
)
static

Print the consistency value for each column from the selected alignment.

Parameters
alig  Alignment used to obtain the accumulated consistency value.
compareVect  Vector containing the consistency value for each column.

Definition at line 606 of file Consistency.cpp.

References Alignment::filename, and Alignment::numberOfResidues.

Referenced by trimAlManager::print_statistics().

Member Data Documentation

◆ alig

Alignment* statistics::Consistency::alig = nullptr
private

Original alignment for which the stat was calculated.

Definition at line 127 of file Consistency.h.

Referenced by Consistency(), perform(), and ~Consistency().

◆ appearErrors

bool statistics::Consistency::appearErrors = false
private

Intermediate variable to keep track of the progress status.

Definition at line 154 of file Consistency.h.

Referenced by perform().

◆ compareAlignmentsArray

Alignment** statistics::Consistency::compareAlignmentsArray = nullptr
private

Array of alignments to compare.

Definition at line 130 of file Consistency.h.

Referenced by perform().

◆ halfWindow

int statistics::Consistency::halfWindow = -1
private

Half size of the window applied to the consistency values; -1 when no window has been set.

Definition at line 146 of file Consistency.h.

Referenced by applyWindow(), getValues(), and isWindowDefined().

◆ i

int statistics::Consistency::i = 0
private

Temporary variable used on loops.

Definition at line 142 of file Consistency.h.

Referenced by perform().

◆ maxResidues

int statistics::Consistency::maxResidues = 0
private

Maximum number of residues on the whole dataset.

Definition at line 144 of file Consistency.h.

Referenced by perform().

◆ numFiles

int statistics::Consistency::numFiles = 0
private

Number of files to compare.

Definition at line 140 of file Consistency.h.

Referenced by perform().

◆ refCounter

int* statistics::Consistency::refCounter
private

Counter of how many statisticsConsistency share the same MDK values.
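
A shared counter like this is typically used so that several instances can point at the same value buffers, with the last one to be destroyed freeing them. The sketch below illustrates that pattern under the assumption that the pointer-taking constructor shares buffers and the destructor releases them when the count reaches zero; this page only lists which members those functions reference. `SharedStat` is a hypothetical miniature, not the class itself.

```cpp
// Hypothetical miniature of a manually reference-counted buffer.
struct SharedStat {
    float *values = nullptr;    // buffer shared between instances
    int *refCounter = nullptr;  // number of instances using `values`

    // Fresh instance: owns a new zero-initialised buffer, count starts at 1.
    explicit SharedStat(int n)
        : values(new float[n]()), refCounter(new int(1)) {}

    // Sharing constructor: reuse the other instance's buffer and bump the count.
    explicit SharedStat(SharedStat *other)
        : values(other->values), refCounter(other->refCounter) {
        ++*refCounter;
    }

    // Plain copies are disabled so the count cannot be bypassed.
    SharedStat(const SharedStat &) = delete;
    SharedStat &operator=(const SharedStat &) = delete;

    // The last instance to go away releases the shared memory.
    ~SharedStat() {
        if (--*refCounter == 0) {
            delete[] values;
            delete refCounter;
        }
    }
};
```

In new code, `std::shared_ptr` would express the same ownership without a hand-written counter.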

Definition at line 151 of file Consistency.h.

Referenced by Consistency(), and ~Consistency().

◆ residues

int statistics::Consistency::residues = -1
private

Number of residues of the selected alignment.

Definition at line 148 of file Consistency.h.

Referenced by applyWindow(), and perform().

◆ values

float* statistics::Consistency::values = nullptr
private

Raw consistency values.

Definition at line 133 of file Consistency.h.

Referenced by applyWindow(), Consistency(), getValues(), perform(), and ~Consistency().

◆ values_windowed

float* statistics::Consistency::values_windowed = nullptr
private

Windowed consistency values.

Definition at line 136 of file Consistency.h.

Referenced by applyWindow(), Consistency(), getValues(), and ~Consistency().


The documentation for this class was generated from the following files: