vg
tools for working with variation graphs
#include <gbwt_extender.hpp>
Classes | |
| struct | ErrorModel |
Public Member Functions | |
| WFAExtender () | |
| Create an empty WFAExtender. | |
| WFAExtender (const gbwtgraph::GBWTGraph &graph, const Aligner &aligner, const ErrorModel &error_model=default_error_model) | |
| WFAAlignment | connect (std::string sequence, pos_t from, pos_t to) const |
| WFAAlignment | suffix (const std::string &sequence, pos_t from) const |
| WFAAlignment | prefix (const std::string &sequence, pos_t to) const |
Public Attributes | |
| const gbwtgraph::GBWTGraph * | graph |
| ReadMasker | mask |
| const Aligner * | aligner |
| const ErrorModel * | error_model |
Static Public Attributes | |
| static const ErrorModel | default_error_model |
| If not specified, we use this default error model. | |
A class that supports haplotype-consistent seed extension in a GBWTGraph using the WFA algorithm:
Marco-Sola, Moure, Moreto, Espinosa: Fast gap-affine pairwise alignment using the wavefront algorithm. Bioinformatics, 2021.
The algorithm either tries to connect two seeds or extends a seed to the start/end of the read.
WFAExtender also needs an Aligner object for scoring the extension candidates. While VG wants to maximize a four-parameter alignment score, WFA minimizes a three-parameter score. We use the conversion between the parameters from:
Eizenga, Paten: Improving the time and space complexity of the WFA algorithm and generalizing its scoring. bioRxiv, 2022.
VG scores a gap of length n as gap_open + (n - 1) * gap_extend, while WFA papers use gap_open + n * gap_extend. Hence we use gap_open - gap_extend as the effective four-parameter gap open score inside the aligner.
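The conversion described above can be sketched as follows. This is a minimal illustration, not the vg implementation: the names `VgScores`, `WfaPenalties`, `to_wfa`, `vg_score`, and `wfa_penalty` are hypothetical, and all four VG parameters are assumed to be given as positive magnitudes. The formulas rely on the identity 2 * score = match * (query length + target length) - penalty for an end-to-end alignment, with the gap open score adjusted by gap_extend as explained above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical containers for illustration; these are not vg types.
struct VgScores {
    // Positive magnitudes of the four-parameter (maximized) score.
    std::int32_t match, mismatch, gap_open, gap_extend;
};

struct WfaPenalties {
    // Three-parameter (minimized) penalties; a match costs nothing.
    std::int32_t mismatch, gap_open, gap_extend;
};

// Convert four-parameter scores into three-parameter penalties.
inline WfaPenalties to_wfa(const VgScores& s) {
    // VG charges gap_open + (n - 1) * gap_extend for a gap of length n.
    // WFA charges gap_open + n * gap_extend, so the open score shrinks by gap_extend.
    std::int32_t effective_open = s.gap_open - s.gap_extend;
    return { 2 * (s.match + s.mismatch),
             2 * effective_open,
             2 * s.gap_extend + s.match };
}

// VG-style score of an alignment with the given numbers of matches and
// mismatches and the given gap lengths.
inline std::int32_t vg_score(const VgScores& s, std::int32_t matches,
                             std::int32_t mismatches,
                             const std::vector<std::int32_t>& gap_lengths) {
    std::int32_t score = matches * s.match - mismatches * s.mismatch;
    for (std::int32_t n : gap_lengths) {
        score -= s.gap_open + (n - 1) * s.gap_extend;
    }
    return score;
}

// WFA-style penalty of the same alignment.
inline std::int32_t wfa_penalty(const WfaPenalties& p, std::int32_t mismatches,
                                const std::vector<std::int32_t>& gap_lengths) {
    std::int32_t penalty = mismatches * p.mismatch;
    for (std::int32_t n : gap_lengths) {
        penalty += p.gap_open + n * p.gap_extend;
    }
    return penalty;
}
```

For example, with match = 1, mismatch = 4, gap_open = 6, and gap_extend = 1, the sketch yields the penalties 10, 10, and 3. An alignment with 10 matches, 2 mismatches, and gaps of lengths 3 and 1 covers 28 characters in total, has VG score -12 and WFA penalty 52, and satisfies 2 * (-12) = 1 * 28 - 52.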
NOTE: Most internal arithmetic operations use 32-bit integers.
| vg::WFAExtender::WFAExtender | ( | ) |
Create an empty WFAExtender.
| vg::WFAExtender::WFAExtender | ( | const gbwtgraph::GBWTGraph & | graph, |
| const Aligner & | aligner, | ||
| const ErrorModel & | error_model = default_error_model |
||
| ) |
Create a WFAExtender using the given GBWTGraph and Aligner objects. If an error model is passed, use that instead of the default error model. All arguments must outlive the WFAExtender.
| WFAAlignment vg::WFAExtender::connect | ( | std::string | sequence, |
| pos_t | from, | ||
| pos_t | to | ||
| ) | const |
Align the sequence to a haplotype between the two graph positions.
The endpoints are assumed to be valid graph positions. In order for there to be an alignment, there must be a haplotype that includes the endpoints and connects them. However, the endpoints are not covered by the returned alignment.
The sequence that will be aligned is passed by value. All non-ACGT characters are masked with character X, which should not match any character in the graph.
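The masking step can be illustrated with a small standalone sketch. The helper `mask_non_acgt` is hypothetical and is not the ReadMasker interface; it also assumes that only uppercase A, C, G, and T count as valid, which the documentation above does not state.

```cpp
#include <string>

// Hypothetical helper for illustration; not the vg ReadMasker API.
// Takes the sequence by value, mirroring connect(), and replaces every
// character other than uppercase A, C, G, T with 'X'.
inline std::string mask_non_acgt(std::string sequence) {
    for (char& c : sequence) {
        switch (c) {
            case 'A': case 'C': case 'G': case 'T':
                break;      // valid base, keep as is
            default:
                c = 'X';    // anything else is masked
        }
    }
    return sequence;
}
```

Because the graph is not expected to contain X, every masked position can only be aligned as a mismatch or as part of a gap.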
Returns a failed alignment if there is no alignment with an acceptable score.
NOTE: The alignment is to a path that starts after position from and ends before position to. If the two positions are identical, such a path can only exist if there is a cycle.
| WFAAlignment vg::WFAExtender::prefix | ( | const std::string & | sequence, |
| pos_t | to | ||
| ) | const |
A special case of connect() for aligning the sequence to a haplotype ending at the given position. If there is no alignment for the entire sequence with an acceptable score, returns the highest-scoring partial alignment, which may be empty.
Applies the full-length bonus if the result begins with a match or mismatch. TODO: Use the full-length bonus to determine the optimal alignment.
NOTE: This creates a prefix of the full alignment by aligning a suffix of the sequence.
| WFAAlignment vg::WFAExtender::suffix | ( | const std::string & | sequence, |
| pos_t | from | ||
| ) | const |
A special case of connect() for aligning the sequence to a haplotype starting at the given position. If there is no alignment for the entire sequence with an acceptable score, returns the highest-scoring partial alignment, which may be empty.
Applies the full-length bonus if the result ends with a match or mismatch. TODO: Use the full-length bonus to determine the optimal alignment.
NOTE: This creates a suffix of the full alignment by aligning a prefix of the sequence.
| const Aligner* vg::WFAExtender::aligner |
| const ErrorModel vg::WFAExtender::default_error_model | static |
If not specified, we use this default error model.
| const ErrorModel* vg::WFAExtender::error_model |
| const gbwtgraph::GBWTGraph* vg::WFAExtender::graph |
| ReadMasker vg::WFAExtender::mask |