#include <recombinator.hpp>
A class that creates synthetic haplotypes from a Haplotypes
representation of local haplotypes.
◆ sequence_type
A GBWT sequence as (sequence identifier, offset in a node).
◆ Verbosity
The amount of progress information that should be printed to stderr.
◆ kmer_presence
Kmer classification.
| Enumerator |
|---|
| absent |
| heterozygous |
| present |
| frequent |
◆ Recombinator()
vg::Recombinator::Recombinator(const gbwtgraph::GBZ& gbz, const Haplotypes& haplotypes, Verbosity verbosity)
◆ classify_kmers()
std::vector<char> vg::Recombinator::classify_kmers(const std::string& kff_file, const Parameters& parameters) const
Classifies the kmers used for describing the haplotypes according to their frequency in the KFF file. Uses A, H, P, and F to represent absent, heterozygous, present, and frequent kmers, respectively.
Throws std::runtime_error on error.
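As a rough illustration of the four-way classification, the sketch below maps kmer counts to the characters A, H, P, and F relative to an expected coverage. The cutoffs are assumptions made for this example only, since the rules actually applied by classify_kmers() are not stated here, and `classify_by_count` is a hypothetical helper that is not part of vg.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of vg: classifies kmer counts relative to an
// expected coverage. All cutoffs below are illustrative assumptions.
std::vector<char> classify_by_count(const std::vector<std::size_t>& counts,
                                    std::size_t coverage) {
    assert(coverage > 0);
    std::vector<char> result;
    result.reserve(counts.size());
    for (std::size_t count : counts) {
        double relative = static_cast<double>(count) / static_cast<double>(coverage);
        if (relative < 0.25) {
            result.push_back('A'); // absent
        } else if (relative < 0.75) {
            result.push_back('H'); // heterozygous
        } else if (relative < 1.5) {
            result.push_back('P'); // present
        } else {
            result.push_back('F'); // frequent
        }
    }
    return result;
}
```

With an expected coverage of 30, counts of 0, 15, 30, and 90 fall into the four classes in that order under these assumed cutoffs.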
◆ extract_sequences()
Extracts the local haplotypes in the given subchain. In addition to the haplotype sequence, this also reports the name of the corresponding path as well as (rank, score) for the haplotype in each round of haplotype selection. The number of rounds is parameters.num_haplotypes, but if the haplotype is selected earlier, it will not get further scores.
Throws std::runtime_error on error.
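The per-round (rank, score) bookkeeping described above can be pictured with a small model. This is a sketch under assumptions, not the vg implementation: `record_rounds` and `RankScore` are hypothetical names, ranks are taken to be 0-based, the scores for every round are supplied up front, and the top-ranked haplotype is the one selected in each round.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (rank, score) of one haplotype in one selection round (hypothetical type).
using RankScore = std::pair<std::size_t, double>;

// Hypothetical model: scores[r][h] is the score of haplotype h in round r.
// Each round ranks the haplotypes that are still unselected by descending
// score, records (rank, score) for each of them, and selects the best one.
// A selected haplotype receives no entries in later rounds.
std::vector<std::vector<RankScore>>
record_rounds(const std::vector<std::vector<double>>& scores) {
    std::size_t n = scores.empty() ? 0 : scores.front().size();
    std::vector<std::vector<RankScore>> history(n);
    std::vector<bool> selected(n, false);
    for (const std::vector<double>& round : scores) {
        assert(round.size() == n);
        std::vector<std::size_t> order;
        for (std::size_t h = 0; h < n; h++) {
            if (!selected[h]) { order.push_back(h); }
        }
        std::stable_sort(order.begin(), order.end(),
            [&round](std::size_t a, std::size_t b) { return round[a] > round[b]; });
        for (std::size_t rank = 0; rank < order.size(); rank++) {
            history[order[rank]].push_back(RankScore(rank, round[order[rank]]));
        }
        if (!order.empty()) { selected[order.front()] = true; }
    }
    return history;
}
```

In this model, a haplotype selected in the first round ends up with a single (rank, score) pair, while one that is never selected has one pair per round.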
◆ generate_haplotypes() [1/2]
◆ generate_haplotypes() [2/2]
gbwt::GBWT vg::Recombinator::generate_haplotypes(const std::string& kff_file, const Parameters& parameters) const
Generates haplotypes based on the kmer counts in the given KFF file.
Runs multiple GBWT construction jobs in parallel using OpenMP threads and generates the specified number of haplotypes in each top-level chain (component).
Each generated haplotype has a single source haplotype in each subchain. The subchains are connected by unary paths. Suffix / prefix subchains in the middle of a chain create fragment breaks. If the chain starts without a prefix (ends without a suffix), the haplotype chosen for the first (last) subchain is used from the start (continued until the end).
Throws std::runtime_error on error in single-threaded parts and exits with std::exit(EXIT_FAILURE) in multi-threaded parts.
◆ ABSENT_SCORE
static constexpr double vg::Recombinator::ABSENT_SCORE = 0.8
Score for getting an absent kmer right/wrong. This should be less than 1, if we assume that having the right variants in the graph is more important than keeping wrong variants out.
◆ COVERAGE
static constexpr size_t vg::Recombinator::COVERAGE = 0
Expected kmer coverage. Use 0 to estimate from kmer counts.
◆ gbz
const gbwtgraph::GBZ& vg::Recombinator::gbz
◆ haplotypes
◆ HET_ADJUSTMENT
static constexpr double vg::Recombinator::HET_ADJUSTMENT = 0.05
Adjustment to the score of a heterozygous kmer every time a haplotype with (-) or without (+) that kmer is selected.
◆ KFF_BLOCK_SIZE
static constexpr size_t vg::Recombinator::KFF_BLOCK_SIZE = 1000000
Block size (in kmers) for reading KFF files.
◆ NUM_CANDIDATES
static constexpr size_t vg::Recombinator::NUM_CANDIDATES = 32
A reasonable number of candidates for diploid sampling.
◆ NUM_HAPLOTYPES
static constexpr size_t vg::Recombinator::NUM_HAPLOTYPES = 4
Number of haplotypes to be generated.
◆ PRESENT_DISCOUNT
static constexpr double vg::Recombinator::PRESENT_DISCOUNT = 0.9
Multiplier to the score of a present kmer every time a haplotype with that kmer is selected.
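One way to read ABSENT_SCORE, HET_ADJUSTMENT, and PRESENT_DISCOUNT together is as a greedy scoring loop in which kmer weights change after each selected haplotype. The sketch below is an assumed model built only from the descriptions of these constants; the functions `initial_weights`, `score_haplotype`, and `update_weights`, the starting weights (1.0 for present kmers, 0.0 for heterozygous kmers), and the choice to ignore frequent kmers are all hypothetical and may differ from what vg does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Values copied from the documented defaults.
constexpr double ABSENT_SCORE = 0.8;
constexpr double HET_ADJUSTMENT = 0.05;
constexpr double PRESENT_DISCOUNT = 0.9;

// Assumed starting weights: 1.0 for present ('P') kmers, 0.0 otherwise.
std::vector<double> initial_weights(const std::vector<char>& types) {
    std::vector<double> weights(types.size(), 0.0);
    for (std::size_t i = 0; i < types.size(); i++) {
        if (types[i] == 'P') { weights[i] = 1.0; }
    }
    return weights;
}

// Scores a haplotype: present and heterozygous kmers contribute +/- their
// current weight, absent kmers contribute -/+ ABSENT_SCORE, and frequent
// kmers are ignored in this sketch.
double score_haplotype(const std::vector<char>& types,
                       const std::vector<double>& weights,
                       const std::vector<bool>& has_kmer) {
    assert(types.size() == weights.size() && types.size() == has_kmer.size());
    double score = 0.0;
    for (std::size_t i = 0; i < types.size(); i++) {
        double sign = has_kmer[i] ? 1.0 : -1.0;
        switch (types[i]) {
            case 'P': case 'H': score += sign * weights[i]; break;
            case 'A': score -= sign * ABSENT_SCORE; break;
            default: break;
        }
    }
    return score;
}

// Adjusts the weights after a haplotype has been selected.
void update_weights(const std::vector<char>& types,
                    std::vector<double>& weights,
                    const std::vector<bool>& has_kmer) {
    for (std::size_t i = 0; i < types.size(); i++) {
        if (types[i] == 'P' && has_kmer[i]) {
            weights[i] *= PRESENT_DISCOUNT; // discount kmers already covered
        } else if (types[i] == 'H') {
            weights[i] += has_kmer[i] ? -HET_ADJUSTMENT : HET_ADJUSTMENT;
        }
    }
}
```

Under this model, selecting a haplotype makes a second haplotype with the same heterozygous kmers slightly less attractive, which nudges later rounds toward the other allele.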
◆ verbosity