vg
tools for working with variation graphs
#include <recombinator.hpp>
Classes | |
struct | Header |
Header of the serialized file. More... | |
struct | Subchain |
Representation of a subchain. More... | |
struct | TopLevelChain |
Representation of a top-level chain. More... | |
Public Types | |
enum | Verbosity : size_t { verbosity_silent = 0, verbosity_basic = 1, verbosity_detailed = 2, verbosity_debug = 3 } |
The amount of progress information that should be printed to stderr. More... | |
typedef std::pair< gbwt::size_type, gbwt::size_type > | sequence_type |
A GBWT sequence as (sequence identifier, offset in a node). More... | |
Public Member Functions | |
size_t | components () const |
Returns the number of weakly connected components. More... | |
size_t | jobs () const |
Returns the number of GBWT construction jobs. More... | |
size_t | k () const |
Returns the length of the kmers. More... | |
size_t | kmers () const |
Returns the number of kmers in the subchains. More... | |
hash_map< Subchain::kmer_type, size_t > | kmer_counts (const std::string &kff_file, Verbosity verbosity) const |
void | simple_sds_serialize (std::ostream &out) const |
Serializes the object to a stream in the simple-sds format. More... | |
void | simple_sds_load (std::istream &in) |
Loads the object from a stream in the simple-sds format. More... | |
size_t | simple_sds_size () const |
Returns the size of the object in elements. More... | |
Public Attributes | |
Header | header |
std::vector< size_t > | jobs_for_cached_paths |
std::vector< TopLevelChain > | chains |
A representation of the haplotypes in a graph.
The graph is partitioned into top-level chains, which are further partitioned into subchains. Each subchain contains a set of kmers and a collection of sequences. Each sequence is defined by a bitvector marking the kmers that are present.
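The layout described above can be pictured with a small, self-contained sketch. The `Toy*` types and the helper functions below are hypothetical simplifications introduced only for illustration; they are not the actual `Subchain` or `TopLevelChain` definitions, and a plain `std::vector<bool>` stands in for the bitvector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not the real vg::Haplotypes types.
using ToyKmer = std::uint64_t;

struct ToySequence {
    // present[i] is true if kmer i of the enclosing subchain occurs in this sequence.
    std::vector<bool> present;
};

struct ToySubchain {
    std::vector<ToyKmer> kmers;          // the kmers that belong to this subchain
    std::vector<ToySequence> sequences;  // one bitvector per sequence
};

struct ToyTopLevelChain {
    std::vector<ToySubchain> subchains;
};

// Returns the kmers of the subchain that are marked as present in the given sequence.
std::vector<ToyKmer> kmers_in_sequence(const ToySubchain& subchain, std::size_t sequence_id) {
    assert(sequence_id < subchain.sequences.size());
    const ToySequence& sequence = subchain.sequences[sequence_id];
    assert(sequence.present.size() == subchain.kmers.size());
    std::vector<ToyKmer> result;
    for (std::size_t i = 0; i < subchain.kmers.size(); i++) {
        if (sequence.present[i]) { result.push_back(subchain.kmers[i]); }
    }
    return result;
}

// Returns the total number of kmers over all subchains of all top-level chains.
std::size_t total_kmers(const std::vector<ToyTopLevelChain>& chains) {
    std::size_t result = 0;
    for (const ToyTopLevelChain& chain : chains) {
        for (const ToySubchain& subchain : chain.subchains) {
            result += subchain.kmers.size();
        }
    }
    return result;
}
```

In this sketch a sequence stores no bases of its own; it is recovered only as the subset of the subchain's kmers that its bitvector marks.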
At the moment, the kmers are minimizers with a single occurrence in the graph. The requirement is that each kmer is specific to a single subchain and does not occur anywhere else in either orientation. (If no haplotype crosses a snarl, that snarl is broken into a suffix and a prefix, and those subchains may share kmers.)
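The "either orientation" requirement can be illustrated with a minimal sketch that compares kmers by a canonical form, here the lexicographically smaller of a kmer and its reverse complement. The function names and the use of plain strings are assumptions made for the example, and the exception for subchains created by breaking a snarl is deliberately ignored.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Returns the reverse complement of a DNA kmer over the alphabet ACGT.
std::string reverse_complement(const std::string& kmer) {
    std::string result(kmer.rbegin(), kmer.rend());
    for (char& c : result) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break; // other characters are left unchanged
        }
    }
    return result;
}

// Returns the lexicographically smaller of the kmer and its reverse complement.
std::string canonical(const std::string& kmer) {
    std::string rc = reverse_complement(kmer);
    return (rc < kmer ? rc : kmer);
}

// Returns true if no kmer occurs in two different subchains in either orientation.
// subchains[i] lists the kmers of hypothetical subchain i.
bool kmers_are_subchain_specific(const std::vector<std::vector<std::string>>& subchains) {
    std::unordered_map<std::string, std::size_t> owner; // canonical kmer -> subchain index
    for (std::size_t i = 0; i < subchains.size(); i++) {
        for (const std::string& kmer : subchains[i]) {
            auto inserted = owner.emplace(canonical(kmer), i);
            if (!inserted.second && inserted.first->second != i) { return false; }
        }
    }
    return true;
}
```

Under this check, a subchain containing `AAC` and another containing `GTT` would violate the requirement, because the two kmers are reverse complements of each other.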
NOTE: This assumes that the top-level chains are linear, not cyclical.
Versions:
typedef std::pair<gbwt::size_type, gbwt::size_type> vg::Haplotypes::sequence_type |
A GBWT sequence as (sequence identifier, offset in a node).
enum vg::Haplotypes::Verbosity : size_t |
The amount of progress information that should be printed to stderr.
size_t vg::Haplotypes::components | ( | ) | const |
inline |
Returns the number of weakly connected components.
size_t vg::Haplotypes::jobs | ( | ) | const |
inline |
Returns the number of GBWT construction jobs.
size_t vg::Haplotypes::k | ( | ) | const |
inline |
Returns the length of the kmers.
hash_map< Haplotypes::Subchain::kmer_type, size_t > vg::Haplotypes::kmer_counts | ( | const std::string & | kff_file, |
Verbosity | verbosity | ||
) | const |
Returns a mapping from kmers to their counts in the given KFF file. The counts include both the kmer and the reverse complement.
Reads the KFF file using OpenMP threads. Exits with std::exit() if the file cannot be opened and throws std::runtime_error if the kmer counts cannot be used.
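To make "the counts include both the kmer and the reverse complement" concrete, the sketch below counts kmers from in-memory strings and stores every occurrence under a canonical key, so a lookup for either orientation returns the combined count. It does not read KFF files or use OpenMP, and every name in it is hypothetical; it only illustrates the counting convention, not how `kmer_counts()` is implemented.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Returns the reverse complement of a DNA string over the alphabet ACGT.
std::string revcomp(const std::string& kmer) {
    std::string result(kmer.rbegin(), kmer.rend());
    for (char& c : result) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break; // other characters are left unchanged
        }
    }
    return result;
}

// The key shared by a kmer and its reverse complement: the smaller of the two.
std::string canonical_key(const std::string& kmer) {
    std::string rc = revcomp(kmer);
    return (rc < kmer ? rc : kmer);
}

using ToyCounts = std::unordered_map<std::string, std::size_t>;

// Counts all kmers of length k in the reads, merging both orientations.
ToyCounts count_canonical_kmers(const std::vector<std::string>& reads, std::size_t k) {
    ToyCounts counts;
    if (k == 0) { return counts; }
    for (const std::string& read : reads) {
        if (read.size() < k) { continue; }
        for (std::size_t i = 0; i + k <= read.size(); i++) {
            counts[canonical_key(read.substr(i, k))]++;
        }
    }
    return counts;
}

// Returns the combined count of the kmer and its reverse complement, or 0 if unseen.
std::size_t combined_count(const ToyCounts& counts, const std::string& kmer) {
    auto iter = counts.find(canonical_key(kmer));
    return (iter == counts.end() ? 0 : iter->second);
}
```

With this convention, querying `AAC` and querying `GTT` give the same answer, which is the property a caller comparing graph kmers against read-derived counts would rely on.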
size_t vg::Haplotypes::kmers | ( | ) | const |
inline |
Returns the number of kmers in the subchains.
void vg::Haplotypes::simple_sds_load | ( | std::istream & | in | ) |
Loads the object from a stream in the simple-sds format.
void vg::Haplotypes::simple_sds_serialize | ( | std::ostream & | out | ) | const |
Serializes the object to a stream in the simple-sds format.
size_t vg::Haplotypes::simple_sds_size | ( | ) | const |
Returns the size of the object in elements.
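The three simple-sds members follow a serialize / load / size pattern that can be sketched on a toy type. The example assumes that an element is a 64-bit word and that a vector is written as its length followed by its items; it writes in host byte order and is not the actual on-disk layout of `vg::Haplotypes`. `ToyRecord` and `round_trip_ok` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical toy type mirroring the three simple-sds member functions.
struct ToyRecord {
    std::vector<std::uint64_t> values;

    // Writes the length as one element, followed by the values.
    void simple_sds_serialize(std::ostream& out) const {
        std::uint64_t length = values.size();
        out.write(reinterpret_cast<const char*>(&length), sizeof(length));
        if (length > 0) {
            out.write(reinterpret_cast<const char*>(values.data()),
                      static_cast<std::streamsize>(length * sizeof(std::uint64_t)));
        }
    }

    // Reads the length and then the values; throws if the stream is too short.
    void simple_sds_load(std::istream& in) {
        std::uint64_t length = 0;
        in.read(reinterpret_cast<char*>(&length), sizeof(length));
        if (!in) { throw std::runtime_error("ToyRecord: cannot read the length"); }
        std::vector<std::uint64_t> loaded(length);
        if (length > 0) {
            in.read(reinterpret_cast<char*>(loaded.data()),
                    static_cast<std::streamsize>(length * sizeof(std::uint64_t)));
            if (!in) { throw std::runtime_error("ToyRecord: cannot read the values"); }
        }
        values.swap(loaded);
    }

    // Size in 64-bit elements: one for the length plus one per value.
    std::size_t simple_sds_size() const { return 1 + values.size(); }
};

// Serializes to memory, loads a copy, and checks both the contents and the size.
bool round_trip_ok(const ToyRecord& original) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    original.simple_sds_serialize(buffer);
    ToyRecord copy;
    copy.simple_sds_load(buffer);
    return copy.values == original.values &&
           buffer.str().size() == original.simple_sds_size() * sizeof(std::uint64_t);
}
```

The point of the size function in this sketch is that it predicts the serialized size without writing anything, which is what the round-trip check verifies.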
std::vector<TopLevelChain> vg::Haplotypes::chains |
Header vg::Haplotypes::header |
std::vector<size_t> vg::Haplotypes::jobs_for_cached_paths |