vg
tools for working with variation graphs
vg::Haplotypes Class Reference

#include <recombinator.hpp>

Classes

struct  Header
 Header of the serialized file.
 
struct  Subchain
 Representation of a subchain.
 
struct  TopLevelChain
 Representation of a top-level chain.
 

Public Types

enum  Verbosity : size_t { verbosity_silent = 0, verbosity_basic = 1, verbosity_detailed = 2, verbosity_debug = 3 }
 The amount of progress information that should be printed to stderr.
 
typedef std::pair< gbwt::size_type, gbwt::size_type > sequence_type
 A GBWT sequence as (sequence identifier, offset in a node).
 

Public Member Functions

size_t components () const
 Returns the number of weakly connected components.
 
size_t jobs () const
 Returns the number of GBWT construction jobs.
 
size_t k () const
 Returns the length of the kmers.
 
size_t kmers () const
 Returns the number of kmers in the subchains.
 
hash_map< Subchain::kmer_type, size_t > kmer_counts (const std::string &kff_file, Verbosity verbosity) const
 
void simple_sds_serialize (std::ostream &out) const
 Serializes the object to a stream in the simple-sds format.
 
void simple_sds_load (std::istream &in)
 Loads the object from a stream in the simple-sds format.
 
size_t simple_sds_size () const
 Returns the size of the object in elements.
 

Public Attributes

Header header
 
std::vector< size_t > jobs_for_cached_paths
 
std::vector< TopLevelChain > chains
 

Detailed Description

A representation of the haplotypes in a graph.

The graph is partitioned into top-level chains, which are further partitioned into subchains. Each subchain contains a set of kmers and a collection of sequences. Each sequence is defined by a bitvector marking the kmers that are present.
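The relationship between kmers and sequences within one subchain can be pictured with a small toy model. The sketch below is only an illustration: the type ToySubchain and its members are hypothetical and do not reflect the actual layout of Subchain, which stores its data in a compressed form.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model of one subchain; not the layout used by vg::Haplotypes::Subchain.
struct ToySubchain {
    std::vector<std::string> kmers;            // Kmers of this subchain, in a fixed order.
    std::vector<std::vector<bool>> sequences;  // One presence bitvector per sequence.

    // Adds a sequence given as a bitvector with one bit per kmer.
    void add_sequence(const std::vector<bool>& present) {
        assert(present.size() == kmers.size());
        sequences.push_back(present);
    }

    // Returns the kmers that are marked as present in the given sequence.
    std::vector<std::string> kmers_in(std::size_t sequence) const {
        assert(sequence < sequences.size());
        std::vector<std::string> result;
        for (std::size_t i = 0; i < kmers.size(); i++) {
            if (sequences[sequence][i]) { result.push_back(kmers[i]); }
        }
        return result;
    }
};
```

With kmers {"AAC", "CGA", "GTA"} and a sequence added as {true, false, true}, kmers_in(0) yields "AAC" and "GTA".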

At the moment, the kmers are minimizers with a single occurrence in the graph. The requirement is that each kmer is specific to a single subchain and does not occur anywhere else in either orientation. (If no haplotype crosses a snarl, that snarl is broken into a suffix and a prefix, and those subchains may share kmers.)

NOTE: This assumes that the top-level chains are linear, not cyclical.

Versions:

Member Typedef Documentation

◆ sequence_type

typedef std::pair<gbwt::size_type, gbwt::size_type> vg::Haplotypes::sequence_type

A GBWT sequence as (sequence identifier, offset in a node).

Member Enumeration Documentation

◆ Verbosity

The amount of progress information that should be printed to stderr.

Enumerator
verbosity_silent 

No progress information.

verbosity_basic 

Basic information.

verbosity_detailed 

Basic information and detailed statistics.

verbosity_debug 

Basic information, detailed statistics, and debug information.
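Because the enumerators are ordered from least to most verbose, a caller can gate its output with a simple comparison. The following is a minimal sketch of that pattern. The enum is redefined locally so that the example compiles on its own, and the helper report() is hypothetical, not part of vg.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Mirrors the enumerator values listed above; redefined so the sketch is self-contained.
enum Verbosity : std::size_t {
    verbosity_silent = 0, verbosity_basic = 1, verbosity_detailed = 2, verbosity_debug = 3
};

// Hypothetical helper: prints the message to stderr if the current level is at
// least the required level. Returns true if the message was printed.
bool report(Verbosity current, Verbosity required, const std::string& message) {
    // A message that requires verbosity_silent would always be printed.
    assert(required != verbosity_silent);
    if (current < required) { return false; }
    std::cerr << message << std::endl;
    return true;
}
```

For example, report(verbosity_basic, verbosity_debug, "...") prints nothing, while report(verbosity_debug, verbosity_detailed, "...") prints the message.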

Member Function Documentation

◆ components()

size_t vg::Haplotypes::components ( ) const
inline

Returns the number of weakly connected components.

◆ jobs()

size_t vg::Haplotypes::jobs ( ) const
inline

Returns the number of GBWT construction jobs.

◆ k()

size_t vg::Haplotypes::k ( ) const
inline

Returns the length of the kmers.

◆ kmer_counts()

hash_map< Haplotypes::Subchain::kmer_type, size_t > vg::Haplotypes::kmer_counts ( const std::string &  kff_file,
Verbosity  verbosity 
) const

Returns a mapping from kmers to their counts in the given KFF file. Each count covers both the kmer and its reverse complement.

Reads the KFF file using OpenMP threads. Exits with std::exit() if the file cannot be opened and throws std::runtime_error if the kmer counts cannot be used.
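The way a count combines a kmer with its reverse complement can be sketched as follows. This is not the implementation of kmer_counts(): it does not read KFF files or use OpenMP, kmers are plain strings rather than Subchain::kmer_type, and the names reverse_complement() and combined_counts() are hypothetical. The sketch also assumes that a kmer equal to its own reverse complement is counted once, which the documentation above does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Returns the reverse complement of a DNA string over the alphabet ACGT.
std::string reverse_complement(const std::string& kmer) {
    std::string result(kmer.rbegin(), kmer.rend());
    for (char& c : result) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: assert(false && "unexpected character");
        }
    }
    return result;
}

// Hypothetical stand-in for counts read from a file: raw_counts maps each kmer
// to its own count. For each query kmer, the result is the sum of the counts of
// the kmer and of its reverse complement.
std::unordered_map<std::string, std::size_t> combined_counts(
    const std::unordered_map<std::string, std::size_t>& raw_counts,
    const std::vector<std::string>& queries
) {
    std::unordered_map<std::string, std::size_t> result;
    for (const std::string& kmer : queries) {
        std::size_t count = 0;
        auto iter = raw_counts.find(kmer);
        if (iter != raw_counts.end()) { count += iter->second; }
        std::string rc = reverse_complement(kmer);
        // Assumption: a kmer that is its own reverse complement is counted once.
        if (rc != kmer) {
            iter = raw_counts.find(rc);
            if (iter != raw_counts.end()) { count += iter->second; }
        }
        result[kmer] = count;
    }
    return result;
}
```

Given raw counts of 3 for "AAC" and 2 for "GTT", the combined count for the query "AAC" is 5, because "GTT" is the reverse complement of "AAC".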

◆ kmers()

size_t vg::Haplotypes::kmers ( ) const
inline

Returns the number of kmers in the subchains.

◆ simple_sds_load()

void vg::Haplotypes::simple_sds_load ( std::istream &  in)

Loads the object from a stream in the simple-sds format.

◆ simple_sds_serialize()

void vg::Haplotypes::simple_sds_serialize ( std::ostream &  out) const

Serializes the object to a stream in the simple-sds format.

◆ simple_sds_size()

size_t vg::Haplotypes::simple_sds_size ( ) const

Returns the size of the object in elements.
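The three simple_sds_*() members follow a common pattern: serialize to a stream, load from a stream, and report the serialized size in elements. The sketch below shows that pattern on a hypothetical toy type, assuming (as in the simple-sds specification) that an element is a 64-bit word. The layout used here, a length element followed by the items in native byte order, is an illustration only and is not the layout of Haplotypes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical toy type; its layout is an assumption made for illustration.
struct ToyRecord {
    std::vector<std::uint64_t> values;

    // Writes the number of items as one element, followed by the items.
    void simple_sds_serialize(std::ostream& out) const {
        std::uint64_t length = values.size();
        out.write(reinterpret_cast<const char*>(&length), sizeof(length));
        out.write(reinterpret_cast<const char*>(values.data()),
                  values.size() * sizeof(std::uint64_t));
    }

    // Reads the length and then the items; throws if the stream runs out of data.
    void simple_sds_load(std::istream& in) {
        std::uint64_t length = 0;
        in.read(reinterpret_cast<char*>(&length), sizeof(length));
        if (!in) { throw std::runtime_error("ToyRecord: cannot read the length"); }
        std::vector<std::uint64_t> loaded(length, 0);
        in.read(reinterpret_cast<char*>(loaded.data()), length * sizeof(std::uint64_t));
        if (!in) { throw std::runtime_error("ToyRecord: cannot read the items"); }
        values.swap(loaded);
    }

    // Size in 64-bit elements: one for the length plus one per item.
    std::size_t simple_sds_size() const { return 1 + values.size(); }
};
```

Serializing a ToyRecord into a std::stringstream and loading it into a second object should reproduce the same values, and the number of bytes written should equal simple_sds_size() times 8.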

Member Data Documentation

◆ chains

std::vector<TopLevelChain> vg::Haplotypes::chains

◆ header

Header vg::Haplotypes::header

◆ jobs_for_cached_paths

std::vector<size_t> vg::Haplotypes::jobs_for_cached_paths
