vg
tools for working with variation graphs
#include <vcf_buffer.hpp>
Public Member Functions | |
WindowedVcfBuffer (vcflib::VariantCallFile *file, size_t window_size) | |
bool | next () |
tuple< vector< vcflib::Variant * >, vcflib::Variant *, vector< vcflib::Variant * > > | get () |
tuple< vector< vcflib::Variant * >, vcflib::Variant *, vector< vcflib::Variant * > > | get_nonoverlapping () |
const vector< vector< int > > & | get_parsed_genotypes (vcflib::Variant *variant) |
bool | has_tabix () const |
bool | set_region (const string &contig, int64_t start=-1, int64_t end=-1) |
Static Protected Member Functions | |
static vector< int > | decompose_genotype_fast (const string &genotype) |
Protected Attributes | |
VcfBuffer | reader |
size_t | window_size |
list< unique_ptr< vcflib::Variant > > | variants_before |
list< unique_ptr< vcflib::Variant > > | variants_after |
unique_ptr< vcflib::Variant > | current |
map< vcflib::Variant *, vector< vector< int > > > | cached_genotypes |
vector< size_t > | map_order_to_original |
Private Member Functions | |
WindowedVcfBuffer (const WindowedVcfBuffer &other)=delete | |
WindowedVcfBuffer & | operator= (const WindowedVcfBuffer &other)=delete |
Provides a look-around buffer for VCFs where you can look at each variant in the context of nearby variants.
Also caches parsed genotypes, so you can iterate over them efficiently without re-parsing them each time.
vg::WindowedVcfBuffer::WindowedVcfBuffer | ( | vcflib::VariantCallFile * | file, size_t | window_size | ) |
Make a new WindowedVcfBuffer buffering the file at the given pointer (which must outlive the buffer, but which may be null). The VCF in the file must be sorted, but may contain overlapping variants.
vg::WindowedVcfBuffer::WindowedVcfBuffer | ( | const WindowedVcfBuffer & | other | ) | private, delete |
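static vector< int > vg::WindowedVcfBuffer::decompose_genotype_fast | ( | const string & | genotype | ) | static, protected |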
Quickly decompose a genotype without any string copies.
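To illustrate what decomposing a genotype without string copies can look like, here is a minimal sketch. It is not vg's implementation: the name `decompose_genotype_sketch` is hypothetical, and reporting a missing allele (`.`) as -1 is an assumption. The function scans the GT string once and accumulates each allele index digit by digit, so no substrings are created.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in, not vg's decompose_genotype_fast.
// Parses a VCF GT string such as "0|1" or "1/2" into allele indexes by
// scanning characters in place, without building any substrings.
// Assumption: a missing allele (".") is reported as -1.
std::vector<int> decompose_genotype_sketch(const std::string& genotype) {
    std::vector<int> alleles;
    int value = 0;            // allele index accumulated so far
    bool have_digits = false; // saw at least one digit for this allele
    bool missing = false;     // saw "." (or an unexpected character)

    // Emit the allele that ended at a separator or at the end of the string.
    auto flush = [&]() {
        alleles.push_back((missing || !have_digits) ? -1 : value);
        value = 0;
        have_digits = false;
        missing = false;
    };

    for (char c : genotype) {
        if (c == '|' || c == '/') {
            flush();
        } else if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            have_digits = true;
        } else {
            // "." or anything unexpected is treated as a missing allele.
            missing = true;
        }
    }
    if (!genotype.empty()) {
        flush();
    }
    return alleles;
}
```

Under these assumptions, `decompose_genotype_sketch("0|1")` yields the alleles 0 and 1, and `"./."` yields two missing alleles.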
tuple< vector< vcflib::Variant * >, vcflib::Variant *, vector< vcflib::Variant * > > vg::WindowedVcfBuffer::get | ( | ) |
Get the current variant in its context. Throws an exception if no variant is current. Returns a vector of variants in the window before the current variant, the current variant, and a vector of variants in the window after the current variant.
Pointers will be invalidated upon the next call to next() or set_region().
tuple< vector< vcflib::Variant * >, vcflib::Variant *, vector< vcflib::Variant * > > vg::WindowedVcfBuffer::get_nonoverlapping | ( | ) |
Like get(), but elides variants in the context that overlap the current variant, or each other.
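The eliding described above can be sketched on plain intervals. This is only an illustration under stated assumptions, not vg's code: `Span`, `spans_overlap`, and `drop_overlapping` are hypothetical names, intervals are taken as half-open and sorted by start, and when two context intervals overlap each other the earlier one is kept. The documentation does not say which of two mutually overlapping variants survives, so that policy is a guess.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open reference interval [start, end) covered by a variant.
struct Span {
    int64_t start;
    int64_t end;
};

// Two half-open intervals overlap when each starts before the other ends.
bool spans_overlap(const Span& a, const Span& b) {
    return a.start < b.end && b.start < a.end;
}

// Assumed policy (not confirmed by the vg documentation): walk the context in
// order, drop anything overlapping the current span, and drop anything
// overlapping the most recently kept span. With input sorted by start, the
// last kept span has the largest end, so checking it alone is sufficient.
std::vector<Span> drop_overlapping(const std::vector<Span>& context,
                                   const Span& current) {
    std::vector<Span> kept;
    for (const Span& s : context) {
        if (spans_overlap(s, current)) {
            continue;
        }
        if (!kept.empty() && spans_overlap(s, kept.back())) {
            continue;
        }
        kept.push_back(s);
    }
    return kept;
}
```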
const vector< vector< int > > & vg::WindowedVcfBuffer::get_parsed_genotypes | ( | vcflib::Variant * | variant | ) |
Given a pointer to a cached variant owned by this WindowedVcfBuffer (such as might be obtained from get()), return the cached parsed-out genotypes for all the samples, in the order the samples appear in the VCF file.
Returns a reference which is valid until the variant passed in is scrolled out of the buffer.
bool vg::WindowedVcfBuffer::has_tabix | ( | ) | const |
This returns true if we have a tabix index, and false otherwise. If this is false, set_region may be called, but will do nothing and return false.
bool vg::WindowedVcfBuffer::next | ( | ) |
Advance to the next variant, making it the current variant. Returns true if a next variant exists, and false if no next variant can be found. Must be called (and return true) before the first call to get() after constructing the WindowedVcfBuffer or setting the region.
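The next()/get() calling pattern can be shown with a toy stand-in that needs neither vg nor vcflib. Everything here is hypothetical: `ToyWindowedBuffer` holds sorted positions instead of variants, and it assumes the window is measured as a distance in positions, which the documentation above does not state. As in the real class, get() throws until next() has returned true.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <tuple>
#include <utility>
#include <vector>

// Toy stand-in for the look-around pattern; records are just sorted positions.
// Not vg's WindowedVcfBuffer. The window is assumed to be a position distance.
class ToyWindowedBuffer {
public:
    ToyWindowedBuffer(std::vector<long> sorted_positions, long window_size)
        : positions(std::move(sorted_positions)), window(window_size) {}

    // Advance to the next record; returns false once the input is exhausted.
    bool next() {
        if (!started) {
            started = true;
            index = 0;
        } else if (index < positions.size()) {
            ++index;
        }
        return index < positions.size();
    }

    // Returns (records before, current record, records after) within the window.
    std::tuple<std::vector<long>, long, std::vector<long>> get() const {
        if (!started || index >= positions.size()) {
            throw std::runtime_error("no current record");
        }
        long cur = positions[index];
        std::vector<long> before, after;
        for (std::size_t i = 0; i < index; ++i) {
            if (cur - positions[i] <= window) {
                before.push_back(positions[i]);
            }
        }
        for (std::size_t i = index + 1;
             i < positions.size() && positions[i] - cur <= window; ++i) {
            after.push_back(positions[i]);
        }
        return std::make_tuple(before, cur, after);
    }

private:
    std::vector<long> positions;
    long window;
    std::size_t index = 0;
    bool started = false;
};
```

A typical loop is `while (buf.next()) { auto ctx = buf.get(); ... }`, reading the three parts with `std::get<0>`, `std::get<1>`, and `std::get<2>`.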
WindowedVcfBuffer & vg::WindowedVcfBuffer::operator= | ( | const WindowedVcfBuffer & | other | ) | private, delete |
bool vg::WindowedVcfBuffer::set_region | ( | const string & | contig, int64_t | start = -1, int64_t | end = -1 | ) |
This tries to set the region on the underlying vcflib VariantCallFile to the given contig and region, if specified. Coordinates coming in should be 0-based, and will be converted to 1-based internally.
Returns true if the region was successfully set, and false otherwise (for example, if there is no tabix index, or if the given region is not part of this VCF). Note that if there is a tabix index and set_region returns false, the position in the VCF file is undefined until the next successful set_region call.
If either start or end is specified, then both must be specified.
Discards any variants previously in the buffer.
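The argument rules for set_region (the -1 defaults, the both-or-neither requirement, and the shift from 0-based to 1-based coordinates) can be sketched with a small helper. `make_region_string` is a hypothetical function, not part of vg or vcflib, and the `contig:start-end` output format is chosen only for illustration. It also assumes the incoming 0-based end is inclusive, so both endpoints shift by one; the documentation above does not state the end convention.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of vg or vcflib.
// Builds a 1-based "contig:start-end" region string from 0-based coordinates,
// where -1 means "not specified".
// Assumption: the 0-based end is inclusive, so both endpoints shift by one.
// Returns false (leaving `out` untouched) if only one of start/end is given
// or the range is invalid.
bool make_region_string(const std::string& contig, int64_t start, int64_t end,
                        std::string& out) {
    bool has_start = start != -1;
    bool has_end = end != -1;
    if (has_start != has_end) {
        return false; // both must be specified, or neither
    }
    if (!has_start) {
        out = contig; // whole contig
        return true;
    }
    if (start < 0 || end < start) {
        return false;
    }
    out = contig + ":" + std::to_string(start + 1) + "-" + std::to_string(end + 1);
    return true;
}
```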