vg
tools for working with variation graphs
vg::PoissonSupportSnarlCaller Class Reference

#include <snarl_caller.hpp>

Inheritance: vg::PoissonSupportSnarlCaller inherits vg::SupportBasedSnarlCaller, which inherits vg::SnarlCaller.

Classes

struct  PoissonCallInfo
 

Public Member Functions

 PoissonSupportSnarlCaller (const PathHandleGraph &graph, SnarlManager &snarl_manager, TraversalSupportFinder &support_finder, const algorithms::BinnedDepthIndex &depth_index, bool use_mapq)

virtual ~PoissonSupportSnarlCaller ()

void set_baseline_error (double small_variant_error, double large_variant_error)
 Set the baseline error rates for small and large variants.

void set_insertion_bias (double insertion_threshold, double small_insertion_bias, double large_insertion_bias)
 These are multipliers applied to the errors if the site has an insertion.

virtual pair< vector< int >, unique_ptr< CallInfo > > genotype (const Snarl &snarl, const vector< SnarlTraversal > &traversals, int ref_trav_idx, int ploidy, const string &ref_path_name, pair< size_t, size_t > ref_range)
 Get the genotype of a site.

virtual void update_vcf_info (const Snarl &snarl, const vector< SnarlTraversal > &traversals, const vector< int > &genotype, const unique_ptr< CallInfo > &call_info, const string &sample_name, vcflib::Variant &variant)
 Update INFO and FORMAT fields of the called variant.

virtual void update_vcf_header (string &header) const
 Define any header fields needed by the above.

- Public Member Functions inherited from vg::SupportBasedSnarlCaller
 SupportBasedSnarlCaller (const PathHandleGraph &graph, SnarlManager &snarl_manager, TraversalSupportFinder &support_finder)

virtual ~SupportBasedSnarlCaller ()

void set_min_supports (double min_mad_for_call, double min_support_for_call, double min_site_support)
 Set some of the parameters.

TraversalSupportFinder & get_support_finder () const
 Get the traversal support finder.

virtual int get_min_total_support_for_call () const
 Get the minimum total support for call.

virtual function< bool(const SnarlTraversal &, int iteration)> get_skip_allele_fn () const
 Use min_alt_path_support threshold as cutoff.

- Public Member Functions inherited from vg::SnarlCaller
virtual ~SnarlCaller ()

Protected Member Functions

double genotype_likelihood (const vector< int > &genotype, const vector< SnarlTraversal > &traversals, const set< int > &trav_subset, const vector< int > &traversal_sizes, const vector< double > &traversal_mapqs, int ref_trav_idx, double exp_depth, double depth_err, int max_trav_size, int ref_trav_size)
 
vector< int > rank_by_support (const vector< Support > &supports)
 Rank supports.
 

Protected Attributes

double baseline_error_small = 0.005
 Baseline error rate for smaller variants.

double baseline_error_large = 0.01
 Baseline error rate for larger variants.

double insertion_bias_large = 1.

double insertion_bias_small = 1.

double insertion_threshold = 5.

size_t top_k = 20
 Consider up to the top-k traversals (based on support) for genotyping.

size_t top_m = 100

double depth_padding_factor = 1.
 Padding to apply, relative to the longest traversal, to snarl ranges when looking up binned depth.

const algorithms::BinnedDepthIndex & depth_index
 Map path name to <mean, std_err> of depth coverage from the packer.

bool use_mapq
 MAPQ information is available from the packer and we want to use it.

- Protected Attributes inherited from vg::SupportBasedSnarlCaller
const PathHandleGraph & graph

SnarlManager & snarl_manager

TraversalSupportFinder & support_finder
 Get support from traversals.

int min_total_support_for_call = 2

size_t min_mad_for_filter = 1

size_t min_site_depth = 4

double min_alt_path_support = 0.5
 

Additional Inherited Members

- Static Protected Member Functions inherited from vg::SupportBasedSnarlCaller
static int get_best_support (const vector< Support > &supports, const vector< int > &skips)
 Get the best support out of a list of supports, ignoring skips.

static double support_val (const Support &support)
 Relic from old code.
 

Detailed Description

Find the genotype of some traversals in a site using read support and a Poisson model based on expected depth. Inspired, in part, by Paragraph, which uses a similar approach for genotyping breakpoints.

Constructor & Destructor Documentation

◆ PoissonSupportSnarlCaller()

vg::PoissonSupportSnarlCaller::PoissonSupportSnarlCaller ( const PathHandleGraph &  graph,
SnarlManager &  snarl_manager,
TraversalSupportFinder &  support_finder,
const algorithms::BinnedDepthIndex &  depth_index,
bool  use_mapq 
)

◆ ~PoissonSupportSnarlCaller()

vg::PoissonSupportSnarlCaller::~PoissonSupportSnarlCaller ( )
virtual

Member Function Documentation

◆ genotype()

pair< vector< int >, unique_ptr< SnarlCaller::CallInfo > > vg::PoissonSupportSnarlCaller::genotype ( const Snarl &  snarl,
const vector< SnarlTraversal > &  traversals,
int  ref_trav_idx,
int  ploidy,
const string &  ref_path_name,
pair< size_t, size_t >  ref_range 
)
virtual

Get the genotype of a site.

Implements vg::SnarlCaller.

◆ genotype_likelihood()

double vg::PoissonSupportSnarlCaller::genotype_likelihood ( const vector< int > &  genotype,
const vector< SnarlTraversal > &  traversals,
const set< int > &  trav_subset,
const vector< int > &  traversal_sizes,
const vector< double > &  traversal_mapqs,
int  ref_trav_idx,
double  exp_depth,
double  depth_err,
int  max_trav_size,
int  ref_trav_size 
)
protected

Compute the likelihood of a genotype as a product of Poisson probabilities: P[allele1] * P[allele2] * P[uncalled alleles]. Homozygous alleles are split into two, with half the support each. The natural logarithm of the likelihood is returned. If trav_subset is not empty, traversals outside that set (and the genotype) are ignored to save time.
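The product above can be illustrated with a minimal, standalone sketch. This is not the vg implementation: `poisson_log_prob`, `toy_genotype_log_likelihood` and the `error_rate` parameter are hypothetical names introduced here, the genotype is assumed to be diploid, and MAPQ, traversal sizes and `trav_subset` are ignored entirely.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Natural log of the Poisson probability mass P(X = k | lambda).
// std::lgamma(k + 1) equals log(k!) and also accepts fractional k,
// which is needed when homozygous support is halved.
double poisson_log_prob(double k, double lambda) {
    assert(k >= 0.0 && lambda > 0.0);
    return k * std::log(lambda) - lambda - std::lgamma(k + 1.0);
}

// Hypothetical, simplified log-likelihood of a diploid genotype.
//   genotype   : two traversal indices, e.g. {0, 1}
//   supports   : observed read support per traversal
//   exp_depth  : expected total read depth at the site
//   error_rate : assumed fraction of reads that land on uncalled alleles
double toy_genotype_log_likelihood(const std::vector<int>& genotype,
                                   const std::vector<double>& supports,
                                   double exp_depth,
                                   double error_rate) {
    assert(genotype.size() == 2);
    const double allele_depth = exp_depth / 2.0;  // expected depth per allele copy
    const bool homozygous = genotype[0] == genotype[1];
    std::vector<bool> called(supports.size(), false);
    double log_lik = 0.0;

    for (int allele : genotype) {
        assert(allele >= 0 && static_cast<std::size_t>(allele) < supports.size());
        called[allele] = true;
        // A homozygous allele is split into two copies with half the support each.
        const double support = homozygous ? supports[allele] / 2.0 : supports[allele];
        log_lik += poisson_log_prob(support, allele_depth);
    }

    // Support on every uncalled allele is pooled and modelled as error.
    double uncalled_support = 0.0;
    for (std::size_t i = 0; i < supports.size(); ++i) {
        if (!called[i]) {
            uncalled_support += supports[i];
        }
    }
    log_lik += poisson_log_prob(uncalled_support, error_rate * exp_depth);
    return log_lik;  // sum of logs == log of the product
}
```

Under these assumptions, a site with supports of 15, 14 and 1 and an expected depth of 30 scores higher for the heterozygous genotype {0, 1} than for the homozygous genotype {0, 0}, because the latter has to explain 15 reads on uncalled alleles as error.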

◆ rank_by_support()

vector< int > vg::PoissonSupportSnarlCaller::rank_by_support ( const vector< Support > &  supports)
protected

Rank supports.

◆ set_baseline_error()

void vg::PoissonSupportSnarlCaller::set_baseline_error ( double  small_variant_error,
double  large_variant_error 
)

Set the baseline error rates for small and large variants (see baseline_error_small and baseline_error_large).

◆ set_insertion_bias()

void vg::PoissonSupportSnarlCaller::set_insertion_bias ( double  insertion_threshold,
double  small_insertion_bias,
double  large_insertion_bias 
)

These are multipliers applied to the errors if the site has an insertion.

◆ update_vcf_header()

void vg::PoissonSupportSnarlCaller::update_vcf_header ( string &  header) const
virtual

Define any header fields needed by the above.

Implements vg::SnarlCaller.

◆ update_vcf_info()

void vg::PoissonSupportSnarlCaller::update_vcf_info ( const Snarl &  snarl,
const vector< SnarlTraversal > &  traversals,
const vector< int > &  genotype,
const unique_ptr< CallInfo > &  call_info,
const string &  sample_name,
vcflib::Variant &  variant 
)
virtual

Update INFO and FORMAT fields of the called variant.

Reimplemented from vg::SupportBasedSnarlCaller.

Member Data Documentation

◆ baseline_error_large

double vg::PoissonSupportSnarlCaller::baseline_error_large = 0.01
protected

Baseline error rate for larger variants.

◆ baseline_error_small

double vg::PoissonSupportSnarlCaller::baseline_error_small = 0.005
protected

Baseline error rate for smaller variants.

Error rates are different for small and large variants, which depend more on base and mapping qualities respectively. The switch threshold is in TraversalSupportFinder. Error stats from the Packer object get added to these baselines when computing the scores.
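As a rough illustration of how the two baselines might be combined with an observed error rate, consider the sketch below. The names `ErrorModel`, `site_error_rate`, `size_threshold` and `observed_error` are hypothetical; the real switch threshold lives in TraversalSupportFinder and its value is not given here, and the `>=` comparison is an assumption.

```cpp
#include <cassert>

// Hypothetical container mirroring the two documented defaults.
struct ErrorModel {
    double baseline_error_small = 0.005;  // smaller variants
    double baseline_error_large = 0.01;   // larger variants
};

// Pick the baseline by variant size, then add the error observed in the data.
//   max_trav_size  : length of the longest traversal at the site
//   size_threshold : assumed small/large switch point (placeholder value)
//   observed_error : assumed error statistic, e.g. as reported by the packer
double site_error_rate(const ErrorModel& model,
                       int max_trav_size,
                       int size_threshold,
                       double observed_error) {
    assert(max_trav_size >= 0 && size_threshold >= 0 && observed_error >= 0.0);
    const double baseline = (max_trav_size >= size_threshold)
                                ? model.baseline_error_large
                                : model.baseline_error_small;
    return baseline + observed_error;
}
```

With a placeholder threshold of 50, a 10 bp site would start from the small-variant baseline and a 100 bp site from the large-variant one.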

◆ depth_index

const algorithms::BinnedDepthIndex& vg::PoissonSupportSnarlCaller::depth_index
protected

Map path name to <mean, std_err> of depth coverage from the packer.

◆ depth_padding_factor

double vg::PoissonSupportSnarlCaller::depth_padding_factor = 1.
protected

Padding to apply, relative to the longest traversal, to snarl ranges when looking up binned depth.
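One plausible reading of this padding is sketched below. It is an assumption, not the vg code: the hypothetical helper `pad_ref_range` widens a reference range on both sides by `depth_padding_factor` times the longest traversal length, clamping the start at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: widen [first, second] by factor * longest_trav on each side.
std::pair<std::size_t, std::size_t> pad_ref_range(std::pair<std::size_t, std::size_t> ref_range,
                                                  std::size_t longest_trav,
                                                  double depth_padding_factor) {
    assert(depth_padding_factor >= 0.0);
    assert(ref_range.first <= ref_range.second);
    const std::size_t pad =
        static_cast<std::size_t>(depth_padding_factor * static_cast<double>(longest_trav));
    // Offsets are unsigned, so clamp instead of subtracting below zero.
    const std::size_t start = ref_range.first > pad ? ref_range.first - pad : 0;
    return {start, ref_range.second + pad};
}
```

With the default factor of 1, a range of 100 to 120 and a longest traversal of 30 would be looked up as 70 to 150 under this reading.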

◆ insertion_bias_large

double vg::PoissonSupportSnarlCaller::insertion_bias_large = 1.
protected

Multiply the error by this much in the presence of an insertion. After some testing, this does not in fact seem to help much in practice; it is better to boost the overall baseline error instead. Hence it is not exposed in the CLI and is off by default.

◆ insertion_bias_small

double vg::PoissonSupportSnarlCaller::insertion_bias_small = 1.
protected

◆ insertion_threshold

double vg::PoissonSupportSnarlCaller::insertion_threshold = 5.
protected

A site is an insertion if one (supported) allele is this many times bigger than another. Unlike the values above, the default comes from call_main.cpp (TODO: straighten this out?).
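The size-ratio test can be sketched as follows. This is an illustration under stated assumptions rather than the vg logic: `looks_like_insertion` is a hypothetical name, "supported" is taken to mean support greater than zero, and a zero-length allele is treated as length 1 so the ratio stays defined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: is some supported allele at least `insertion_threshold`
// times bigger than another supported allele?
bool looks_like_insertion(const std::vector<int>& traversal_sizes,
                          const std::vector<double>& supports,
                          double insertion_threshold) {
    assert(traversal_sizes.size() == supports.size());
    int min_size = 0;
    int max_size = 0;
    std::size_t supported = 0;
    for (std::size_t i = 0; i < supports.size(); ++i) {
        if (supports[i] <= 0.0) {
            continue;  // unsupported alleles do not count
        }
        if (supported == 0) {
            min_size = max_size = traversal_sizes[i];
        } else {
            min_size = std::min(min_size, traversal_sizes[i]);
            max_size = std::max(max_size, traversal_sizes[i]);
        }
        ++supported;
    }
    if (supported < 2) {
        return false;  // need two supported alleles to compare
    }
    // Treat a zero-length allele as length 1 (an assumption of this sketch).
    return max_size >= insertion_threshold * std::max(min_size, 1);
}
```

With the default threshold of 5, supported alleles of 10 bp and 60 bp would qualify, while 10 bp and 40 bp would not.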

◆ top_k

size_t vg::PoissonSupportSnarlCaller::top_k = 20
protected

Consider up to the top-k traversals (based on support) for genotyping.

◆ top_m

size_t vg::PoissonSupportSnarlCaller::top_m = 100
protected

Consider up to the top-m secondary traversals (based on support) for each top traversal (so at most top_k * top_m genotypes are considered).
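The bound of top_k * top_m candidates can be seen in a small sketch. The functions `rank_indices_by_support` and `candidate_genotypes` are hypothetical, supports are simplified to plain doubles, and secondary traversals are assumed to be ranked by the same raw support as primary ones, which may differ from what vg does. Unordered duplicates such as (0, 1) and (1, 0) are not removed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Return traversal indices ordered from highest to lowest support.
std::vector<int> rank_indices_by_support(const std::vector<double>& supports) {
    std::vector<int> ranked(supports.size());
    std::iota(ranked.begin(), ranked.end(), 0);
    // stable_sort keeps the original order among ties.
    std::stable_sort(ranked.begin(), ranked.end(),
                     [&supports](int a, int b) { return supports[a] > supports[b]; });
    return ranked;
}

// Enumerate diploid candidates: each of the top_k traversals paired with
// each of the top_m traversals, so at most top_k * top_m pairs.
std::vector<std::pair<int, int>> candidate_genotypes(const std::vector<double>& supports,
                                                     std::size_t top_k,
                                                     std::size_t top_m) {
    const std::vector<int> ranked = rank_indices_by_support(supports);
    const std::size_t k = std::min(top_k, ranked.size());
    const std::size_t m = std::min(top_m, ranked.size());
    std::vector<std::pair<int, int>> candidates;
    candidates.reserve(k * m);
    for (std::size_t i = 0; i < k; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            candidates.emplace_back(ranked[i], ranked[j]);
        }
    }
    assert(candidates.size() <= top_k * top_m);
    return candidates;
}
```

Capping both loops keeps the search bounded at sites with many traversals, at the cost of never evaluating genotypes made only of poorly supported alleles.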

◆ use_mapq

bool vg::PoissonSupportSnarlCaller::use_mapq
protected

MAPQ information is available from the packer and we want to use it.


The documentation for this class was generated from the following files: