vg
tools for working with variation graphs
vg::ZipCodeTree Class Reference

#include <zip_code_tree.hpp>

Classes

class  distance_iterator
 
struct  oriented_seed_t
 
class  seed_iterator
 
struct  seed_result_t
 
struct  tree_item_t
 One item in the zip code tree, representing a node or edge of the tree. More...
 

Public Types

enum  tree_item_type_t {
  SEED =0, CHAIN_START, CHAIN_END, EDGE,
  CHAIN_COUNT, SNARL_START, SNARL_END
}
 The type of an item in the zip code tree. More...
 

Public Member Functions

 ZipCodeTree ()
 
size_t get_tree_size () const
 Get the number of items in the tree. More...
 
tree_item_t get_item_at_index (size_t index) const
 Access the value at [index] in the zip_code_tree. More...
 
vector< oriented_seed_t > get_all_seeds () const
 
size_t get_offset_to_seed (size_t &i, bool right_to_left) const
 
void add_close_bound (size_t start_index)
 
seed_iterator begin () const
 Get an iterator over indexes of seeds in the tree, left to right. More...
 
seed_iterator end () const
 
distance_iterator find_distances (const seed_iterator &from, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 Get an iterator starting from where a forward iterator is, up to a distance limit. More...
 
void print_self (const vector< Seed > *seeds) const
 
bool node_is_invalid (nid_t id, const SnarlDistanceIndex &distance_index, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
void validate_zip_tree (const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
void validate_boundaries (const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
void validate_zip_tree_order (const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds) const
 
void validate_seed_distances (const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
void validate_snarl (std::vector< tree_item_t >::const_iterator &zip_iterator, const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
void validate_chain (std::vector< tree_item_t >::const_iterator &zip_iterator, const SnarlDistanceIndex &distance_index, const vector< Seed > *seeds, size_t distance_limit=std::numeric_limits< size_t >::max()) const
 
std::pair< size_t, size_t > dag_and_cyclic_snarl_count () const
 

Static Protected Member Functions

static bool seed_is_reversed_at_depth (const Seed &seed, size_t depth, const SnarlDistanceIndex &distance_index)
 

Protected Attributes

vector< tree_item_t > zip_code_tree
 The actual tree structure. More...
 

Private Types

typedef SnarlDistanceIndexClusterer::Seed Seed
 

Friends

class ZipCodeForest
 

Member Typedef Documentation

◆ Seed

Convenient alias for SnarlDistanceIndexClusterer::Seed. Despite the name, these are used for graph positions, so they act more like minimizers.

Member Enumeration Documentation

◆ tree_item_type_t

The type of an item in the zip code tree.

Enumerator
SEED 
CHAIN_START 
CHAIN_END 
EDGE 
CHAIN_COUNT 
SNARL_START 
SNARL_END 

Constructor & Destructor Documentation

◆ ZipCodeTree()

vg::ZipCodeTree::ZipCodeTree ( )
inline

Empty constructor. ZipCodeTrees get filled in by ZipCodeForests.

Member Function Documentation

◆ add_close_bound()

void vg::ZipCodeTree::add_close_bound ( size_t  start_index)

Adds a snarl or chain end of the matching type and sets up the section_length values.

◆ begin()

seed_iterator vg::ZipCodeTree::begin ( ) const
inline

Get an iterator over indexes of seeds in the tree, left to right.

◆ dag_and_cyclic_snarl_count()

std::pair< size_t, size_t > vg::ZipCodeTree::dag_and_cyclic_snarl_count ( ) const

Count the number of snarls involved in the tree. Returns a pair of <dag count, cyclic count>. Assumes that the tree has already been filled in.
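The counting described above can be illustrated with a minimal, self-contained sketch. The `Item` struct, its `is_cyclic` flag, and the function name `count_snarls` are hypothetical stand-ins invented for this example; they are not vg's actual `tree_item_t` layout or API.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for tree_item_t; not vg's actual layout.
enum class ItemType { SEED, CHAIN_START, CHAIN_END, EDGE, CHAIN_COUNT, SNARL_START, SNARL_END };

struct Item {
    ItemType type;
    bool is_cyclic;  // assumed flag; only meaningful on snarl bounds
};

// Returns <dag count, cyclic count>, counting each snarl once at its start.
std::pair<std::size_t, std::size_t> count_snarls(const std::vector<Item>& items) {
    std::size_t dag = 0;
    std::size_t cyclic = 0;
    for (const Item& item : items) {
        if (item.type != ItemType::SNARL_START) {
            continue;  // only snarl starts are counted, so ends do not double-count
        }
        if (item.is_cyclic) {
            ++cyclic;
        } else {
            ++dag;
        }
    }
    return {dag, cyclic};
}
```

Counting only the start items is one simple way to visit each snarl exactly once; the real implementation may do this differently.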

◆ end()

seed_iterator vg::ZipCodeTree::end ( ) const
inline

Get the end iterator for seeds in the tree, left to right. (Note that the last element will never be a seed)
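To illustrate what "an iterator over indexes of seeds, left to right" amounts to, here is a rough sketch that collects those indexes from a flat item vector. It uses a hypothetical `ItemType` enum mirroring `tree_item_type_t` and a made-up helper `seed_indexes`; the real `seed_iterator` carries more state than this.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mirror of tree_item_type_t, for illustration only.
enum class ItemType { SEED, CHAIN_START, CHAIN_END, EDGE, CHAIN_COUNT, SNARL_START, SNARL_END };

// Collect the indexes of all SEED items, scanning left to right.
std::vector<std::size_t> seed_indexes(const std::vector<ItemType>& items) {
    std::vector<std::size_t> indexes;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i] == ItemType::SEED) {
            indexes.push_back(i);
        }
    }
    return indexes;
}
```

Because a well-formed tree ends with a closing bound rather than a seed, the last collected index is always smaller than the index of the final item.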

◆ find_distances()

auto vg::ZipCodeTree::find_distances ( const seed_iterator &  from,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Get an iterator starting from where a forward iterator is, up to a distance limit.

◆ get_all_seeds()

vector<oriented_seed_t> vg::ZipCodeTree::get_all_seeds ( ) const
inline

Get all the seeds in the tree, in left-to-right order, along with their orientations. Basically seed_itr, but without all the extra baggage.

◆ get_item_at_index()

tree_item_t vg::ZipCodeTree::get_item_at_index ( size_t  index) const
inline

Access the value at [index] in the zip_code_tree.

◆ get_offset_to_seed()

size_t vg::ZipCodeTree::get_offset_to_seed ( size_t &  i,
bool  right_to_left 
) const

Helper for add_distance_matrix(). Essentially re-calculates chain.distances.first/second for seeds which are inside nested snarls.

If the seed is in a nested snarl, this is the distance to the snarl edge; otherwise it is 0. Moves i along to find the seed, and returns the offset. If right_to_left is true, searches leftward for the last seed; otherwise, searches rightward for the first seed.

◆ get_tree_size()

size_t vg::ZipCodeTree::get_tree_size ( ) const
inline

Get the number of items in the tree.

◆ node_is_invalid()

bool vg::ZipCodeTree::node_is_invalid ( nid_t  id,
const SnarlDistanceIndex &  distance_index,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Is the given node in a multicomponent chain, looping chain, or anything else that would cause it to not have exact distances? The distances are only guaranteed to be correct up to the distance limit. Cyclic snarls don't count as being invalid.

◆ print_self()

void vg::ZipCodeTree::print_self ( const vector< Seed > *  seeds) const

Print the zip code tree to stderr. ( and ) are used for the starts and ends of DAG snarls, { and } for the starts and ends of cyclic snarls, and [ and ] for the starts and ends of chains. Seeds are printed as their positions.
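The bracket notation can be sketched as follows. This is an illustration under stated assumptions, not the actual print_self() code: `Item`, its `value` and `is_cyclic` fields, and `render_brackets` are hypothetical; seeds are rendered as a plain number rather than a graph position; and EDGE and CHAIN_COUNT items are skipped because the description above does not say how they are printed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for tree_item_t; not vg's actual layout.
enum class ItemType { SEED, CHAIN_START, CHAIN_END, EDGE, CHAIN_COUNT, SNARL_START, SNARL_END };

struct Item {
    ItemType type;
    std::size_t value;  // assumed: a label for SEED items; ignored for other types here
    bool is_cyclic;     // assumed flag; only meaningful on snarl bounds
};

// Render the tree as a string using the bracket conventions described above.
std::string render_brackets(const std::vector<Item>& items) {
    std::string out;
    for (const Item& item : items) {
        switch (item.type) {
            case ItemType::SNARL_START: out += item.is_cyclic ? '{' : '('; break;
            case ItemType::SNARL_END:   out += item.is_cyclic ? '}' : ')'; break;
            case ItemType::CHAIN_START: out += '['; break;
            case ItemType::CHAIN_END:   out += ']'; break;
            case ItemType::SEED:
                out += std::to_string(item.value);
                out += ' ';
                break;
            default:
                break;  // EDGE and CHAIN_COUNT are omitted in this sketch
        }
    }
    return out;
}
```

Returning a string instead of writing to stderr keeps the sketch easy to check; a caller could pass the result to `std::cerr` to approximate the documented behavior.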

◆ seed_is_reversed_at_depth()

bool vg::ZipCodeTree::seed_is_reversed_at_depth ( const Seed &  seed,
size_t  depth,
const SnarlDistanceIndex &  distance_index 
)
static protected

Helper function to get the orientation of a snarl tree node at a given depth. Does the same thing as the zipcode decoder's get_is_reversed_in_parent, except that it also considers chains that are children of irregular snarls.

We assume that all snarls are DAGs, so all children of snarls must only be traversable in one orientation through the snarl. This assumption doesn't work for cyclic snarls, but as their chains are traversed in both directions, their storage orientation doesn't matter.

In a start-to-end traversal of a snarl, each node will only be traversable start-to-end or end-to-start. If traversable end-to-start, then it is considered to be oriented backwards in its parent.

◆ validate_boundaries()

void vg::ZipCodeTree::validate_boundaries ( const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Helper function for validate_zip_tree() to check snarl/chain boundaries. Ensures that all boundaries are matched in type, and that pair indexes are set up correctly. Also checks that there is at least one seed in the tree. Calls validate_snarl() for each snarl in the top-level chain.
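The type-matching part of this check can be sketched with a stack. This is only an illustration: the `ItemType` enum is a hypothetical mirror of `tree_item_type_t`, `boundaries_match` is an invented name, and the sketch does not verify pair indexes or recurse into validate_snarl() as the real function is described as doing.

```cpp
#include <vector>

// Hypothetical mirror of tree_item_type_t, for illustration only.
enum class ItemType { SEED, CHAIN_START, CHAIN_END, EDGE, CHAIN_COUNT, SNARL_START, SNARL_END };

// True if every chain/snarl start is closed by an end of the same kind,
// in properly nested order, and the tree contains at least one seed.
bool boundaries_match(const std::vector<ItemType>& items) {
    std::vector<ItemType> open;  // stack of currently open starts
    bool saw_seed = false;
    for (ItemType type : items) {
        switch (type) {
            case ItemType::CHAIN_START:
            case ItemType::SNARL_START:
                open.push_back(type);
                break;
            case ItemType::CHAIN_END:
                if (open.empty() || open.back() != ItemType::CHAIN_START) return false;
                open.pop_back();
                break;
            case ItemType::SNARL_END:
                if (open.empty() || open.back() != ItemType::SNARL_START) return false;
                open.pop_back();
                break;
            case ItemType::SEED:
                saw_seed = true;
                break;
            default:
                break;  // EDGE and CHAIN_COUNT do not affect nesting
        }
    }
    return open.empty() && saw_seed;
}
```

Returning a bool is a choice made for this sketch; the documented validate_boundaries() returns void.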

◆ validate_chain()

void vg::ZipCodeTree::validate_chain ( std::vector< tree_item_t >::const_iterator &  zip_iterator,
const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Helper function for validate_snarl() for a chain. zip_iterator is an iterator to the chain start. At the end of the function, zip_iterator will be set to the chain end.

◆ validate_seed_distances()

void vg::ZipCodeTree::validate_seed_distances ( const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Helper function for validate_zip_tree() to check distance iteration. Uses the same iterator logic that the main chaining code does and, for each pair of seeds output by the distance_iterator, compares their distance to the distance index.

◆ validate_snarl()

void vg::ZipCodeTree::validate_snarl ( std::vector< tree_item_t >::const_iterator &  zip_iterator,
const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Helper function for validate_zip_tree() for just a snarl. zip_iterator is an iterator to the snarl start. At the end of the function, zip_iterator will be set to the snarl end.

◆ validate_zip_tree()

void vg::ZipCodeTree::validate_zip_tree ( const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds,
size_t  distance_limit = std::numeric_limits<size_t>::max() 
) const

Check that the tree is correct:

  1. All snarl/chain boundaries are closed properly
  2. The order of the items is logical
  3. The distances between seeds (as output by iteration) are correct

◆ validate_zip_tree_order()

void vg::ZipCodeTree::validate_zip_tree_order ( const SnarlDistanceIndex &  distance_index,
const vector< Seed > *  seeds 
) const

Helper function for validate_zip_tree() to check for a well-formed order

  1. Do seeds have logical orientations relative to each other?
  2. Do chains follow a [child, dist, child, dist, ... child] order?
  3. Are there CHAIN_COUNTs right after each SNARL_START?
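The third check in the list above is simple enough to sketch directly. The `ItemType` enum is a hypothetical mirror of `tree_item_type_t` and `chain_count_follows_snarl_start` is an invented name; the orientation and chain-order checks (items 1 and 2) depend on the distance index and are not attempted here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mirror of tree_item_type_t, for illustration only.
enum class ItemType { SEED, CHAIN_START, CHAIN_END, EDGE, CHAIN_COUNT, SNARL_START, SNARL_END };

// True if every SNARL_START is immediately followed by a CHAIN_COUNT item.
bool chain_count_follows_snarl_start(const std::vector<ItemType>& items) {
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (items[i] != ItemType::SNARL_START) {
            continue;
        }
        // A SNARL_START at the very end has no successor, so it also fails.
        if (i + 1 >= items.size() || items[i + 1] != ItemType::CHAIN_COUNT) {
            return false;
        }
    }
    return true;
}
```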

Friends And Related Function Documentation

◆ ZipCodeForest

friend class ZipCodeForest
friend

Member Data Documentation

◆ zip_code_tree

vector<tree_item_t> vg::ZipCodeTree::zip_code_tree
protected

The actual tree structure.


The documentation for this class was generated from the following files: