vg
tools for working with variation graphs
Public Member Functions | Public Attributes | List of all members
vg::ZipCodeForest::forest_growing_state_t Struct Reference

This stores information about the state of the forest as we fill it in. More...

Public Member Functions

 forest_growing_state_t (const vector< Seed > &seeds, const SnarlDistanceIndex &distance_index, size_t distance_limit)
 Constructor given seeds and a distance index. More...
 

Public Attributes

const vector< Seed > * seeds
 Seeds which are to be put in the forest. More...
 
const SnarlDistanceIndex * distance_index
 Distance index for the graph being represented. More...
 
vector< size_t > seed_sort_order
 Sort order for the seeds. More...
 
vector< sort_value_tsort_values_by_seed
 
vector< vector< child_info_t > > sibling_indices_at_depth
 
size_t active_tree_index
 
vector< pair< size_t, bool > > open_chains
 
forward_list< interval_state_tintervals_to_process
 
vector< interval_state_topen_intervals
 
size_t distance_limit
 The overall distance limit for splitting of new connected components. More...
 

Detailed Description

This stores information about the state of the forest as we fill it in.

Constructor & Destructor Documentation

◆ forest_growing_state_t()

vg::ZipCodeForest::forest_growing_state_t::forest_growing_state_t ( const vector< Seed > &  seeds,
const SnarlDistanceIndex &  distance_index,
size_t  distance_limit 
)
inline

Constructor given seeds and a distance index.

Member Data Documentation

◆ active_tree_index

size_t vg::ZipCodeForest::forest_growing_state_t::active_tree_index

We build a forest of trees. This is an index into trees to indicate which is actively being worked on.

A new tree is formed either when a new top-level chain is found (or a slice of a top-level chain if far enough from previous thing), or when part of a chain in a snarl is too far from everything else. In the second case, the entire subtree is found before determining that it should be a subtree, and is copied into a new ZipCodeTree. So only one tree is actively being added to at a time.

Note that this can't be an actual pointer to the forest because the address may move if the vectors get shifted around in memory.

◆ distance_index

const SnarlDistanceIndex* vg::ZipCodeForest::forest_growing_state_t::distance_index

Distance index for the graph being represented.

◆ distance_limit

size_t vg::ZipCodeForest::forest_growing_state_t::distance_limit

The overall distance limit for splitting of new connected components.

◆ intervals_to_process

forward_list<interval_state_t> vg::ZipCodeForest::forest_growing_state_t::intervals_to_process

A stack of intervals representing snarl tree nodes. These are yet to be sorted and added to the zip tree. After an interval is popped, child intervals are added in The stack structure ensures a depth-first processing order

◆ open_chains

vector<pair<size_t, bool> > vg::ZipCodeForest::forest_growing_state_t::open_chains

If part of a chain is unreachable with the rest of the chain, then we want to split it off into a separate zipcode tree. This tracks all open chains as an index to the start of the chain in the current active tree, and a boolean for if the chain start is farther than distance_limit from anything else in the snarl tree.

If the index is for a CHAIN_START, it includes the whole chain. If it points to a SEED/SNARL_START, then it is a slice.

Any time something gets added to a chain or the chain is closed, check if the distance to anything following is >distance_limit. If it is, copy from the start of the chain/slice into a new tree.

◆ open_intervals

vector<interval_state_t> vg::ZipCodeForest::forest_growing_state_t::open_intervals

Intervals that are currently open. These represent ancestors of whatever is currently being worked on. So the size is the depth of the snarl tree

◆ seed_sort_order

vector<size_t> vg::ZipCodeForest::forest_growing_state_t::seed_sort_order

Sort order for the seeds.

◆ seeds

const vector<Seed>* vg::ZipCodeForest::forest_growing_state_t::seeds

Seeds which are to be put in the forest.

◆ sibling_indices_at_depth

vector<vector<child_info_t> > vg::ZipCodeForest::forest_growing_state_t::sibling_indices_at_depth

Stores the previous things of the current structure at each depth The children are stored at the depth of their parents. For example, for a root chain, the vector at index 0 would have the chain start, seeds on the chain, and snarl starts on the chain. Similarly, for a top-level snarl, at depth 1, the second vector would contain the starts of chains at depth 2

◆ sort_values_by_seed

vector<sort_value_t> vg::ZipCodeForest::forest_growing_state_t::sort_values_by_seed

This stores the sort value and code type of each seed This will change as forest building progresses but it will be set for the relevant seed immediately before sorting The values also get used to calculate distance, as long as they have been set for the correct depth


The documentation for this struct was generated from the following file: