vg
tools for working with variation graphs
|
This stores information about the state of the forest as we fill it in. More...
Public Member Functions | |
forest_growing_state_t (const vector< Seed > &seeds, const SnarlDistanceIndex &distance_index, size_t distance_limit) | |
Constructor given seeds and a distance index. More... | |
Public Attributes | |
const vector< Seed > * | seeds |
Seeds which are to be put in the forest. More... | |
const SnarlDistanceIndex * | distance_index |
Distance index for the graph being represented. More... | |
vector< size_t > | seed_sort_order |
Sort order for the seeds. More... | |
vector< sort_value_t > | sort_values_by_seed |
vector< vector< child_info_t > > | sibling_indices_at_depth |
size_t | active_tree_index |
vector< pair< size_t, bool > > | open_chains |
forward_list< interval_state_t > | intervals_to_process |
vector< interval_state_t > | open_intervals |
size_t | distance_limit |
The overall distance limit for splitting of new connected components. More... | |
This stores information about the state of the forest as we fill it in.
|
inline |
Constructor given seeds and a distance index.
size_t vg::ZipCodeForest::forest_growing_state_t::active_tree_index |
We build a forest of trees. This is an index into trees to indicate which is actively being worked on.
A new tree is formed either when a new top-level chain is found (or a slice of a top-level chain if far enough from previous thing), or when part of a chain in a snarl is too far from everything else. In the second case, the entire subtree is found before determining that it should be a subtree, and is copied into a new ZipCodeTree. So only one tree is actively being added to at a time.
Note that this can't be an actual pointer to the forest because the address may move if the vectors get shifted around in memory.
const SnarlDistanceIndex* vg::ZipCodeForest::forest_growing_state_t::distance_index |
Distance index for the graph being represented.
size_t vg::ZipCodeForest::forest_growing_state_t::distance_limit |
The overall distance limit for splitting of new connected components.
forward_list<interval_state_t> vg::ZipCodeForest::forest_growing_state_t::intervals_to_process |
A stack of intervals representing snarl tree nodes. These are yet to be sorted and added to the zip tree. After an interval is popped, child intervals are added in The stack structure ensures a depth-first processing order
vector<pair<size_t, bool> > vg::ZipCodeForest::forest_growing_state_t::open_chains |
If part of a chain is unreachable with the rest of the chain, then we want to split it off into a separate zipcode tree. This tracks all open chains as an index to the start of the chain in the current active tree, and a boolean for if the chain start is farther than distance_limit from anything else in the snarl tree.
If the index is for a CHAIN_START, it includes the whole chain. If it points to a SEED/SNARL_START, then it is a slice.
Any time something gets added to a chain or the chain is closed, check if the distance to anything following is >distance_limit. If it is, copy from the start of the chain/slice into a new tree.
vector<interval_state_t> vg::ZipCodeForest::forest_growing_state_t::open_intervals |
Intervals that are currently open. These represent ancestors of whatever is currently being worked on. So the size is the depth of the snarl tree
vector<size_t> vg::ZipCodeForest::forest_growing_state_t::seed_sort_order |
Sort order for the seeds.
const vector<Seed>* vg::ZipCodeForest::forest_growing_state_t::seeds |
Seeds which are to be put in the forest.
vector<vector<child_info_t> > vg::ZipCodeForest::forest_growing_state_t::sibling_indices_at_depth |
Stores the previous things of the current structure at each depth The children are stored at the depth of their parents. For example, for a root chain, the vector at index 0 would have the chain start, seeds on the chain, and snarl starts on the chain. Similarly, for a top-level snarl, at depth 1, the second vector would contain the starts of chains at depth 2
vector<sort_value_t> vg::ZipCodeForest::forest_growing_state_t::sort_values_by_seed |
This stores the sort value and code type of each seed This will change as forest building progresses but it will be set for the relevant seed immediately before sorting The values also get used to calculate distance, as long as they have been set for the correct depth