vg
tools for working with variation graphs
|
#include <funnel.hpp>
Classes | |
struct | FilterPerformance |
struct | Item |
Represents an Item whose provenance we track. More... | |
struct | PaintableSpace |
struct | Stage |
Represents a Stage which is a series of Items, which track their own provenance. More... | |
Public Types | |
enum | State { State::NONE = 0, State::PLACED = 1, State::CORRECT = 2 } |
We can tag items as having one of these states. More... | |
Public Member Functions | |
void | start (const string &name) |
void | stop () |
void | stage (const string &name) |
void | stage_stop () |
Stop the current stage. More... | |
void | substage (const string &name) |
void | substage_stop () |
Stop the current substage. More... | |
void | processing_input (size_t prev_stage_item) |
Start processing the given item coming from the previous stage. More... | |
void | processed_input () |
Stop processing an item from the previous stage. More... | |
void | producing_output (size_t item) |
Start producing the given output item, whether it has been projected yet or not. More... | |
void | produced_output () |
Stop producing an output item. More... | |
void | introduce (size_t count=1) |
Introduce the given number of new items, starting their own lines of provenance (default 1). More... | |
void | expand (size_t prev_stage_item, size_t count) |
Expand the given item from the previous stage into the given number of new items at this stage. More... | |
template<typename Iterator > | |
void | merge_group (Iterator prev_stage_items_begin, Iterator prev_stage_items_end) |
template<typename Iterator > | |
void | merge_groups (Iterator prev_stage_items_begin, Iterator prev_stage_items_end) |
template<typename Iterator > | |
void | merge (Iterator prev_stage_items_begin, Iterator prev_stage_items_end) |
template<typename Iterator > | |
void | also_merge_group (Iterator prev_stage_items_begin, Iterator prev_stage_items_end) |
template<typename Iterator > | |
void | also_merge_group (size_t earlier_stage_lookback, Iterator earlier_stage_items_begin, Iterator earlier_stage_items_end) |
void | also_relevant (size_t earlier_stage_lookback, size_t earlier_stage_item) |
void | project (size_t prev_stage_item) |
Project a single item from the previous stage to a single non-group item at this stage. More... | |
void | project_group (size_t prev_stage_item, size_t group_size) |
Project a single item from the previous stage to a new group item at the current stage, with the given size. More... | |
void | fail (const char *filter, size_t prev_stage_item, double statistic=nan("")) |
void | pass (const char *filter, size_t prev_stage_item, double statistic=nan("")) |
void | score (size_t item, double score) |
Assign the given score to the given item at the current stage. More... | |
void | tag (size_t item, State state, size_t tag_start=0, size_t tag_length=std::numeric_limits< size_t >::max()) |
void | tag_correct (size_t item, size_t tag_start=0, size_t tag_length=std::numeric_limits< size_t >::max()) |
bool | is_correct (size_t item) const |
bool | was_correct (size_t prev_stage_item) const |
bool | was_correct (size_t prev_stage_index, const string &prev_stage_name, size_t prev_stage_item) const |
string | last_correct_stage (size_t tag_start=0, size_t tag_length=std::numeric_limits< size_t >::max()) const |
string | last_tagged_stage (State tag, size_t tag_start=0, size_t tag_length=std::numeric_limits< size_t >::max()) const |
size_t | latest () const |
Get the index of the most recent item created in the current stage. More... | |
void | for_each_stage (const function< void(const string &, const vector< size_t > &, const double &)> &callback) const |
void | for_each_filter (const function< void(const string &, const string &, const FilterPerformance &, const FilterPerformance &, const vector< double > &, const vector< double > &)> &callback) const |
void | to_dot (ostream &out) const |
void | annotate_mapped_alignment (Alignment &aln, bool annotate_correctness) const |
Protected Types | |
using | clock = std::chrono::high_resolution_clock |
Pick a clock to use for measuring stage duration. More... | |
using | time_point = clock::time_point |
And a type to represent stage transition times. More... | |
Protected Member Functions | |
Item & | get_item (size_t index) |
size_t | create_item () |
Protected Attributes | |
string | funnel_name |
What's the name of the funnel we start()-ed? Will be empty if nothing is running. More... | |
time_point | start_time |
At what time did we start()? More... | |
time_point | stop_time |
At what time did we stop()? More... | |
string | stage_name |
What's the name of the current stage? Will be empty if no stage is running. More... | |
time_point | stage_start_time |
At what time did the stage start? More... | |
string | substage_name |
What's the name of the current substage? Will be empty if no substage is running. More... | |
size_t | input_in_progress = numeric_limits<size_t>::max() |
size_t | output_in_progress = numeric_limits<size_t>::max() |
vector< Stage > | stages |
Represents a record of an invocation of a pipeline for an input.
Tracks the history of "lines" of data "item" provenance through a series of "stages", containing a series of "filters".
Lines are "introduced", and "project" from earlier stages to later stages, possibly "expanding" or "merging", until they "fail" a filter or reach the final stage. At each stage, items occur in a linear order and are identified by index.
An item may be a "group", with a certain size.
We also can assign "scores" or correctness/placed-ness "tags" to items at a stage. Tags can cover a region of a linear read space.
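The lifecycle described above can be pictured with a small stand-in. The sketch below is not vg's implementation: `ToyFunnel`, `ToyItem`, `ToyStage`, and the stage and filter names ("seed", "extend", "min-score") are hypothetical, and only a subset of the documented calls is mimicked. It is meant only to show items being identified by index within a stage, and a line of provenance ending when an item fails a filter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration only; not the vg::Funnel implementation.
struct ToyItem {
    std::vector<std::size_t> prev_items; // Provenance: indexes in the previous stage.
};

struct ToyStage {
    std::string name;
    std::vector<ToyItem> items;      // Items in linear order, identified by index.
    std::vector<std::size_t> failed; // Previous-stage items that failed a filter here.
};

class ToyFunnel {
public:
    void start(const std::string& name) {
        assert(!name.empty()); // Name must not be empty.
        funnel_name = name;
        stages.clear();
    }
    void stage(const std::string& name) {
        assert(!name.empty()); // Name must not be empty.
        stages.push_back(ToyStage{name, {}, {}});
    }
    // Introduce new items that start their own lines of provenance.
    void introduce(std::size_t count = 1) {
        for (std::size_t i = 0; i < count; i++) {
            stages.back().items.emplace_back();
        }
    }
    // Project one previous-stage item to one new item at the current stage.
    void project(std::size_t prev_stage_item) {
        stages.back().items.push_back(ToyItem{{prev_stage_item}});
    }
    // Record that a previous-stage item failed a filter and goes no further.
    void fail(const char* filter, std::size_t prev_stage_item) {
        (void) filter; // The toy does not keep the filter name.
        stages.back().failed.push_back(prev_stage_item);
    }
    // Index of the most recent item created in the current stage.
    std::size_t latest() const {
        assert(!stages.back().items.empty());
        return stages.back().items.size() - 1;
    }
    void stop() { funnel_name.clear(); }

    std::string funnel_name;
    std::vector<ToyStage> stages;
};

// Example call order: two items are introduced, one survives a filter.
inline ToyFunnel demo_run() {
    ToyFunnel funnel;
    funnel.start("read1");
    funnel.stage("seed");
    funnel.introduce(2);         // Items 0 and 1 at stage "seed".
    funnel.stage("extend");
    funnel.project(0);           // Seed 0 becomes item 0 at "extend".
    funnel.fail("min-score", 1); // Seed 1 is dropped at "extend".
    funnel.stop();
    return funnel;
}
```

The toy keeps its stage records after stop() so that they can be inspected; whether the real class does the same is not stated here.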
using vg::Funnel::clock = std::chrono::high_resolution_clock | protected |
Pick a clock to use for measuring stage duration.
using vg::Funnel::time_point = clock::time_point | protected |
And a type to represent stage transition times.
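As a rough illustration of how two such time points could be turned into a duration in seconds, here is a minimal sketch. `ToyTimer` and its `seconds()` member are hypothetical; only the two aliases and the `start_time`/`stop_time` member names come from this page, and the conversion shown is an assumption about usage rather than vg's code.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timer for illustration only; not part of vg.
struct ToyTimer {
    // Same aliases as documented for the class.
    using clock = std::chrono::high_resolution_clock;
    using time_point = clock::time_point;

    time_point start_time;
    time_point stop_time;

    void start() { start_time = clock::now(); }
    void stop() { stop_time = clock::now(); }

    // Elapsed time between start and stop, as floating-point seconds.
    double seconds() const {
        return std::chrono::duration<double>(stop_time - start_time).count();
    }
};

// Usage sketch with fixed time points, so the result does not depend on the machine.
inline void toy_timer_example() {
    ToyTimer timer;
    timer.start_time = ToyTimer::time_point();
    timer.stop_time = timer.start_time + std::chrono::seconds(2);
    assert(timer.seconds() > 1.99 && timer.seconds() < 2.01);
}
```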
enum vg::Funnel::State | strong |
We can tag items as having one of these states.
void vg::Funnel::also_merge_group | ( | Iterator | prev_stage_items_begin, |
Iterator | prev_stage_items_end | ||
) |
Record extra provenance relationships where the latest current-stage item came from the given previous-stage items. Increases the current-stage item group size by the number of previous-stage items added.
Propagates tagging.
void vg::Funnel::also_merge_group | ( | size_t | earlier_stage_lookback, |
Iterator | earlier_stage_items_begin, | ||
Iterator | earlier_stage_items_end | ||
) |
Record extra provenance relationships where the latest current-stage item came from the given earlier-stage items. Increases the current-stage item group size by the number of earlier-stage items added.
Propagates tagging.
earlier_stage_lookback determines how many stages to look back and must be 1 or more.
void vg::Funnel::also_relevant | ( | size_t | earlier_stage_lookback, |
size_t | earlier_stage_item | ||
) |
Record an extra provenance relationship where the latest current-stage item came from the given earlier-stage item, the given number of stages ago (minimum 1).
Does not adjust group size or propagate tagging.
void vg::Funnel::annotate_mapped_alignment | ( | Alignment & | aln, |
bool | annotate_correctness | ||
) | const |
Set an alignment's annotations with the number of results at each stage. If annotate_correctness is true, also annotate the alignment with the number of correct results at each stage. This assumes that we've been tracking correctness all along.
size_t vg::Funnel::create_item | ( | ) | protected |
Create a new item in the current stage and get its index. Advances the projected count counter.
void vg::Funnel::expand | ( | size_t | prev_stage_item, |
size_t | count | ||
) |
Expand the given item from the previous stage into the given number of new items at this stage.
void vg::Funnel::fail | ( | const char * | filter, |
size_t | prev_stage_item, | ||
double | statistic = nan("") | ||
) |
Fail the given item from the previous stage on the given filter and do not project it through to this stage. Items which do not fail a filter must pass the filter and be projected to something. The filter name must outlive the funnel, because a pointer to it will be stored. Allows a statistic for the filtered-on value for the failing item to be recorded.
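Because only a pointer to the filter name is stored, string literals or other storage with static duration are the safest choice for the name. The sketch below uses a hypothetical `ToyFilterLog` (not part of vg) with a fail() of the same shape to show which call patterns keep the pointer valid; the filter names are made up. The same consideration applies to pass().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for illustration; it keeps the pointer, not a copy of the text.
struct ToyFilterRecord {
    const char* filter;
    std::size_t prev_stage_item;
    double statistic;
};

struct ToyFilterLog {
    std::vector<ToyFilterRecord> failures;

    void fail(const char* filter, std::size_t prev_stage_item,
              double statistic = std::nan("")) {
        assert(filter != nullptr);
        failures.push_back(ToyFilterRecord{filter, prev_stage_item, statistic});
    }
};

// A string literal has static storage duration, so a pointer to it stays valid.
static const char* const MIN_SCORE_FILTER = "min-score";

inline void record_failures(ToyFilterLog& log) {
    log.fail(MIN_SCORE_FILTER, 0, 12.5); // OK: the name outlives the log.
    log.fail("max-gap", 1);              // OK: literal; statistic defaults to NaN.

    // Risky pattern, left commented out: the temporary std::string is destroyed
    // at the end of the statement, so the stored pointer would dangle.
    // log.fail((std::string("min-") + "score").c_str(), 2);
}
```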
void vg::Funnel::for_each_filter | ( | const function< void(const string &, const string &, const FilterPerformance &, const FilterPerformance &, const vector< double > &, const vector< double > &)> & | callback | ) | const |
Call the given callback with stage name, filter name, performance report for items, performance report for total size of items, values for correct items for the filter statistic, and values for incorrect (or merely not known-correct) items for the filter statistic. Runs the callback for each stage and filter, in order. Only includes filters that were actually passed or failed by any items.
void vg::Funnel::for_each_stage | ( | const function< void(const string &, const vector< size_t > &, const double &)> & | callback | ) | const |
Call the given callback with the stage name, a vector of result item sizes at that stage, and a duration in seconds, for each stage.
Item & vg::Funnel::get_item | ( | size_t | index | ) | protected |
Ensure an item with the given index exists in the current stage and return a reference to it. We need to do it this way because we might save a production duration before an item is really projected. The items of the current stage should only be modified through this. Note that you do not need to create an item in order to get it.
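The "ensure it exists, then return a reference" behavior can be sketched as follows. `LazyItem` and `LazyStage` are hypothetical types, and backing the stage with a `std::vector` that grows on access is an assumption made for the illustration, not a statement about how vg stores items.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical item and stage types for illustration only.
struct LazyItem {
    double score = 0.0;
};

struct LazyStage {
    std::vector<LazyItem> items;

    // Return a reference to the item at `index`, creating it (and any
    // lower-indexed items that are still missing) if it does not exist yet.
    LazyItem& get_item(std::size_t index) {
        if (index >= items.size()) {
            items.resize(index + 1);
        }
        assert(index < items.size());
        return items[index];
    }
};
```

With this shape, a caller can record something about item 2 before items 0 and 1 have been filled in, which matches the note that an item does not need to be created before it is fetched.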
void vg::Funnel::introduce | ( | size_t | count = 1 | ) |
Introduce the given number of new items, starting their own lines of provenance (default 1).
bool vg::Funnel::is_correct | ( | size_t | item | ) | const |
Return true if the given item at this stage is tagged correct, or descends from an item that was tagged correct.
string vg::Funnel::last_correct_stage | ( | size_t | tag_start = 0, |
size_t | tag_length = std::numeric_limits<size_t>::max() | ||
) | const |
Get the name of the most recent stage that had a correct-tagged item survive into it, or "none" if no items were ever tagged correct. Optionally allows specifying a read space interval to intersect with items, so the query returns the last stage that had a correct item intersecting that range.
string vg::Funnel::last_tagged_stage | ( | State | tag, |
size_t | tag_start = 0, | |
size_t | tag_length = std::numeric_limits<size_t>::max() | ||
) | const |
Get the name of the most recent stage that had an item tagged with the given tag or better survive into it, or "none" if no items were ever tagged that good. Optionally allows specifying a read space interval to intersect with items, so the query returns the last stage that had an item intersecting that range and also an item with that tag or better.
TODO: Make worse tag ranges not match queries for better tags!
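The "given tag or better" comparison can be read off the enumerator values (NONE = 0, PLACED = 1, CORRECT = 2). The following sketch shows that ordering in a simplified query; `TaggedStage` and the free function are hypothetical, the stage names are made up, and the read-space interval handling is left out entirely.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the documented enumerator values; a larger value means a better tag.
enum class State { NONE = 0, PLACED = 1, CORRECT = 2 };

// Hypothetical per-stage summary: the stage name and the state of each item.
struct TaggedStage {
    std::string name;
    std::vector<State> item_states;
};

// Return the name of the last stage holding an item tagged `tag` or better,
// or "none" if no stage qualifies. Read-space intervals are ignored here.
inline std::string last_tagged_stage(const std::vector<TaggedStage>& stages, State tag) {
    for (auto it = stages.rbegin(); it != stages.rend(); ++it) {
        for (State s : it->item_states) {
            if (s >= tag) {
                return it->name;
            }
        }
    }
    return "none";
}

// Usage sketch with made-up stage names.
inline void last_tagged_stage_example() {
    std::vector<TaggedStage> stages = {
        {"seed", {State::CORRECT, State::NONE}},
        {"extend", {State::PLACED}},
        {"align", {State::NONE}}
    };
    assert(last_tagged_stage(stages, State::PLACED) == "extend");
    assert(last_tagged_stage(stages, State::CORRECT) == "seed");
}
```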
size_t vg::Funnel::latest | ( | ) | const |
Get the index of the most recent item created in the current stage.
void vg::Funnel::merge | ( | Iterator | prev_stage_items_begin, |
Iterator | prev_stage_items_end | ||
) |
Merge all the given item indexes from the previous stage into a new item at this stage. The new item will be a single item.
void vg::Funnel::merge_group | ( | Iterator | prev_stage_items_begin, |
Iterator | prev_stage_items_end | ||
) |
Merge all the given item indexes from the previous stage into a new item at this stage. The new item will be a group, sized according to the number of previous items merged.
void vg::Funnel::merge_groups | ( | Iterator | prev_stage_items_begin, |
Iterator | prev_stage_items_end | ||
) |
Merge all the given item indexes from the previous stage into a new item at this stage. The new item will be a group, sized according to the total size of previous groups, with non-groups counting as size 1.
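The three merge variants differ only in the size given to the new item: merge() makes a single non-group item, merge_group() counts the previous-stage items, and merge_groups() adds up their sizes. A minimal sketch of the two group-size rules follows. The helper names are hypothetical, and encoding a non-group item as size 0 is an assumption made for the example, not vg's representation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Group size produced by merge_group(): one per previous-stage item merged.
inline std::size_t merge_group_size(const std::vector<std::size_t>& prev_group_sizes) {
    return prev_group_sizes.size();
}

// Group size produced by merge_groups(): the total size of the previous
// groups, with non-groups (encoded here as 0) counting as size 1.
inline std::size_t merge_groups_size(const std::vector<std::size_t>& prev_group_sizes) {
    std::size_t total = 0;
    for (std::size_t size : prev_group_sizes) {
        total += (size == 0) ? 1 : size;
    }
    return total;
}

// Usage sketch: one non-group item, a group of 3, and a group of 2.
inline void merge_size_example() {
    std::vector<std::size_t> prev = {0, 3, 2};
    assert(merge_group_size(prev) == 3);  // Three items were merged.
    assert(merge_groups_size(prev) == 6); // 1 + 3 + 2.
}
```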
void vg::Funnel::pass | ( | const char * | filter, |
size_t | prev_stage_item, | ||
double | statistic = nan("") | ||
) |
Pass the given item from the previous stage through the given filter at this stage. Items which do not pass a filter must fail it. All items which pass filters must do so in the same order. The filter name must outlive the funnel, because a pointer to it will be stored. Allows a statistic for the filtered-on value for the passing item to be recorded.
void vg::Funnel::processed_input | ( | ) |
Stop processing an item from the previous stage.
void vg::Funnel::processing_input | ( | size_t | prev_stage_item | ) |
Start processing the given item coming from the previous stage.
void vg::Funnel::produced_output | ( | ) |
Stop producing an output item.
void vg::Funnel::producing_output | ( | size_t | item | ) |
Start producing the given output item, whether it has been projected yet or not.
void vg::Funnel::project | ( | size_t | prev_stage_item | ) |
Project a single item from the previous stage to a single non-group item at this stage.
void vg::Funnel::project_group | ( | size_t | prev_stage_item, |
size_t | group_size | ||
) |
Project a single item from the previous stage to a new group item at the current stage, with the given size.
void vg::Funnel::score | ( | size_t | item, |
double | score | ||
) |
Assign the given score to the given item at the current stage.
void vg::Funnel::stage | ( | const string & | name | ) |
Start the given stage, and end all previous stages and substages. Name must not be empty. Multiple stages with the same name will be coalesced.
void vg::Funnel::stage_stop | ( | ) |
Stop the current stage.
void vg::Funnel::start | ( | const string & | name | ) |
Start processing the given named input. Name must not be empty. No stage or substage will be active.
void vg::Funnel::stop | ( | ) |
Stop processing the current named input. All stages and substages are stopped.
void vg::Funnel::substage | ( | const string & | name | ) |
Start the given substage, nested inside the current stage. End all previous substages. Substages within a stage may repeat and are coalesced. Name must not be empty.
void vg::Funnel::substage_stop | ( | ) |
Stop the current substage.
void vg::Funnel::tag | ( | size_t | item, |
State | state, | ||
size_t | tag_start = 0, | |
size_t | tag_length = std::numeric_limits<size_t>::max() | ||
) |
Tag the given item as being in the given state at the current stage. Future items that derive from it will inherit these tags. Optionally allows specifying that the state extends over a range in read space.
void vg::Funnel::tag_correct | ( | size_t | item, |
size_t | tag_start = 0, | |
size_t | tag_length = std::numeric_limits<size_t>::max() | ||
) |
Tag the given item as "correct" at the current stage. Future items that derive from it will also be tagged as correct. Optionally allows specifying that the correctness extends over a range in read space, so correctness can be tracked as a property of regions of the read, rather than the whole read. If called multiple times, with different bounds, the correct region will enclose all the correct regions provided in the different calls.
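One way to picture how repeated calls with different bounds combine is as the smallest interval enclosing every region given so far. The sketch below is an assumption about that behavior, not vg's code: `ReadInterval`, `interval_end`, and `enclose` are hypothetical names, and the end position saturates so that the default `tag_length` of `std::numeric_limits<size_t>::max()` cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical read-space interval covering [start, start + length).
struct ReadInterval {
    std::size_t start;
    std::size_t length;
};

// One-past-the-end position, clamped so that start + length cannot overflow.
inline std::size_t interval_end(const ReadInterval& r) {
    const std::size_t limit = std::numeric_limits<std::size_t>::max();
    return (r.length > limit - r.start) ? limit : r.start + r.length;
}

// Smallest interval enclosing both inputs, including any gap between them.
inline ReadInterval enclose(const ReadInterval& a, const ReadInterval& b) {
    const std::size_t start = std::min(a.start, b.start);
    const std::size_t end = std::max(interval_end(a), interval_end(b));
    return ReadInterval{start, end - start};
}

// Usage sketch: two separate regions are tagged in two calls.
inline void enclose_example() {
    ReadInterval first{10, 5};   // Covers [10, 15).
    ReadInterval second{30, 10}; // Covers [30, 40).
    ReadInterval both = enclose(first, second);
    assert(both.start == 10);
    assert(both.length == 30);   // Covers [10, 40).
}
```

Note that under this reading the bases between the two tagged regions also end up inside the enclosing region.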
void vg::Funnel::to_dot | ( | ostream & | out | ) | const |
Dump information from the Funnel as a dot-format Graphviz graph to the given stream. Illustrates stages and provenance.
bool vg::Funnel::was_correct | ( | size_t | prev_stage_index, |
const string & | prev_stage_name, | ||
size_t | prev_stage_item | ||
) | const |
Return true if the given item at the given named previous stage is tagged correct, or descends from an item that was tagged correct. Needs a hint about what number the stage was in the order, to make lookup fast.
bool vg::Funnel::was_correct | ( | size_t | prev_stage_item | ) | const |
Return true if the given item at the previous stage is tagged correct, or descends from an item that was tagged correct.
string vg::Funnel::funnel_name | protected |
What's the name of the funnel we start()-ed? Will be empty if nothing is running.
size_t vg::Funnel::input_in_progress = numeric_limits<size_t>::max() | protected |
What's the current prev-stage input we are processing? Will be numeric_limits<size_t>::max() if none.
size_t vg::Funnel::output_in_progress = numeric_limits<size_t>::max() | protected |
What's the current current-stage output we are generating? Will be numeric_limits<size_t>::max() if none.
string vg::Funnel::stage_name | protected |
What's the name of the current stage? Will be empty if no stage is running.
time_point vg::Funnel::stage_start_time | protected |
At what time did the stage start?
vector< Stage > vg::Funnel::stages | protected |
Record all the stages, including their names and item provenance. Handles repeated stages.
time_point vg::Funnel::start_time | protected |
At what time did we start()?
time_point vg::Funnel::stop_time | protected |
At what time did we stop()?
string vg::Funnel::substage_name | protected |
What's the name of the current substage? Will be empty if no substage is running.