vg
tools for working with variation graphs
minimizer_main.cpp File Reference
#include "subcommand.hpp"
#include <vg/io/vpkg.hpp>
#include <algorithm>
#include <iostream>
#include <vector>
#include <getopt.h>
#include <omp.h>
#include "../gbwtgraph_helper.hpp"
#include "../gbwt_helper.hpp"
#include "../index_registry.hpp"
#include "../utility.hpp"
#include "../handle.hpp"
#include "../snarl_distance_index.hpp"
#include "../zip_code.hpp"
#include <gbwtgraph/index.h>

Classes

struct  MinimizerConfig
 

Typedefs

using code_type = gbwtgraph::KmerEncoding::code_type
 
using payload_type = ZipCode::payload_type
 

Functions

int get_default_threads ()
 
int main_minimizer (int argc, char **argv)
 
void help_minimizer (char **argv)
 

Variables

constexpr int DEFAULT_MAX_THREADS = 16
 

Detailed Description

Defines the "vg minimizer" subcommand, which builds the minimizer index.

For each window of w successive kmers and their reverse complements, the index stores the lexicographically smallest kmer. A kmer that contains characters other than A, C, G, and T is not indexed.
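The window rule above can be illustrated on a plain string. The following is a minimal sketch, not the gbwtgraph implementation: `ToyMinimizer`, `toy_minimizers`, `is_valid_kmer`, and `reverse_complement` are hypothetical names introduced here, the input is a linear sequence rather than a graph, and kmers are compared as strings. It assumes that a window with some invalid kmers still selects among its valid ones, which is a simplification made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if every character of the kmer is one of A, C, G, T.
bool is_valid_kmer(const std::string& kmer) {
    return kmer.find_first_not_of("ACGT") == std::string::npos;
}

// Reverse complement of a sequence over A, C, G, T.
std::string reverse_complement(const std::string& seq) {
    std::string result(seq.rbegin(), seq.rend());
    for (char& c : result) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break;
        }
    }
    return result;
}

// Hypothetical record type: the selected kmer, the offset of its forward
// occurrence in the sequence, and whether the reverse complement was chosen.
struct ToyMinimizer {
    std::string kmer;
    std::size_t offset;
    bool is_reverse;
};

// For each window of w successive kmers, select the lexicographically
// smallest string among the kmers and their reverse complements.
// Consecutive windows that select the same occurrence report it once.
std::vector<ToyMinimizer> toy_minimizers(const std::string& seq, std::size_t k, std::size_t w) {
    std::vector<ToyMinimizer> result;
    if (k == 0 || w == 0 || seq.size() < k + w - 1) {
        return result;
    }
    for (std::size_t start = 0; start + k + w - 1 <= seq.size(); start++) {
        bool found = false;
        ToyMinimizer best;
        for (std::size_t i = start; i < start + w; i++) {
            std::string forward = seq.substr(i, k);
            if (!is_valid_kmer(forward)) {
                continue; // Kmers with characters other than A, C, G, T are skipped.
            }
            std::string reverse = reverse_complement(forward);
            bool use_reverse = reverse < forward;
            const std::string& candidate = use_reverse ? reverse : forward;
            // Strict comparison keeps the leftmost occurrence on ties.
            if (!found || candidate < best.kmer) {
                best = { candidate, i, use_reverse };
                found = true;
            }
        }
        if (found && (result.empty() || result.back().offset != best.offset)) {
            result.push_back(best);
        }
    }
    return result;
}
```

For example, calling `toy_minimizers("GATTACA", 3, 2)` selects AAT (reverse complement of ATT at offset 1), GTA (reverse complement of TAC at offset 3), and ACA (forward, offset 4). The first two windows share the same minimizer, which shows in miniature why many windows can map to one minimizer.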

The index contains either all minimizers or only haplotype-consistent minimizers. Indexing all minimizers from complex graph regions can take a long time (e.g. tens of hours, compared with 5-10 minutes for haplotype-consistent minimizers, for 1000GP), because many windows have the same minimizer. As the total number of minimizers is manageable (e.g. 1.5x more for 1000GP), it should be possible to develop a better algorithm for finding the minimizers.

A quick idea for indexing the entire graph:

Typedef Documentation

◆ code_type

using code_type = gbwtgraph::KmerEncoding::code_type

◆ payload_type

using payload_type = ZipCode::payload_type

Function Documentation

◆ get_default_threads()

int get_default_threads ( )
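This page does not describe what get_default_threads() returns. Given the name and the DEFAULT_MAX_THREADS constant, one plausible reading is that the default thread count is the number of available threads capped at that constant. The sketch below illustrates only that assumption: `capped_default_threads` is a hypothetical helper, and the available thread count is passed in as a parameter instead of being queried (for example with OpenMP's `omp_get_max_threads()`), so that the example does not depend on OpenMP.

```cpp
#include <algorithm>

// Same value as the constant documented on this page.
constexpr int DEFAULT_MAX_THREADS = 16;

// Hypothetical helper (an assumption, not the documented behavior):
// cap the number of available threads at DEFAULT_MAX_THREADS and
// never return fewer than one thread.
int capped_default_threads(int available) {
    return std::max(1, std::min(available, DEFAULT_MAX_THREADS));
}
```

Under this assumption, a machine reporting 64 available threads would default to 16, while one reporting 4 would default to 4.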

◆ help_minimizer()

void help_minimizer ( char **  argv)

◆ main_minimizer()

int main_minimizer ( int  argc,
char **  argv 
)

Variable Documentation

◆ DEFAULT_MAX_THREADS

constexpr int DEFAULT_MAX_THREADS = 16