vg
tools for working with variation graphs
vg::HTSWriter Class Reference

#include <hts_alignment_emitter.hpp>

Inheritance diagram for vg::HTSWriter (classes shown): vg::HTSAlignmentEmitter, vg::MultipathAlignmentEmitter, vg::SplicedHTSAlignmentEmitter

Public Member Functions

 HTSWriter (const string &filename, const string &format, const vector< pair< string, int64_t >> &path_order_and_length, const unordered_map< string, int64_t > &subpath_to_length, size_t max_threads)
 
 ~HTSWriter ()
 Tear down an HTSWriter and destroy HTSlib structures.
 
 HTSWriter (const HTSWriter &other)=delete
 
HTSWriter & operator= (const HTSWriter &other)=delete
 
 HTSWriter (HTSWriter &&other)=delete
 
HTSWriter & operator= (HTSWriter &&other)=delete
 

Protected Member Functions

void save_records (bam_hdr_t *header, vector< bam1_t * > &records, size_t thread_number)
 
bam_hdr_t * ensure_header (const string &read_group, const string &sample_name, size_t thread_number)
 
void initialize_sam_file (bam_hdr_t *header, size_t thread_number, bool keep_header=false)
 

Protected Attributes

unique_ptr< ofstream > out_file
 If we are doing output to a file, this will hold the open file. Otherwise (for stdout) it will be empty.
 
vg::io::StreamMultiplexer multiplexer
 
string format
 This holds our format name, for later error messages.
 
vector< pair< string, int64_t > > path_order_and_length
 Store the path names and lengths in the order to put them in the header.
 
unordered_map< string, int64_t > subpath_to_length
 
vector< hFILE * > backing_files
 
vector< samFile * > sam_files
 
atomic< bam_hdr_t * > atomic_header
 We need a header.
 
string sam_header
 
mutex header_mutex
 If the header isn't present when we want to write, we need a mutex to control creating it.
 
bool output_is_bgzf
 
string hts_mode
 Remember the HTSlib mode string we need to open our files.
 

Static Protected Attributes

static const size_t BGZF_FOOTER_LENGTH = 28
 We hack about with htslib's BGZF EOF footers, so we need to know how long they are.
 

Constructor & Destructor Documentation

◆ HTSWriter() [1/3]

vg::HTSWriter::HTSWriter ( const string &  filename,
const string &  format,
const vector< pair< string, int64_t >> &  path_order_and_length,
const unordered_map< string, int64_t > &  subpath_to_length,
size_t  max_threads 
)

Create an HTSWriter writing to the given file (or "-" for standard output) in the given HTS format ("SAM", "BAM", or "CRAM"). path_order_and_length must give each contig name and length to include in the header, in header order. Sample names and read groups for the header will be guessed from the first reads. HTSlib positions will be read from the alignments' refpos, so the alignments must already be surjected.
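As a rough illustration of the argument shapes described above, the following sketch fills in a plain struct with the same fields as the constructor parameters. `WriterArgs` and `make_example_args` are hypothetical helpers introduced only for this example, and the contig names and lengths are made up; nothing here calls vg itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical holder mirroring the constructor parameters; not part of vg.
struct WriterArgs {
    std::string filename;  // "-" selects standard output
    std::string format;    // "SAM", "BAM", or "CRAM"
    std::vector<std::pair<std::string, int64_t>> path_order_and_length;
    std::unordered_map<std::string, int64_t> subpath_to_length;
    std::size_t max_threads = 1;
};

// Build an example argument set with made-up contigs.
WriterArgs make_example_args() {
    WriterArgs args;
    args.filename = "-";
    args.format = "BAM";
    // Contigs appear in the header in exactly this order.
    args.path_order_and_length = {{"chr1", 1000}, {"chr2", 500}};
    // No subpaths in this example, so subpath_to_length stays empty.
    args.max_threads = 4;
    return args;
}
```

With vg available, these fields would be passed in the same order to the constructor, for example `vg::HTSWriter(args.filename, args.format, args.path_order_and_length, args.subpath_to_length, args.max_threads)`.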

◆ ~HTSWriter()

vg::HTSWriter::~HTSWriter ( )

Tear down an HTSWriter and destroy HTSlib structures.

◆ HTSWriter() [2/3]

vg::HTSWriter::HTSWriter ( const HTSWriter &  other)
delete

◆ HTSWriter() [3/3]

vg::HTSWriter::HTSWriter ( HTSWriter &&  other)
delete

Member Function Documentation

◆ ensure_header()

bam_hdr_t * vg::HTSWriter::ensure_header ( const string &  read_group,
const string &  sample_name,
size_t  thread_number 
)
protected

Make sure that the HTS header has been written, and the samFile* in sam_files has been created for the given thread.

If the header has not been written, blocks until it has been written.

If we end up being the thread to write it, fill in the header from the given read group and sample name, which are sniffed from the first reads.

Returns the header pointer, so we don't have to do another atomic read later.
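The behaviour described above resembles double-checked locking over an atomic pointer. The sketch below shows that general pattern under stated assumptions: `FakeHeader` stands in for `bam_hdr_t` so the code compiles without HTSlib, `HeaderHolder` is a hypothetical class, and the per-thread `samFile*` setup is left out. It is not vg's actual implementation.

```cpp
#include <atomic>
#include <mutex>
#include <string>

// Stand-in for bam_hdr_t so the sketch compiles without HTSlib.
struct FakeHeader {
    std::string read_group;
    std::string sample_name;
};

// Hypothetical holder showing the atomic-pointer-plus-mutex pattern.
class HeaderHolder {
public:
    ~HeaderHolder() { delete header_.load(); }

    // Return the shared header, creating it on first use.
    FakeHeader* ensure_header(const std::string& read_group,
                              const std::string& sample_name) {
        // Fast path: a single atomic read once the header exists.
        FakeHeader* header = header_.load(std::memory_order_acquire);
        if (header == nullptr) {
            // Slow path: later threads block here until the creator is done.
            std::lock_guard<std::mutex> lock(header_mutex_);
            header = header_.load(std::memory_order_relaxed);
            if (header == nullptr) {
                // We are the creating thread, so our arguments win.
                header = new FakeHeader{read_group, sample_name};
                header_.store(header, std::memory_order_release);
            }
        }
        return header;
    }

private:
    std::atomic<FakeHeader*> header_{nullptr};
    std::mutex header_mutex_;
};
```

Returning the pointer lets the caller keep using the header without a second atomic load, which matches the stated reason for the return value.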

◆ initialize_sam_file()

void vg::HTSWriter::initialize_sam_file ( bam_hdr_t *  header,
size_t  thread_number,
bool  keep_header = false 
)
protected

Given a header and a thread number, make sure the samFile* for that thread is initialized and ready to have alignments written to it. If keep_header is true, actually writes the given header into the output file created by the multiplexer. If the samFile* was already initialized, flushes it out and makes a breakpoint.

◆ operator=() [1/2]

HTSWriter& vg::HTSWriter::operator= ( const HTSWriter &  other)
delete

◆ operator=() [2/2]

HTSWriter& vg::HTSWriter::operator= ( HTSWriter &&  other)
delete

◆ save_records()

void vg::HTSWriter::save_records ( bam_hdr_t *  header,
vector< bam1_t * > &  records,
size_t  thread_number 
)
protected

Write and deallocate a bunch of BAM records. Takes care of locking the file. Header must have been written already.

Member Data Documentation

◆ atomic_header

atomic<bam_hdr_t*> vg::HTSWriter::atomic_header
protected

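We need a header, shared by all threads. It is held in an atomic so that a thread can check whether it exists yet without taking header_mutex.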

◆ backing_files

vector<hFILE*> vg::HTSWriter::backing_files
protected

To back our samFile*s, we need the hFILE* objects wrapping our C++ streams. We need to manually flush these after HTS headers are written, since bgzf_flush, which samtools calls, closes a BGZF block and sends the data to the hFILE* but does not actually flush the hFILE*. These will be pointers to the hFILE* for each thread's samFile*. We may only use them while the samFile* they belong to is still open; closing the samFile* will free the hFILE* but not null it out of this vector.

◆ BGZF_FOOTER_LENGTH

const size_t vg::HTSWriter::BGZF_FOOTER_LENGTH = 28
static protected

We hack about with htslib's BGZF EOF footers, so we need to know how long they are.

◆ format

string vg::HTSWriter::format
protected

This holds our format name, for later error messages.

◆ header_mutex

mutex vg::HTSWriter::header_mutex
protected

If the header isn't present when we want to write, we need a mutex to control creating it.

◆ hts_mode

string vg::HTSWriter::hts_mode
protected

Remember the HTSlib mode string we need to open our files.
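For orientation, HTSlib's `sam_open` accepts `"w"` for SAM, `"wb"` for BAM, and `"wc"` for CRAM output. The sketch below maps the three format names accepted by the constructor to those mode strings. `mode_for_format` is a hypothetical helper; whether vg uses exactly these strings, or adds extra mode characters, is an assumption not confirmed by this page.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: pick an HTSlib write mode for a format name.
std::string mode_for_format(const std::string& format) {
    if (format == "SAM") {
        return "w";   // plain-text SAM
    }
    if (format == "BAM") {
        return "wb";  // BGZF-compressed BAM
    }
    if (format == "CRAM") {
        return "wc";  // CRAM
    }
    // Report the offending format name in the error message.
    throw std::invalid_argument("Unsupported HTS format: " + format);
}
```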

◆ multiplexer

vg::io::StreamMultiplexer vg::HTSWriter::multiplexer
protected

This holds a StreamMultiplexer on the output stream, for sharing it between threads.

◆ out_file

unique_ptr<ofstream> vg::HTSWriter::out_file
protected

If we are doing output to a file, this will hold the open file. Otherwise (for stdout) it will be empty.

◆ output_is_bgzf

bool vg::HTSWriter::output_is_bgzf
protected

Remember if we are outputting BGZF-compressed data or not. If we are, we trim off spurious EOF markers and append our own.
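The BGZF EOF marker defined in the SAM specification is a fixed 28-byte empty block, which is where BGZF_FOOTER_LENGTH comes from. The sketch below detects and removes one such marker from the end of an in-memory buffer. The names `kBgzfEof`, `ends_with_bgzf_eof`, and `trim_bgzf_eof` are hypothetical, and how vg itself performs the trim on its output streams is not shown on this page.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

constexpr std::size_t kBgzfFooterLength = 28;

// The empty BGZF block that marks end-of-file, per the SAM specification.
const std::array<unsigned char, kBgzfFooterLength> kBgzfEof = {{
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
    0x06, 0x00, 0x42, 0x43, 0x02, 0x00, 0x1b, 0x00, 0x03, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
}};

// True if the buffer's last 28 bytes are exactly the EOF marker.
bool ends_with_bgzf_eof(const std::string& data) {
    if (data.size() < kBgzfFooterLength) {
        return false;
    }
    auto tail = data.end() - static_cast<std::ptrdiff_t>(kBgzfFooterLength);
    return std::equal(kBgzfEof.begin(), kBgzfEof.end(), tail,
                      [](unsigned char expected, char actual) {
                          return expected == static_cast<unsigned char>(actual);
                      });
}

// Remove one trailing EOF marker; returns whether anything was removed.
bool trim_bgzf_eof(std::string& data) {
    if (!ends_with_bgzf_eof(data)) {
        return false;
    }
    data.resize(data.size() - kBgzfFooterLength);
    return true;
}
```

Trimming matters when several independently compressed chunks are concatenated: only the final chunk should end with the marker, so intermediate ones are stripped and a single marker is appended at the very end.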

◆ path_order_and_length

vector<pair<string, int64_t> > vg::HTSWriter::path_order_and_length
protected

Store the path names and lengths in the order to put them in the header.

◆ sam_files

vector<samFile*> vg::HTSWriter::sam_files
protected

We make one samFile* per thread, on each thread's output stream from the multiplexer. As soon as we create them, we show them the header, so they are initialized properly. If they have not yet been filled in (because the header is not ready yet), they are null.

◆ sam_header

string vg::HTSWriter::sam_header
protected

We also need a header string. Not atomic, because by the time we read it we know the header is ready and nobody is writing to it.

◆ subpath_to_length

unordered_map<string, int64_t> vg::HTSWriter::subpath_to_length
protected

With subpath support, path_order_and_length stores the base path information for the header. The actual path lengths go here.


The documentation for this class was generated from the following files: