vg
tools for working with variation graphs
vg::io::MessageEmitter Class Reference

#include <message_emitter.hpp>

Public Types

using group_listener_t = function< void(const string &, int64_t, int64_t)>
 

Public Member Functions

 MessageEmitter (ostream &out, bool compress=false, size_t max_group_size=1000)
 
 ~MessageEmitter ()
 Destructor that finishes the file.
 
 MessageEmitter (const MessageEmitter &other)=delete
 
MessageEmitter & operator= (const MessageEmitter &other)=delete
 
 MessageEmitter (MessageEmitter &&other)=default
 
MessageEmitter & operator= (MessageEmitter &&other)=default
 
void write (const string &tag)
 
void write (const string &tag, string &&message)
 Emit the given message with the given type tag.
 
void write_copy (const string &tag, const string &message)
 
void on_group (group_listener_t &&listener)
 
void emit_group ()
 
void flush ()
 

Static Public Attributes

const static size_t MAX_MESSAGE_SIZE = 1000000000
 We refuse to serialize individual messages longer than this size.
 

Private Attributes

string group_tag
 
vector< string > group
 This is our internal buffer.
 
size_t max_group_size
 This is how big we let it get before we dump it.
 
unique_ptr< BlockedGzipOutputStream > bgzip_out
 
unique_ptr< google::protobuf::io::OstreamOutputStream > uncompressed_out
 This holds the non-BGZF output stream, if we aren't writing compressed data.
 
ostream * uncompressed_out_ostream
 
size_t uncompressed_out_written
 We need to track the total bytes written by previous OstreamOutputStreams.
 
vector< group_listener_t > group_handlers
 If someone wants to listen in on emitted groups, they can register a handler.
 

Detailed Description

Class that wraps an output stream and allows emitting groups of binary messages to it, with internal buffering. Handles finishing the file on its own, and allows tracking of (possibly virtual) offsets within a non-seekable stream (as long as the entire stream is controlled by one instance). Cannot be copied, but can be moved.

Each group consists of:

This format is designed to be syntactically the same as the old untagged VG Protobuf format, to allow easy sniffing/reading of old files.

Can call callbacks with the groups emitted and their virtual offsets, for indexing purposes.

Note that the callbacks may be called by the object's destructor, so anything they reference needs to outlive the object.

Not thread-safe. May be more efficient than repeated write/write_buffered calls because a single BGZF stream can be used.

Can write either compressed or uncompressed but framed data.
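
The buffering behaviour described above can be illustrated with a small stand-in. This is a minimal sketch and not vg's implementation: ToyEmitter and its members are hypothetical names, it records groups in memory instead of writing framed bytes, and the policy of emitting on a tag change or once the buffer reaches max_group_size is an assumption drawn from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not vg::io::MessageEmitter itself.
class ToyEmitter {
public:
    // One emitted group: a tag plus the messages buffered under it.
    struct Group {
        std::string tag;
        std::vector<std::string> messages;
    };

    explicit ToyEmitter(std::size_t max_group_size = 1000)
        : max_group_size(max_group_size) {}

    // Buffer a message. Emit the current group first if the tag changes, and
    // emit again once the buffer reaches max_group_size (assumed policy).
    void write(const std::string& tag, std::string&& message) {
        assert(!tag.empty()); // empty tags are prohibited
        if (!group_tag.empty() && group_tag != tag) {
            emit_group();
        }
        group_tag = tag;
        group.push_back(std::move(message));
        if (group.size() >= max_group_size) {
            emit_group();
        }
    }

    // Move the buffered messages into the list of emitted groups.
    void emit_group() {
        if (group_tag.empty()) {
            return; // nothing is buffered
        }
        emitted.push_back(Group{group_tag, std::move(group)});
        group.clear();
        group_tag.clear();
    }

    const std::vector<Group>& groups() const { return emitted; }

private:
    std::string group_tag;          // empty means no group is buffered
    std::vector<std::string> group; // messages buffered for group_tag
    std::size_t max_group_size;
    std::vector<Group> emitted;     // stands in for the output stream
};

// Write three "A" messages and one "B" message with a group limit of 2.
inline std::vector<ToyEmitter::Group> toy_emitter_demo() {
    ToyEmitter emitter(2);
    emitter.write("A", "a1");
    emitter.write("A", "a2"); // buffer is full, so a group is emitted
    emitter.write("A", "a3");
    emitter.write("B", "b1"); // tag change, so the pending "A" group is emitted
    emitter.emit_group();     // emit the remaining "B" group
    return emitter.groups();
}
```

In this sketch a run of same-tag messages is split only when the size limit is hit, which is why the demo yields two "A" groups followed by one "B" group.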

Member Typedef Documentation

◆ group_listener_t

using vg::io::MessageEmitter::group_listener_t = function<void(const string&, int64_t, int64_t)>

Define a type for group emission event listeners. Arguments are: type tag, start virtual offset, and past-end virtual offset.
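
As an illustration of a listener with this signature, the sketch below builds a simple in-memory index of reported groups. The alias is redeclared locally so the example compiles without vg; GroupIndexEntry and make_index_listener are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Local alias mirroring the documented typedef, so the sketch compiles on its own.
using group_listener_t = std::function<void(const std::string&, int64_t, int64_t)>;

// Hypothetical index record: one entry per emitted group.
struct GroupIndexEntry {
    std::string tag;
    int64_t start;    // start virtual offset
    int64_t past_end; // past-end virtual offset
};

// Build a listener that appends each reported group to `index`.
// `index` is captured by reference, so it must outlive whatever holds the listener.
inline group_listener_t make_index_listener(std::vector<GroupIndexEntry>& index) {
    return [&index](const std::string& tag, int64_t start, int64_t past_end) {
        assert(start <= past_end); // a group cannot end before it starts
        index.push_back(GroupIndexEntry{tag, start, past_end});
    };
}
```

A listener built this way is a temporary group_listener_t, so it could be handed straight to on_group(), which takes its argument by rvalue reference.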

Constructor & Destructor Documentation

◆ MessageEmitter() [1/3]

vg::io::MessageEmitter::MessageEmitter ( ostream &  out,
bool  compress = false,
size_t  max_group_size = 1000 
)

Constructor. Write output to the given stream. If compress is true, compress it as BGZF. Limit the maximum number of messages in a group to max_group_size.

If not compressing, virtual offsets are just ordinary offsets.
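
For the compressed case, the BGZF format defines a virtual offset as a 64-bit value whose upper 48 bits give the position of a block in the compressed file and whose lower 16 bits give a position inside that block's uncompressed data. The sketch below shows that packing; the helper names are hypothetical, and that MessageEmitter reports exactly these values is an assumption based on its use of BGZF.

```cpp
#include <cassert>
#include <cstdint>

// Pack a BGZF virtual offset:
//   upper 48 bits = offset of the block start in the compressed file,
//   lower 16 bits = offset within that block's uncompressed data.
inline std::uint64_t make_virtual_offset(std::uint64_t block_start,
                                         std::uint64_t within_block) {
    assert(block_start < (std::uint64_t(1) << 48));
    assert(within_block < (std::uint64_t(1) << 16));
    return (block_start << 16) | within_block;
}

// Recover the compressed-file offset of the block.
inline std::uint64_t block_start_of(std::uint64_t virtual_offset) {
    return virtual_offset >> 16;
}

// Recover the offset within the uncompressed block.
inline std::uint64_t within_block_of(std::uint64_t virtual_offset) {
    return virtual_offset & 0xFFFF;
}
```

Because the block position occupies the high bits, virtual offsets compare in the same order as positions in the uncompressed data, which is what makes them usable as index keys.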

◆ ~MessageEmitter()

vg::io::MessageEmitter::~MessageEmitter ( )

Destructor that finishes the file.

◆ MessageEmitter() [2/3]

vg::io::MessageEmitter::MessageEmitter ( const MessageEmitter &  other)
delete

◆ MessageEmitter() [3/3]

vg::io::MessageEmitter::MessageEmitter ( MessageEmitter &&  other)
default

Member Function Documentation

◆ emit_group()

void vg::io::MessageEmitter::emit_group ( )

Write out everything in the buffer. Does not flush the underlying streams to disk. Assumes that the buffer holds no more than one group's worth of messages.

◆ flush()

void vg::io::MessageEmitter::flush ( )

Write out anything in the buffer, and flush the backing streams. After this has been called, a full BGZF block will be in the backing stream (passed to the constructor), but the backing stream won't necessarily be flushed.

◆ on_group()

void vg::io::MessageEmitter::on_group ( group_listener_t &&  listener)

Add an event listener that listens for emitted groups. The listener will be called with the type tag, the start virtual offset, and the past-end virtual offset. Moves the function passed in. Anything the function uses by reference must outlive this object!

◆ operator=() [1/2]

MessageEmitter& vg::io::MessageEmitter::operator= ( const MessageEmitter &  other)
delete

◆ operator=() [2/2]

MessageEmitter& vg::io::MessageEmitter::operator= ( MessageEmitter &&  other)
default

◆ write() [1/2]

void vg::io::MessageEmitter::write ( const string &  tag)

Ensure that a (possibly empty) group is emitted for the given tag. Will coalesce with previous or subsequent write calls for the same tag. Note that empty tags are prohibited.

◆ write() [2/2]

void vg::io::MessageEmitter::write ( const string &  tag,
string &&  message 
)

Emit the given message with the given type tag.

◆ write_copy()

void vg::io::MessageEmitter::write_copy ( const string &  tag,
const string &  message 
)

Emit a copy of the given message with the given type tag. Use this when you have a message that you can't move.
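
The difference between the moving and copying overloads can be sketched with a stand-in class. ToyMessageSink is a hypothetical name that only stores messages, and having write_copy() forward to write() is an assumed relationship, not something this page confirms.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in that only stores messages; not the vg implementation.
class ToyMessageSink {
public:
    // Takes over the caller's buffer without copying its bytes.
    void write(const std::string& tag, std::string&& message) {
        assert(!tag.empty()); // empty tags are prohibited
        stored.push_back(std::move(message));
    }

    // Makes a copy first, then forwards it to the moving overload.
    void write_copy(const std::string& tag, const std::string& message) {
        write(tag, std::string(message));
    }

    std::vector<std::string> stored;
};

// Store one message by copy and one by move; return how many were stored.
inline std::size_t write_copy_demo(const std::string& kept) {
    ToyMessageSink sink;
    sink.write_copy("T", kept);          // `kept` is const, so it cannot be moved
    std::string scratch = "temporary";
    sink.write("T", std::move(scratch)); // `scratch` must not be relied on afterwards
    return sink.stored.size();
}
```

The copying form suits messages that are const or still needed by the caller; the moving form avoids a copy when the caller is finished with the buffer.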

Member Data Documentation

◆ bgzip_out

unique_ptr<BlockedGzipOutputStream> vg::io::MessageEmitter::bgzip_out
private

This holds the BGZF output stream, if we are writing BGZF. Since Protobuf streams can't be copied or moved, we wrap ours in a unique_ptr so we can be moved.
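
The effect of holding a non-movable stream through a smart pointer can be checked at compile time. In this minimal sketch PinnedStream and StreamOwner are hypothetical types standing in for the stream and for the class that owns it.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stream type that can be neither copied nor moved.
struct PinnedStream {
    PinnedStream() = default;
    PinnedStream(const PinnedStream&) = delete;
    PinnedStream& operator=(const PinnedStream&) = delete;
    PinnedStream(PinnedStream&&) = delete;
    PinnedStream& operator=(PinnedStream&&) = delete;
};

// Owner holding the stream through a unique_ptr: moving the owner transfers
// only the pointer, so the stream object itself never moves.
class StreamOwner {
public:
    StreamOwner() : stream(new PinnedStream()) {}
    StreamOwner(const StreamOwner&) = delete;
    StreamOwner& operator=(const StreamOwner&) = delete;
    StreamOwner(StreamOwner&&) = default;
    StreamOwner& operator=(StreamOwner&&) = default;

    bool has_stream() const { return stream != nullptr; }
    PinnedStream* get() const { return stream.get(); }

private:
    std::unique_ptr<PinnedStream> stream;
};

static_assert(!std::is_move_constructible<PinnedStream>::value, "the stream is pinned");
static_assert(std::is_move_constructible<StreamOwner>::value, "the owner can be moved");
static_assert(!std::is_copy_constructible<StreamOwner>::value, "the owner cannot be copied");

// Move an owner and report whether the stream stayed at the same address.
inline bool move_keeps_stream_in_place() {
    StreamOwner a;
    PinnedStream* address = a.get();
    StreamOwner b(std::move(a));
    assert(!a.has_stream()); // a moved-from unique_ptr is null
    return b.get() == address;
}
```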

◆ group

vector<string> vg::io::MessageEmitter::group
private

This is our internal buffer.

◆ group_handlers

vector<group_listener_t> vg::io::MessageEmitter::group_handlers
private

If someone wants to listen in on emitted groups, they can register a handler.

◆ group_tag

string vg::io::MessageEmitter::group_tag
private

This is our internal tag string for what is in our buffer. If it is empty, no group is buffered, because empty tags are prohibited.

◆ max_group_size

size_t vg::io::MessageEmitter::max_group_size
private

This is how big we let it get before we dump it.

◆ MAX_MESSAGE_SIZE

const size_t vg::io::MessageEmitter::MAX_MESSAGE_SIZE = 1000000000
static

We refuse to serialize individual messages longer than this size.

◆ uncompressed_out

unique_ptr<google::protobuf::io::OstreamOutputStream> vg::io::MessageEmitter::uncompressed_out
private

This holds the non-BGZF output stream, if we aren't writing compressed data.

◆ uncompressed_out_ostream

ostream* vg::io::MessageEmitter::uncompressed_out_ostream
private

When writing uncompressed data, we can't flush the OstreamOutputStream's internal buffer. Instead we have to destroy and recreate it, which is why we keep the ostream pointer.

◆ uncompressed_out_written

size_t vg::io::MessageEmitter::uncompressed_out_written
private

We need to track the total bytes written by previous OstreamOutputStreams.
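
The destroy-and-recreate approach and the running byte total can be sketched together. Both classes below are hypothetical stand-ins: ToyBufferedAdaptor imitates an adaptor that only hands its bytes to the ostream when destroyed, and ToyUncompressedWriter shows one way an owner might keep offsets continuous across replacements. Neither reflects the actual Protobuf or vg code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical adaptor that passes its buffered bytes on only when destroyed.
class ToyBufferedAdaptor {
public:
    explicit ToyBufferedAdaptor(std::ostream* out) : out(out) {
        assert(out != nullptr);
    }
    ~ToyBufferedAdaptor() {
        out->write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    }
    ToyBufferedAdaptor(const ToyBufferedAdaptor&) = delete;
    ToyBufferedAdaptor& operator=(const ToyBufferedAdaptor&) = delete;

    void append(const std::string& bytes) { buffer += bytes; }

    // Bytes accepted by this instance only; a new instance starts again from 0.
    std::size_t byte_count() const { return buffer.size(); }

private:
    std::ostream* out;
    std::string buffer;
};

// Hypothetical owner that replaces the adaptor in order to flush it.
class ToyUncompressedWriter {
public:
    explicit ToyUncompressedWriter(std::ostream& out)
        : out_stream(&out), adaptor(new ToyBufferedAdaptor(&out)), written(0) {}

    void append(const std::string& bytes) { adaptor->append(bytes); }

    // Record the adaptor's byte count, destroy it so its bytes reach the
    // ostream, then create a fresh adaptor on the same ostream.
    void flush() {
        written += adaptor->byte_count();
        adaptor.reset();
        adaptor.reset(new ToyBufferedAdaptor(out_stream));
        out_stream->flush();
    }

    // Current offset = bytes from earlier adaptors + bytes in the current one.
    std::size_t tell() const { return written + adaptor->byte_count(); }

private:
    std::ostream* out_stream;                    // kept so the adaptor can be recreated
    std::unique_ptr<ToyBufferedAdaptor> adaptor; // current adaptor
    std::size_t written;                         // total from previous adaptors
};
```

Without the running total, the offset reported after a flush would restart at zero, because each new adaptor counts only its own bytes.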


The documentation for this class was generated from the following files: