vg
tools for working with variation graphs
|
#include <message_emitter.hpp>
Public Types | |
using | group_listener_t = function< void(const string &, int64_t, int64_t)> |
Public Member Functions | |
MessageEmitter (ostream &out, bool compress=false, size_t max_group_size=1000) | |
~MessageEmitter () | |
Destructor that finishes the file. More... | |
MessageEmitter (const MessageEmitter &other)=delete | |
MessageEmitter & | operator= (const MessageEmitter &other)=delete |
MessageEmitter (MessageEmitter &&other)=default | |
MessageEmitter & | operator= (MessageEmitter &&other)=default |
void | write (const string &tag) |
void | write (const string &tag, string &&message) |
Emit the given message with the given type tag. More... | |
void | write_copy (const string &tag, const string &message) |
void | on_group (group_listener_t &&listener) |
void | emit_group () |
void | flush () |
Static Public Attributes | |
const static size_t | MAX_MESSAGE_SIZE = 1000000000 |
We refuse to serialize individual messages longer than this size. More... | |
Private Attributes | |
string | group_tag |
vector< string > | group |
This is our internal buffer. More... | |
size_t | max_group_size |
This is how big we let it get before we dump it. More... | |
unique_ptr< BlockedGzipOutputStream > | bgzip_out |
unique_ptr< google::protobuf::io::OstreamOutputStream > | uncompressed_out |
This holds the non-BGZF output stream, if we aren't writing compressed data. More... | |
ostream * | uncompressed_out_ostream |
size_t | uncompressed_out_written |
We need to track the total bytes written by previous OstreamOutputStreams. More... | |
vector< group_listener_t > | group_handlers |
If someone wants to listen in on emitted groups, they can register a handler. More... | |
Class that wraps an output stream and allows emitting groups of binary messages to it, with internal buffering. Handles finishing the file on its own, and allows tracking of (possibly virtual) offsets within a non-seekable stream (as long as the entire stream is controlled by one instance). Cannot be copied, but can be moved.
Each group consists of:
This format is designed to be syntactically the same as the old untagged VG Protobuf format, to allow easy sniffing/reading of old files.
Can call callbacks with the groups emitted and their virtual offsets, for indexing purposes.
Note that the callbacks may be called by the object's destructor, so anything they reference needs to outlive the object.
Not thread-safe. May be more efficient than repeated write/write_buffered calls because a single BGZF stream can be used.
Can write either compressed or uncompressed but framed data.
using vg::io::MessageEmitter::group_listener_t = function<void(const string&, int64_t, int64_t)> |
Define a type for group emission event listeners. Arguments are: type tag, start virtual offset, and past-end virtual offset.
vg::io::MessageEmitter::MessageEmitter | ( | ostream & | out, |
bool | compress = false , |
||
size_t | max_group_size = 1000 |
||
) |
Constructor. Write output to the given stream. If compress is true, compress it as BGZF. Limit the maximum number of messages in a group to max_group_size.
If not compressing, virtual offsets are just ordinary offsets.
vg::io::MessageEmitter::~MessageEmitter | ( | ) |
Destructor that finishes the file.
|
delete |
|
default |
void vg::io::MessageEmitter::emit_group | ( | ) |
Actually write out everything in the buffer. Doesn't actually flush the underlying streams to disk. Assumes that no more than one group's worth of messages are in the buffer.
void vg::io::MessageEmitter::flush | ( | ) |
Write out anything in the buffer, and flush the backing streams. After this has been called, a full BGZF block will be in the backing stream (passed to the constructor), but the backing stream won't necessarily be flushed.
void vg::io::MessageEmitter::on_group | ( | group_listener_t && | listener | ) |
Add an event listener that listens for emitted groups. The listener will be called with the type tag, the start virtual offset, and the past-end virtual offset. Moves the function passed in. Anything the function uses by reference must outlive this object!
|
delete |
|
default |
void vg::io::MessageEmitter::write | ( | const string & | tag | ) |
Ensure that a (possibly empty) group is emitted for the given tag. Will coalesce with previous or subsequent write calls for the same tag. Note that empty tags are prohibited.
void vg::io::MessageEmitter::write | ( | const string & | tag, |
string && | message | ||
) |
Emit the given message with the given type tag.
void vg::io::MessageEmitter::write_copy | ( | const string & | tag, |
const string & | message | ||
) |
Emit a copy of the given message with the given type tag. To use when you have something you can't move.
|
private |
This holds the BGZF output stream, if we are writing BGZF. Since Protobuf streams can't be copied or moved, we wrap ours in a uniqueptr_t so we can be moved.
|
private |
This is our internal buffer.
|
private |
If someone wants to listen in on emitted groups, they can register a handler.
|
private |
This is our internal tag string for what is in our buffer. If it is empty, no group is buffered, because empty tags are prohibited.
|
private |
This is how big we let it get before we dump it.
|
static |
We refuse to serialize individual messages longer than this size.
|
private |
This holds the non-BGZF output stream, if we aren't writing compressed data.
|
private |
When writing uncompressed data, we can't flush the OstreamOutputStream's internal buffer. So we need to destroy and recreate it, so we need the ostream pointer.
|
private |
We need to track the total bytes written by previous OstreamOutputStreams.