Network Publications - Sending Messages

Initial State

A newly created Publication will have some counters in cnc.dat and a Publication log buffer, which looks like this:

[Diagram: publications/23.logbuffer with Term 0, Term 1 and Term 2, tail counters TC0, TC1 and TC2, and metadata (isConnected: false, activeTermCount: 0, initialTermId: 1005). cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

The Terms are empty (zeroed).

In the metadata section, isConnected is false, activeTermCount is 0, and initialTermId is set to a random number. The metadata contains the tail counters TC0, TC1 and TC2, but they're also shown in the diagram next to where they point to. Each tail counter has the TermId in the top 32 bits and the TermOffset in the low 32 bits, so TC0 will start at 0x3ed00000000 (TermId 1005, TermOffset 0).
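The packing described above can be sketched in a few lines. This is a minimal illustration rather than Aeron's own code: the class and method names are made up for the example, and only the bit layout (TermId in the top 32 bits, TermOffset in the low 32 bits) comes from the text.

```java
// Sketch only: class and method names are illustrative, not Aeron's API.
final class TailCounterSketch
{
    // Pack a term id (top 32 bits) and a term offset (low 32 bits) into one 64-bit value.
    static long pack(final int termId, final int termOffset)
    {
        return ((long)termId << 32) | (termOffset & 0xFFFF_FFFFL);
    }

    // Recover the term id from the top 32 bits.
    static int termId(final long rawTail)
    {
        return (int)(rawTail >>> 32);
    }

    // Recover the term offset from the low 32 bits.
    static int termOffset(final long rawTail)
    {
        return (int)rawTail;
    }

    public static void main(final String[] args)
    {
        final long tc0 = pack(1005, 0);
        System.out.println("0x" + Long.toHexString(tc0)); // prints 0x3ed00000000
    }
}
```

Keeping both fields in one 64-bit value means a single atomic update can change the term and the offset together.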

  • pub-pos is the publisher's position in the Publication log buffer, i.e. after the last Frame written to the log buffer. This is in absolute number of bytes (starting at zero). It is recorded in the cnc.dat file for info, but is not used as input into anything. It is easier to see it with all the other counters than to have to dig around in the log buffer to find the current tail counter.
  • pub-lmt is the limit (max position) that the publisher can write to. If publishing a new message would take pub-pos beyond pub-lmt, the message would not be added and BACK_PRESSURED returned instead.
  • snd-pos is the Sender's position, which is the position the Sender has sent up to (in absolute number of bytes).
  • snd-lmt is the limit that the Sender can send up to, which is controlled by the space available on the receiving side. This is updated based on information in SMs. If sending more data would take snd-pos beyond snd-lmt, the Sender would not send, to avoid overwhelming the Receiver, but would increment snd-bpe instead.
  • snd-bpe (Sender back-pressure events) is a count of how many times the Sender couldn't send because there wasn't enough space on the Receiver. It is recorded for monitoring only - it doesn't input into anything. Whenever snd-bpe is incremented for a Publication, the system-wide SENDER_FLOW_CONTROL_LIMITS counter is also incremented - this provides a single counter that can be checked before digging into Publication-specific counters.
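The position counters above are absolute byte counts, whereas the tail counters hold a TermId and a TermOffset. A rough sketch of how one maps to the other is shown here. The method name, the parameter list and the 64KB term length are assumptions made for the example.

```java
// Sketch only: names are illustrative and the term length is a hypothetical value.
final class PositionSketch
{
    // Absolute position = whole terms completed since initialTermId, plus the offset in the current term.
    static long computePosition(
        final int termId, final int termOffset, final int termLength, final int initialTermId)
    {
        final long termCount = termId - initialTermId;
        return (termCount * termLength) + termOffset;
    }

    public static void main(final String[] args)
    {
        final int termLength = 64 * 1024; // hypothetical term length
        System.out.println(computePosition(1005, 0, termLength, 1005));   // prints 0
        System.out.println(computePosition(1006, 128, termLength, 1005)); // prints 65664
    }
}
```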

Significant Events

There are four significant events that can affect the log buffer:

  • the application attempts to publish a message
  • the Sender attempts to send some data
  • the Sender receives a Status Message and updates some counters
  • the Driver Conductor updates some counters and / or cleans part of the log buffer

The application, Sender and Driver Conductor usually run on different threads, so these could occur in any order, or even at the same time.

If the application attempted to publish a message at this stage, it would fail because pub-pos is already at pub-lmt. The publish call would return NOT_CONNECTED because isConnected is false. If the Publication were connected, it would return BACK_PRESSURED instead.
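The decision just described can be modelled as a small function. This is a sketch under assumptions: the negative sentinel values are illustrative stand-ins for the result codes named above, and the limit check is assumed to compare the current position against pub-lmt before anything is written.

```java
// Sketch only: a toy model of the publish decision, not the Publication class itself.
final class OfferOutcomeSketch
{
    static final long NOT_CONNECTED = -1;  // illustrative sentinel value
    static final long BACK_PRESSURED = -2; // illustrative sentinel value

    // Returns the new pub-pos on success, or a negative result code when at the limit.
    static long tryPublish(
        final long pubPos, final long pubLmt, final int alignedLength, final boolean isConnected)
    {
        if (pubPos >= pubLmt)
        {
            return isConnected ? BACK_PRESSURED : NOT_CONNECTED;
        }

        return pubPos + alignedLength;
    }

    public static void main(final String[] args)
    {
        System.out.println(tryPublish(0, 0, 64, false)); // prints -1
        System.out.println(tryPublish(0, 0, 64, true));  // prints -2
    }
}
```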

The only thing that can affect the log buffer at this stage is receiving an SM.

Sender Receives a Status Message

Most data travels from the Sender to the Receiver. The Receiver occasionally responds with a Status Message (SM), which provides flow control and back-pressure. An SM is not sent to ack each individual packet. An SM is also sent in response to a SETUP message, to establish the connection.

A Status Message contains the minimum Subscriber position (slowest), which I'll refer to as min(sub-pos). It also contains receiverWindowLength, which, unless overridden, is Configuration.INITIAL_WINDOW_LENGTH_DEFAULT (128KB), or half a term length, whichever is smaller.

When the Sender receives an SM, it sets isConnected to true. It puts min(sub-pos) and receiverWindowLength into a flow control algorithm (e.g. UnicastFlowControl) and sets snd-lmt to the result. In other words, if a Subscription on the receiving side falls behind, it affects how much the Sender can send.
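As a rough sketch, the window length and a unicast-style limit calculation might look like this. The class and method names are invented for the example, and the rule that snd-lmt never moves backwards is an assumption of this sketch, not something stated above.

```java
// Sketch only: a simplified model of the flow control step, not Aeron's UnicastFlowControl.
final class FlowControlSketch
{
    static final int INITIAL_WINDOW_LENGTH_DEFAULT = 128 * 1024;

    // Receiver window: the default window or half a term, whichever is smaller.
    static int receiverWindowLength(final int termLength)
    {
        return Math.min(INITIAL_WINDOW_LENGTH_DEFAULT, termLength / 2);
    }

    // New snd-lmt after an SM. Assumption: the limit is never moved backwards.
    static long onStatusMessage(final long currentSndLmt, final long minSubPos, final int receiverWindowLength)
    {
        return Math.max(currentSndLmt, minSubPos + receiverWindowLength);
    }

    public static void main(final String[] args)
    {
        final int window = receiverWindowLength(64 * 1024); // hypothetical 64KB term
        System.out.println(onStatusMessage(0, 0, window));  // prints 32768
    }
}
```

With min(sub-pos) at zero, the first SM simply opens snd-lmt to one receiver window.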

We're not tracking Subscriber counters on this diagram as they live on the receiving side, but min(sub-pos) will be zero at this point, because we haven't sent any messages yet.

[Diagram: the same log buffer with isConnected: true. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos, snd-lmt and min(sub-pos), with rcvWindowLength marked against snd-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

The Publication is now connected. The SM has updated snd-lmt, so the Sender has some space to send, but the application still can't publish messages, because pub-lmt is still zero. If it tried, the call would now return BACK_PRESSURED because isConnected is true. The Driver Conductor needs to run to update pub-lmt.

Driver Conductor updates Publisher Counters

When the Driver Conductor runs, it asks each NetworkPublication to update its pub-pos and pub-lmt counters.

It sets pub-pos to the current tail counter TC0. At this point nothing changes, because the application hasn't published anything yet and pub-pos already matches TC0. In general, the application advances the tail counters as it publishes messages and the Driver Conductor updates pub-pos to the latest tail counter in the background.

[Diagram: the same log buffer with isConnected: true. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos and snd-lmt, with termWindowLength marked against pub-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

Now that isConnected is true, the NetworkPublication also sets pub-lmt to snd-pos + termWindowLength (half a term). As the Sender's position advances, so does the publisher's limit. Now the application can publish messages from pub-pos to pub-lmt and the Sender can send messages from snd-pos to snd-lmt.
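The pub-lmt update can be sketched as follows. The names are made up for the example, and two behaviours are assumptions of the sketch: the limit is left unchanged while not connected, and it is never moved backwards.

```java
// Sketch only: a simplified model of the Driver Conductor's pub-lmt update.
final class PublisherLimitSketch
{
    // The term window is half a term.
    static int termWindowLength(final int termLength)
    {
        return termLength / 2;
    }

    // Assumptions: unchanged while not connected, and never moved backwards.
    static long updatePublisherLimit(
        final long currentPubLmt, final long sndPos, final int termWindowLength, final boolean isConnected)
    {
        if (!isConnected)
        {
            return currentPubLmt;
        }

        return Math.max(currentPubLmt, sndPos + termWindowLength);
    }

    public static void main(final String[] args)
    {
        final int window = termWindowLength(64 * 1024);                // hypothetical 64KB term
        System.out.println(updatePublisherLimit(0, 0, window, true));  // prints 32768
    }
}
```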

The Driver Conductor also cleans old messages, but we'll describe that later, once there are some.

Application Publishes Messages

When the application calls into the Publication to publish a message, the Publication bumps the tail counter to reserve space for the message, before writing it. If the client was using a ConcurrentPublication, there could be multiple threads writing to the log buffer, so they bump the tail counter using a CAS operation. The tail counter is used to coordinate multiple writers.

Once the tail position has been advanced, the space is reserved and another thread can advance the tail position again to make room for another message. The actual writing of the messages happens next and can occur in parallel, so Publishers don't wait for each other to write messages; they just wait for the CAS operation.
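Here is a minimal sketch of reserving space by bumping a shared tail with a CAS loop. It uses a plain AtomicLong in place of the tail counter in the log buffer metadata, and the class name is invented for the example. The real implementation may well use a different atomic primitive.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: an AtomicLong stands in for the tail counter held in the log buffer metadata.
final class TailReservationSketch
{
    private final AtomicLong tail = new AtomicLong();

    // Reserve alignedLength bytes and return the offset at which this writer may write.
    long reserve(final int alignedLength)
    {
        long current;
        do
        {
            current = tail.get();
        }
        while (!tail.compareAndSet(current, current + alignedLength));

        return current;
    }

    long tail()
    {
        return tail.get();
    }

    public static void main(final String[] args)
    {
        final TailReservationSketch sketch = new TailReservationSketch();
        System.out.println(sketch.reserve(64));  // prints 0
        System.out.println(sketch.reserve(128)); // prints 64
    }
}
```

Each caller gets a distinct, non-overlapping range, so writers only contend on the counter and never on the message bytes.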

[Diagram: the same log buffer after the application has published messages into Term 0, with TC0 advanced. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos and snd-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

Once the Publication has bumped the tail counter by the length of the message (taking into account fragmentation, the DATA frame header, etc.), it writes each fragment in turn. For each fragment, it writes the frame header with a negative frame length, then the fragment body, then rewrites the length in the header with the positive frame length.

The consumer of the log buffer (in this case, the Sender) looks for a positive frame length as the signal that a frame is completely written and is ready to be read. The consumer cannot use the tail counter (or pub-pos), as that indicates how much space has been reserved, not whether the messages have been written yet.
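The negative-then-positive length protocol can be sketched with a ByteBuffer standing in for a Term. The method names and the 32-byte header length are assumptions of the example, the other header fields are omitted, and plain ByteBuffer writes give none of the memory ordering guarantees a real multi-threaded implementation would need.

```java
import java.nio.ByteBuffer;

// Sketch only: single-threaded model of the frame write protocol; other header fields are omitted.
final class FrameCommitSketch
{
    static final int HEADER_LENGTH = 32; // assumed frame header length

    // Step 1: write the frame length as a negative value to mark the frame as in progress.
    static void claim(final ByteBuffer term, final int frameOffset, final int payloadLength)
    {
        term.putInt(frameOffset, -(HEADER_LENGTH + payloadLength));
    }

    // Step 2: write the body after the header.
    static void writePayload(final ByteBuffer term, final int frameOffset, final byte[] payload)
    {
        for (int i = 0; i < payload.length; i++)
        {
            term.put(frameOffset + HEADER_LENGTH + i, payload[i]);
        }
    }

    // Step 3: rewrite the length as a positive value to publish the frame to the consumer.
    static void commit(final ByteBuffer term, final int frameOffset)
    {
        term.putInt(frameOffset, Math.abs(term.getInt(frameOffset)));
    }

    // The consumer treats only a positive length as a complete frame.
    static boolean isReadable(final ByteBuffer term, final int frameOffset)
    {
        return term.getInt(frameOffset) > 0;
    }
}
```

Between claim and commit, isReadable returns false, which is the window in which the consumer must not read the frame.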

The format of messages written into the log buffer is described in more detail on the Log Buffers page.

Blocked messages, offer() and tryClaim()

Why does the fragment length get written as a negative value first? Why not use zero?

The Publication API classes provide two methods that can be used by an application to publish a message:

  • offer() - the application writes a message to a buffer, then passes the buffer in to offer(), which copies it into the log buffer, requiring a copy operation
  • tryClaim() - the application passes the message length into tryClaim(), which reserves that amount of space in the log buffer, writes a Frame header with the negative length, then returns a BufferClaim. The BufferClaim wraps the area in the log buffer where the message needs writing. The application writes the message into the log buffer via the BufferClaim, then calls a commit() method, which rewrites the Frame length to be positive. This provides zero-copy semantics.

The risk with tryClaim() is that the application could fail to write the message and not call commit(), leaving it incomplete. This is known as a blocked message. If a message has been blocked for over 15 seconds, the Driver Conductor attempts to unblock it by replacing it with a PAD frame of the same length. This is what the negative length is used for. PAD frames are ignored by Subscriptions. If a message is unblocked, the UNBLOCKED_PUBLICATIONS system counter is incremented.
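A sketch of the unblock step follows. The offset of the type field (6) and the PAD type code (0) are assumptions about the frame header layout made for this example, and the method name and the blockedForNs parameter are hypothetical. The point is only that the negative length tells the unblocker how large the replacement PAD frame must be.

```java
import java.nio.ByteBuffer;

// Sketch only: field offset and type code are assumptions about the frame header layout.
final class UnblockSketch
{
    static final int TYPE_FIELD_OFFSET = 6;                 // assumed offset of the 16-bit frame type
    static final short HDR_TYPE_PAD = 0x00;                 // assumed PAD type code
    static final long UNBLOCK_TIMEOUT_NS = 15_000_000_000L; // 15 seconds

    // Replace an uncommitted frame with a PAD frame of the same length once it has timed out.
    static boolean unblock(final ByteBuffer term, final int frameOffset, final long blockedForNs)
    {
        final int frameLength = term.getInt(frameOffset);
        if (frameLength >= 0 || blockedForNs <= UNBLOCK_TIMEOUT_NS)
        {
            return false; // nothing claimed here, already committed, or not blocked for long enough
        }

        term.putShort(frameOffset + TYPE_FIELD_OFFSET, HDR_TYPE_PAD);
        term.putInt(frameOffset, -frameLength); // positive length makes the PAD frame readable
        return true;
    }
}
```

If zero were written instead of a negative length, the unblocker would have no way of knowing how many bytes had been reserved.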

It is unlikely that offer() would fail in the same way, as the application writes the message to a temporary buffer before calling offer(). All offer() has to do is copy the bytes into the log buffer. However, offer() still writes the negative length in the header first, so both methods operate in the same way.

Note that with offer(), the Publication can fragment long messages. With tryClaim(), the onus is on the application not to write messages longer than MTU - Frame header length.
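The size limit can be worked through with some simple arithmetic. This sketch assumes a 32-byte Frame header and uses an MTU of 1408 bytes purely as an example value. The method names are invented.

```java
// Sketch only: header length and the example MTU are assumptions.
final class FragmentationSketch
{
    static final int HEADER_LENGTH = 32; // assumed frame header length

    // Largest body that fits in a single frame.
    static int maxPayloadLength(final int mtu)
    {
        return mtu - HEADER_LENGTH;
    }

    // Number of fragments offer() would need for a message of this length.
    static int fragmentCount(final int messageLength, final int mtu)
    {
        final int max = maxPayloadLength(mtu);
        return Math.max(1, (messageLength + max - 1) / max);
    }

    // tryClaim() cannot fragment, so the message must fit in one frame.
    static boolean fitsTryClaim(final int messageLength, final int mtu)
    {
        return messageLength <= maxPayloadLength(mtu);
    }

    public static void main(final String[] args)
    {
        final int mtu = 1408; // example value
        System.out.println(maxPayloadLength(mtu));    // prints 1376
        System.out.println(fragmentCount(3000, mtu)); // prints 3
    }
}
```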

Sender Sends Data

When the Sender runs, it looks for Frames starting at snd-pos. If there is a Frame there, it will start with a Frame header and the first 32 bits will be the Frame length. The Sender reads the 32 bits at snd-pos and, if there is a positive Frame length present, it skips forward that number of bytes (after rounding it up to the 32-byte Frame alignment) and repeats the process, like following a linked list of Frame lengths.

The Sender stops when it reaches a limit, which is one of: no more Frames (zero or negative Frame length), the end of the term, the sender limit (snd-lmt) or the MTU length. If it found some Frames, it sends them in a single UDP packet, then updates snd-pos. Let's say it sent 2 of the 3 messages, because they wouldn't all fit in the MTU.
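The scan can be sketched as a loop over Frame lengths. A ByteBuffer stands in for the Term, the names are made up for the example, and PAD frames and heartbeats are left out. The caller is assumed to pass maxLength as the smaller of the MTU and snd-lmt minus snd-pos.

```java
import java.nio.ByteBuffer;

// Sketch only: a simplified scan for complete frames; PAD handling is omitted.
final class SenderScanSketch
{
    static final int FRAME_ALIGNMENT = 32;

    // Round a frame length up to the next multiple of the frame alignment.
    static int align(final int length)
    {
        return (length + (FRAME_ALIGNMENT - 1)) & ~(FRAME_ALIGNMENT - 1);
    }

    // Returns how many bytes of whole, committed frames can be sent starting at offset.
    static int scan(final ByteBuffer term, final int offset, final int maxLength)
    {
        final int limit = Math.min(maxLength, term.capacity() - offset); // also stops at the end of the term
        int available = 0;

        while (available < limit)
        {
            final int frameLength = term.getInt(offset + available);
            if (frameLength <= 0)
            {
                break; // nothing written yet, or a frame that is still in progress
            }

            final int alignedLength = align(frameLength);
            if (available + alignedLength > limit)
            {
                break; // the next whole frame would not fit
            }

            available += alignedLength;
        }

        return available;
    }
}
```

With three 100-byte frames (128 bytes each once aligned) and a maxLength of 300, the scan returns 256, matching the "2 of the 3 messages" case above.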

[Diagram: the same log buffer after the Sender has sent two of the three messages, with snd-pos advanced. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos and snd-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

The Sender always sends whole Frames, which means every packet will start with a Frame header. The Frame header acts as a network header - the Receiver can read the Frame header to find out which Image the packet is for and where to insert it in the Image.

If the Sender fails to send the complete packet, it increments a SHORT_SENDS system counter.

If there are no Frames to send, the Sender can send a heartbeat message so the Receiver knows it's still there.

Driver Conductor updates Publisher Counters

This is the same as before, but the Publication has written 3 messages and the Sender has sent 2 messages since the Driver Conductor last ran, so TC0 and snd-pos have advanced. The Driver Conductor sets pub-pos to TC0, and sets pub-lmt to snd-pos + termWindowLength.

[Diagram: the same log buffer after the Driver Conductor has run, with pub-pos and pub-lmt advanced and termWindowLength marked against pub-lmt. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos and snd-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

Sender receives another SM

At some point, the Receiver will send another SM containing a new min(sub-pos) and receiverWindowLength. The Sender uses it to advance snd-lmt again. Let's say there was a Subscriber that had only read one of the two messages. The slow Subscriber causes snd-lmt to not advance by as much, restricting how much can be sent.

[Diagram: the same log buffer after the second SM, with snd-lmt advanced. Markers shown: TC0, TC1, TC2, pub-pos, pub-lmt, snd-pos, snd-lmt and min(sub-pos), with rcvWindowLength marked against snd-lmt. cnc.dat counters shown: pub-pos, pub-lmt, snd-pos, snd-lmt, snd-bpe.]

And so on

The four significant events mentioned earlier continue to happen, and it's essentially just more of the same. The only action that hasn't been covered is cleaning (zeroing) of old messages in the log buffer.

Cleaning

The Driver Conductor cleans old messages in the buffer by overwriting them with zeros. This prepares the buffer for reuse when the Terms rotate and this Term is used again.

Network Publications clean up to snd-pos - termBufferLength. There is no need to clean up to a well-defined position like the start of a Frame, but this does mean if you ever have to look at a log buffer, the first non-zero bytes are unlikely to be a Frame header. Cleaning up to termBufferLength behind snd-pos allows retransmits of data up to a whole Term length in the past.
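A rough sketch of incremental cleaning is shown below. Three byte arrays stand in for the Terms, and the class, the field names and the rule of cleaning at most to the end of the current Term in one pass are assumptions of the example.

```java
import java.util.Arrays;

// Sketch only: byte arrays stand in for the three Terms of the log buffer.
final class CleanerSketch
{
    static final int PARTITION_COUNT = 3;

    final byte[][] terms;
    final int termLength;
    long cleanPosition = 0;

    CleanerSketch(final int termLength)
    {
        this.termLength = termLength;
        this.terms = new byte[PARTITION_COUNT][termLength];
    }

    // Zero from cleanPosition towards sndPos - termLength, at most to the end of the current Term.
    void cleanTo(final long sndPos)
    {
        final long target = sndPos - termLength;
        if (target > cleanPosition)
        {
            final int termIndex = (int)((cleanPosition / termLength) % PARTITION_COUNT);
            final int termOffset = (int)(cleanPosition % termLength);
            final int length = (int)Math.min(target - cleanPosition, termLength - termOffset);

            Arrays.fill(terms[termIndex], termOffset, termOffset + length, (byte)0);
            cleanPosition += length;
        }
    }
}
```

Because the target is an arbitrary byte position, the boundary between zeroed and non-zeroed bytes can fall in the middle of a Frame.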

The Terms are not cleaned a whole Term at a time, as the "wear one, wash one, dry one" saying would suggest. That may have been the case once, but they are now cleaned with smaller, more frequent cleans. If you ever have to debug an issue and look at a log buffer file, this explains why a lot of the data will be zeros.