Aeron Transport - Detailed Overview
We said in the high-level Aeron Transport Overview that Aeron components communicate with each other via shared memory. Let's zoom in on the overview diagram and take a closer look at some actual (shared memory) files used by Aeron.
There are two kinds of Publication: Network and IPC. The kind determines what happens to messages after they are written into a Publication log buffer. Both kinds live in the Media Driver. In the diagram below, the Publications that exist in the Aeron Client are client-side Publication API objects. These exist purely to allow the application to write messages into the Publication log buffer in the first place. There are actually two kinds of those too: Concurrent and Exclusive, which differ in how they handle multi-threaded writing to the log buffer (we'll come back to this later).
Network Publications
The following diagram is for sending messages across machines, which uses Network Publications. The diagram shows all the components on one machine. To show both machines, the diagram would have to be twice the size (as per the overview diagram), so it has been collapsed for conciseness. The steps before the message is sent on the network happen on the sending machine and the steps after happen on the receiving machine.
Sending machine (application thread): the sending application sends a message by writing it to a shared memory buffer that has been dedicated for it to send messages to the receiving application. This is a Publication log buffer, which would have been created earlier, when the application asked Aeron (specifically, the Client Conductor) to create the Publication.
Sending machine (Sender thread): as this is a Network Publication, when it was created, the Sender component within the Media Driver would have been notified. The Sender monitors the Publication log buffer for new messages and sends them over the network to the receiving machine.
Receiving machine (Receiver thread): when the Subscription was created, the Receiver component within the Media Driver would have been notified. The Receiver polls the network for new UDP packets and reads them into a shared memory buffer known as an Image log buffer. The Image log buffer becomes a replica of the Publication log buffer on the sending machine.
Receiving machine (application thread): the receiving application polls the Image log buffer for new messages and reads the message. The message has now been delivered.
IPC Publications
IPC Publications are used to send messages between processes on the same machine. Here, the steps are simpler. The first step is the same, but the Sender and Receiver are not required because there is no need to replicate the Publication log buffer over the network to the receiving machine. The Subscription reads directly from the same Publication log buffer that the Publication writes to.
The sending application sends a message by writing it to a shared memory buffer that has been dedicated for it to send messages to the receiving application. This is a Publication log buffer, which would have been created earlier, when the application asked Aeron (specifically, the Client Conductor) to create the Publication.
The receiving application is on the same machine, so it polls the Publication log buffer directly, and reads the message. The message has now been delivered.
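The two IPC steps can be sketched with a toy model in plain Java. This is a minimal sketch, not Aeron's implementation: `ToyLogBuffer` and `ToySubscription` are hypothetical names, the length-prefixed layout is an assumption made for illustration, and an in-process `ByteBuffer` stands in for the shared memory file. The point it shows is that the subscriber reads from the very same buffer the publisher writes to, tracking only its own read position.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy log buffer of length-prefixed messages. NOT Aeron's real log buffer layout. */
final class ToyLogBuffer {
    final ByteBuffer buffer;
    int tailPosition = 0; // how far the publisher has written (single-threaded sketch)

    ToyLogBuffer(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    /** Publication side: append a message and return the new position, or -1 if it does not fit. */
    int offer(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        int required = Integer.BYTES + payload.length;
        if (tailPosition + required > buffer.capacity()) {
            return -1;
        }
        buffer.putInt(tailPosition, payload.length);
        for (int i = 0; i < payload.length; i++) {
            buffer.put(tailPosition + Integer.BYTES + i, payload[i]);
        }
        tailPosition += required;
        return tailPosition;
    }
}

/** Subscription side: reads directly from the buffer the publisher writes to. */
final class ToySubscription {
    private final ToyLogBuffer log;
    private int position = 0; // how far this subscriber has read

    ToySubscription(ToyLogBuffer log) {
        this.log = log;
    }

    /** Returns every message written since the last poll (possibly none). */
    List<String> poll() {
        List<String> messages = new ArrayList<>();
        while (position < log.tailPosition) {
            int length = log.buffer.getInt(position);
            byte[] payload = new byte[length];
            for (int i = 0; i < length; i++) {
                payload[i] = log.buffer.get(position + Integer.BYTES + i);
            }
            messages.add(new String(payload, StandardCharsets.UTF_8));
            position += Integer.BYTES + length;
        }
        return messages;
    }
}
```

A real log buffer is shared between processes and needs memory-ordering guarantees that this single-threaded sketch leaves out.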
Let's look at some of these components in more detail.
Client API
The Aeron Client API is Java code the application uses to interact with Aeron. The Client API contains the Client Conductor, which manages resources within the Client API and interacts with the Media Driver on behalf of the application. When Aeron starts, there are no Publication or Subscription API objects. The application calls into the Client API to create them, which the Client Conductor initiates by sending a message to the Media Driver. The application can create many Publications and Subscriptions, depending on how many different places it needs to send / receive messages.
The Publication and Subscription API objects are used by the application to interact with the underlying shared memory buffers for sending and receiving messages. The sending application writes messages to the Publication log buffer by calling methods on a Publication API object. The receiving application reads messages from the Publication / Image log buffer by calling methods on a Subscription API object.
cnc.dat
The cnc.dat file is the Command And Control file for the Media Driver and is central to Aeron's operation. It contains several sections. One section provides the public interface for sending commands to the Media Driver. Another section is for responses to those commands. Yet another section contains various counters that the Media Driver writes to, exposing information about its internal state.
Example
An example of a command is when an application wants to create a Publication. The application calls into the Aeron Client API, which causes the Client Conductor to write a command (message) into the cnc.dat file. When the Media Driver (specifically the Driver Conductor) next polls the command buffer within the cnc.dat file, it sees the message, processes it and writes a response message into the cnc.dat file, which the application receives.
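The command and response exchange can be sketched as a toy model. This is only an illustration under stated assumptions: the class names, the `ADD_PUBLICATION` and `PUBLICATION_READY` strings, and the use of in-process queues are all hypothetical stand-ins, not the real cnc.dat layout or message format. It shows the shape of the interaction: the client writes a command tagged with an id, the driver picks it up on its next poll, and the client later finds the matching response.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy stand-in for the command and response sections. NOT the real cnc.dat layout. */
final class ToyCnc {
    static final class Command {
        final long correlationId;
        final String type;
        final String channel;
        final int streamId;

        Command(long correlationId, String type, String channel, int streamId) {
            this.correlationId = correlationId;
            this.type = type;
            this.channel = channel;
            this.streamId = streamId;
        }
    }

    static final class Response {
        final long correlationId;
        final String status;

        Response(long correlationId, String status) {
            this.correlationId = correlationId;
            this.status = status;
        }
    }

    final Queue<Command> toDriver = new ArrayDeque<>();
    final Queue<Response> toClients = new ArrayDeque<>();
}

/** Client side: writes commands and looks for matching responses. */
final class ToyClientConductor {
    private final ToyCnc cnc;
    private long nextCorrelationId = 1;

    ToyClientConductor(ToyCnc cnc) {
        this.cnc = cnc;
    }

    /** Writes an "add publication" command and returns the id used to match the response. */
    long addPublication(String channel, int streamId) {
        long id = nextCorrelationId++;
        cnc.toDriver.add(new ToyCnc.Command(id, "ADD_PUBLICATION", channel, streamId));
        return id;
    }

    /** Returns the response status for the id, or null if the driver has not responded yet. */
    String pollResponse(long correlationId) {
        ToyCnc.Response head = cnc.toClients.peek();
        if (head != null && head.correlationId == correlationId) {
            cnc.toClients.poll();
            return head.status;
        }
        return null;
    }
}

/** Driver side: one pass of polling the command queue. */
final class ToyDriverConductor {
    private final ToyCnc cnc;

    ToyDriverConductor(ToyCnc cnc) {
        this.cnc = cnc;
    }

    /** Processes all pending commands and returns how many were handled. */
    int doWork() {
        int workCount = 0;
        ToyCnc.Command command;
        while ((command = cnc.toDriver.poll()) != null) {
            // A real driver would create the log buffer here before responding.
            cnc.toClients.add(new ToyCnc.Response(command.correlationId, "PUBLICATION_READY"));
            workCount++;
        }
        return workCount;
    }
}
```

Until `doWork()` runs, `pollResponse` returns null, which mirrors the point that the Driver Conductor only sees a command when it next polls the command buffer.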
Examples of the counters in the cnc.dat file are those for Publications and Subscriptions, of which there are several. For a Publication, there is a counter for the position in the Publication log buffer that the application has written messages up to. For a Subscription, there is a counter for the position in the log buffer that messages can be read up to. There are other counters for tracking the progress of the Sender and Receiver. If you are using Aeron Archive or Aeron Cluster, they add their own set of counters. Each counter is a 64-bit number.
Updating counters in shared memory is very efficient - the Media Driver just writes the value to a memory location. One advantage of the file being accessible via /dev/shm is that monitoring tools can open the same shared memory file and read the counters, which are updated in real-time, with no performance impact on the Media Driver. The Media Driver doesn't need a separate publishing mechanism to expose them - they are accessible for free.
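The idea of one process writing a counter and another reading it through a memory-mapped file can be sketched with the standard `FileChannel.map` API. This is a minimal sketch under assumptions: `ToyCounters` is a hypothetical name, the file is a flat array of 64-bit values (not the real cnc.dat layout), and a temporary file stands in for one under /dev/shm. The writer and the reader map the same file independently, as the Media Driver and a monitoring tool would.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Toy counters file: a flat array of 64-bit values. NOT the real cnc.dat layout. */
final class ToyCounters {
    /** Writer side (the driver in this analogy): maps the file read-write. */
    static MappedByteBuffer mapForWriting(Path file, int counterCount) {
        try (FileChannel channel =
                 FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping read-write beyond the current size grows the file.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) counterCount * Long.BYTES);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reader side (a monitoring tool in this analogy): maps the same file read-only. */
    static MappedByteBuffer mapForReading(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Updating a counter is a single write to a memory location. */
    static void set(MappedByteBuffer counters, int counterId, long value) {
        // Plain write: a real implementation needs ordered/volatile semantics across processes.
        counters.putLong(counterId * Long.BYTES, value);
    }

    static long get(MappedByteBuffer counters, int counterId) {
        return counters.getLong(counterId * Long.BYTES);
    }

    /** Demo: write through one mapping and read the value back through a separate mapping. */
    static long writeThenReadBack(int counterId, long value) {
        try {
            Path file = Files.createTempFile("toy-counters", ".dat");
            file.toFile().deleteOnExit();
            MappedByteBuffer writer = mapForWriting(file, 8);
            MappedByteBuffer reader = mapForReading(file);
            set(writer, counterId, value);
            return get(reader, counterId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The reader never calls into the writer, which is why exposing counters this way adds no work for the process that updates them.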
Publication log buffer files
The Publication log buffer files are where the application writes messages that it wants to send to a recipient. When a Publication is created, a log buffer file is created for it. There can be many Publications and Publication log buffer files, each for a different stream of messages.
Sender
There is only one Sender, which is created when the Media Driver is created. It polls each Network Publication log buffer and sends any new messages via UDP to the Receiver on the receiving machine. It is not used for IPC Publications.
Receiver
The Receiver is the Sender's counterpart. There is only one Receiver, which is created when the Media Driver is created. The Receiver is only used for Network Publications. It receives UDP packets from the Sender, finds the correct Image log buffer and inserts them into the correct place, creating a replica of the Sender's Publication log buffer. The Receiver handles messages arriving out of order, sending NAKs for missed messages, etc.
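Inserting packets "into the correct place" and spotting missed data can be sketched with a toy buffer. This is a minimal sketch, not Aeron's protocol: `ToyImage` is a hypothetical name, and it assumes each packet carries the byte offset it occupied in the sender's log. Packets are copied to that offset whatever order they arrive in, and a gap exists whenever the highest byte received is beyond the contiguous run from the start.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy Image buffer that places packets by offset. NOT Aeron's real protocol or layout. */
final class ToyImage {
    private final byte[] data;
    private final boolean[] received;

    ToyImage(int capacity) {
        data = new byte[capacity];
        received = new boolean[capacity];
    }

    /** Copies a packet's payload to the offset it had in the sender's log. */
    void onPacket(int offset, byte[] payload) {
        System.arraycopy(payload, 0, data, offset, payload.length);
        Arrays.fill(received, offset, offset + payload.length, true);
    }

    /** Number of bytes received without a hole, counted from the start. */
    int contiguousPosition() {
        int position = 0;
        while (position < received.length && received[position]) {
            position++;
        }
        return position;
    }

    /** One past the highest byte received so far. */
    int highWaterMark() {
        for (int i = received.length; i > 0; i--) {
            if (received[i - 1]) {
                return i;
            }
        }
        return 0;
    }

    /** True when data is missing below the high water mark, the case a NAK would ask to fill. */
    boolean hasGap() {
        return contiguousPosition() < highWaterMark();
    }

    /** The readable (contiguous) part of the replica as text. */
    String contents() {
        return new String(data, 0, contiguousPosition(), StandardCharsets.ISO_8859_1);
    }
}
```

For example, if the packet for offset 8 arrives before the packet for offset 4, `hasGap()` is true until the missing packet is inserted, after which the replica reads the same as the sender's data.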
Image log buffer files
Image log buffer files are only required for Network Publications. When the Receiver receives messages from the Sender, it assembles them in the correct order in an Image log buffer file. When it is up-to-date, the Image log buffer file will be identical to the Publication log buffer file on the sending machine (the data part will be identical, but there will be a few differences in the metadata within the file).
Driver Conductor
Like the Client Conductor in the Client API, the Driver Conductor manages the resources in the Media Driver. The Driver Conductor has a list of jobs that it loops around (its duty cycle). One of those is to check for and process any new commands sent to the Media Driver from the Client Conductor in the cnc.dat file.
Another job is to detect message loss in the Image log buffer files that the Receiver is assembling. If it detects loss that hasn't been resolved within a certain period, it tells the Receiver to send a NAK to the Sender, then records the loss in the loss-report.dat file.
loss-report.dat
The Driver Conductor records any message loss for Network Publications in the loss-report.dat file. This is an output file from Aeron - it is for operational monitoring and isn't read by Aeron.
Example files
Given that the shared memory buffers are mounted into the Linux filesystem, we can list them simply by listing the files in a subdirectory of /dev/shm. Here's an example set of files, containing multiple Publications and Image log buffers. Some of the Publication log buffers could be for IPC Publications and some for Network Publications.
cnc.dat
publications/18.logbuffer
publications/20.logbuffer
publications/22.logbuffer
publications/26.logbuffer
publications/33.logbuffer
publications/36.logbuffer
publications/38.logbuffer
publications/44.logbuffer
publications/52.logbuffer
publications/53.logbuffer
publications/73.logbuffer
publications/75.logbuffer
images/61.logbuffer
images/55.logbuffer
images/56.logbuffer
loss-report.dat
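A listing like the one above can be produced with a few lines of standard Java. This is a minimal sketch: `AeronDirListing` is a hypothetical helper, and the directory is passed in as an argument because the subdirectory name under /dev/shm depends on the setup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper that lists the files under a driver directory. */
final class AeronDirListing {
    /** Returns the relative paths of all regular files under the directory, sorted by name. */
    static List<String> listFiles(Path aeronDir) {
        try (Stream<Path> paths = Files.walk(aeronDir)) {
            return paths
                .filter(Files::isRegularFile)
                .map(path -> aeronDir.relativize(path).toString().replace('\\', '/'))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: AeronDirListing <directory>");
            return;
        }
        listFiles(Path.of(args[0])).forEach(System.out::println);
    }
}
```

Because the result is sorted by name, the order differs from the example above, but the same set of cnc.dat, publications/, images/ and loss-report.dat entries would appear.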