Aeron Archive - Detailed Overview
Let's say you've created a simple trading application. On one machine (Machine 1), you have a Market Data Collector, which collects market data from various sources, converts it into a normalised form and publishes it via Aeron Transport. On another machine (Machine 2), you run a Trading Strategy that subscribes to the market data and decides how to trade.
This is all regular Aeron Transport. The Market Data Collector uses a Publication API object to write messages to a Publication log buffer. The Sender sends these messages to Machine 2. On Machine 2, the Receiver receives them and writes them to an Image log buffer, creating a copy of the Publication log buffer. The Trading Strategy has a Subscription API object, which it uses to read messages from the Image.
Aeron Archive
Aeron Archive adds the ability to record a Publication, so a permanent copy of the messages is stored on disk, and the ability to replay those messages later. Being able to replay the market data from our market data feed would enable several things, such as:
- catchup - if you restart or deploy a new version of the Trading Strategy while the Market Data Collector continues running, you could replay live data that was missed while the Trading Strategy was down. Once up to date, the Trading Strategy could rejoin the live stream (this is known as replay merge)
- issue investigation - the ability to replay live data in a development environment, to reproduce and step through a scenario that the live Trading Strategy went through
- A/B testing - the ability to test a new version of the Trading Strategy against the same market data that the live Trading Strategy processed, to evaluate whether the new version performs better
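The catchup case can be pictured with a small in-memory model. This is a conceptual sketch only, not Aeron's API and not how Aeron implements replay merge: it assumes every message carries its stream position, drains the replay first, then continues from the live stream while skipping anything already processed. The names `CatchUpConsumer`, `Msg` and `catchUpThenGoLive` are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of catching up from a replay and then joining the live stream. Not Aeron API. */
final class CatchUpConsumer {
    /** A message tagged with its stream position (hypothetical type). */
    static final class Msg {
        final long position;
        final String payload;

        Msg(long position, String payload) {
            this.position = position;
            this.payload = payload;
        }
    }

    private long lastPosition = -1; // highest position processed so far
    private final List<String> processed = new ArrayList<>();

    /** Process a message once; anything at or below lastPosition was already seen. */
    void onMessage(Msg msg) {
        if (msg.position <= lastPosition) {
            return; // duplicate: already delivered by the replay
        }
        processed.add(msg.payload);
        lastPosition = msg.position;
    }

    /** Drain the replay first, then switch to the live stream. */
    static List<String> catchUpThenGoLive(List<Msg> replay, List<Msg> live) {
        CatchUpConsumer consumer = new CatchUpConsumer();
        replay.forEach(consumer::onMessage);
        live.forEach(consumer::onMessage);
        return consumer.processed;
    }

    public static void main(String[] args) {
        List<Msg> replay = List.of(new Msg(32, "A"), new Msg(64, "B"), new Msg(96, "C"));
        // The live stream overlaps the replay at C, then carries on with new data.
        List<Msg> live = List.of(new Msg(96, "C"), new Msg(128, "D"));
        System.out.println(catchUpThenGoLive(replay, live)); // prints [A, B, C, D]
    }
}
```

The position check is what makes the overlap between the end of the replay and the start of the live stream harmless in this model.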
Aeron Archive replays a recorded Publication by publishing the recorded messages over a new Publication, which is referred to as a Replay Publication. The receiving application receives the data by subscribing to the replay Channel and StreamId, rather than the live Channel and StreamId.
Aeron Archive can record a Publication at the publishing end, or at the receiving end. In our example, this means it would need to run on Machine 1 or Machine 2. It cannot run on another machine, because it would not have access to the messages.
Aeron Archive application
Aeron Archive is an application that uses Aeron Transport to communicate. In that respect, it is no different from any application that you might write on top of Aeron Transport: it does not have any privileged access to Aeron Transport internals.
It listens on a Control Request Channel (a normal Subscription) that other applications can publish requests to, such as 'start recording' or 'replay recording'. To record a Publication, it creates a Subscription to it, in the same way that your application might create a Subscription. For example, in the trading application above, Aeron Archive could run next to the Trading Strategy and create a Subscription to the live market data channel, in exactly the same way that the Trading Strategy does. It could then save messages received on the Subscription to disk.
Aeron Archive on the Receiving End
Recording an IPC Publication and recording on the receiving end of a Network Publication both work in the same way, which looks like this:
The Aeron Archive application runs on the same machine as an IPC Publication, or on the machine on the receiving end of a Network Publication. In our trading application, it would run on Machine 2. You can see from the diagram that it contains an Aeron Transport Client API to use Aeron Transport, just like the Trading Strategy.
Something sends a message to Aeron Archive on the Control Request Channel, asking it to record the Publication (not shown). That something could be the Trading Strategy, for example. Aeron Archive subscribes to the Publication, just like the Trading Strategy. This means its Subscription reads from the same Image log buffer that the Trading Strategy's Subscription reads from.
Any messages it receives are written to a Recording.
If the Trading Strategy wanted to replay messages from the Recording, it would create a Subscription to a Channel (let's call it the Replay Market Data Channel, which would likely be an IPC Channel), then instruct Aeron Archive (on the Control Request Channel, not shown) to publish the messages to that Channel. Aeron Archive would read messages from the Recording and publish them. The Trading Strategy would then have a Subscription for the live data and a Subscription for the replay data.
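The record-then-replay flow can be sketched with a toy model. Nothing here is Aeron API: `Bus` is a hypothetical stand-in for the transport, `Archive` is a stand-in for Aeron Archive, and the channel names are made up. The point it illustrates is that the archive is just another subscriber on the live channel, and that a replay is just another publication on a different channel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Toy in-memory model of record and replay. Not Aeron API. */
final class ReplayModel {
    /** Stand-in for the transport: delivers each message to every subscriber of a channel. */
    static final class Bus {
        private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

        void subscribe(String channel, Consumer<String> handler) {
            subscribers.computeIfAbsent(channel, k -> new ArrayList<>()).add(handler);
        }

        void publish(String channel, String message) {
            for (Consumer<String> handler : subscribers.getOrDefault(channel, List.of())) {
                handler.accept(message);
            }
        }
    }

    /** Stand-in for the archive: an ordinary subscriber that keeps what it receives. */
    static final class Archive {
        private final Bus bus;
        private final List<String> recording = new ArrayList<>();

        Archive(Bus bus) {
            this.bus = bus;
        }

        /** Subscribe to the live channel like any other application and store each message. */
        void startRecording(String channel) {
            bus.subscribe(channel, recording::add);
        }

        /** Publish a snapshot of the recording to the replay channel. */
        void replay(String replayChannel) {
            for (String message : List.copyOf(recording)) {
                bus.publish(replayChannel, message);
            }
        }
    }

    public static void main(String[] args) {
        Bus bus = new Bus();
        Archive archive = new Archive(bus);
        archive.startRecording("live-market-data");

        bus.publish("live-market-data", "tick-1");
        bus.publish("live-market-data", "tick-2");

        // The strategy subscribes to the replay channel first, then asks for the replay.
        List<String> replayed = new ArrayList<>();
        bus.subscribe("replay-market-data", replayed::add);
        archive.replay("replay-market-data");

        System.out.println(replayed); // prints [tick-1, tick-2]
    }
}
```

In this model, as in the description above, the subscriber has to exist before the replay is requested, otherwise the replayed messages have nowhere to go.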
Aeron Archive on the Sending End
Recording on the sending end of a Network Publication is very similar, but it uses a Spy Subscription. Normally, the only thing consuming messages from a Network Publication's log buffer is the Sender, which sends them over the network. A Spy Subscription can be used to 'spy' on the messages as they pass through the Publication log buffer. It looks like this:
Aeron Archive runs on Machine 1.
Something sends a message to Aeron Archive on the Control Request Channel, asking it to record the Publication (not shown). That something could be the Market Data Collector, for example. Aeron Archive subscribes to the market data Publication using a Spy Subscription. A Spy Subscription behaves like an IPC Subscription in that it reads messages directly from the Publication log buffer.
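In Aeron's channel URI syntax, a Spy Subscription is requested by prefixing the Publication's channel with `aeron-spy:`. The sketch below only builds the string: the `SpyChannel` class and its `spyOf` helper are hypothetical names for this illustration, and the endpoint is an assumed example address.

```java
/** Builds a spy channel URI from a publication channel URI. Helper names are hypothetical. */
final class SpyChannel {
    /** The prefix that marks a channel as a spy on a local network Publication. */
    static final String SPY_PREFIX = "aeron-spy:";

    /** Prefix the channel unless it is already a spy channel. */
    static String spyOf(String publicationChannel) {
        if (publicationChannel.startsWith(SPY_PREFIX)) {
            return publicationChannel;
        }
        return SPY_PREFIX + publicationChannel;
    }

    public static void main(String[] args) {
        String live = "aeron:udp?endpoint=machine2:40123"; // assumed endpoint
        System.out.println(spyOf(live)); // prints aeron-spy:aeron:udp?endpoint=machine2:40123
    }
}
```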
When the Market Data Collector publishes a message, it writes it to the Publication log buffer. The Sender watches for new messages and sends them to the receiving machine. In addition, Aeron Archive polls the Spy Subscription and receives the same messages, which it writes to a Recording. The Sender and the Spy Subscription run independently and have their own counters for their position in the log buffer.
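The independence of the two readers can be illustrated with a toy model. This is not Aeron's log buffer implementation: it is a hypothetical append-only list with two readers, each holding its own position, so that one reader advancing has no effect on the other. The names `LogWithTwoReaders` and `Reader` are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model: one append-only log, readers with independent positions. Not Aeron's log buffer. */
final class LogWithTwoReaders {
    private final List<String> log = new ArrayList<>();

    /** A reader that remembers how far it has read. */
    final class Reader {
        private int position = 0;

        /** Return up to limit unread messages and advance this reader only. */
        List<String> poll(int limit) {
            int end = Math.min(log.size(), position + limit);
            List<String> batch = new ArrayList<>(log.subList(position, end));
            position = end;
            return batch;
        }

        int position() {
            return position;
        }
    }

    void append(String message) {
        log.add(message);
    }

    Reader newReader() {
        return new Reader();
    }

    public static void main(String[] args) {
        LogWithTwoReaders publicationLog = new LogWithTwoReaders();
        Reader sender = publicationLog.newReader(); // stands in for the Sender
        Reader spy = publicationLog.newReader();    // stands in for the Spy Subscription

        publicationLog.append("m1");
        publicationLog.append("m2");
        publicationLog.append("m3");

        System.out.println(sender.poll(10)); // prints [m1, m2, m3]
        System.out.println(spy.poll(1));     // prints [m1]
        System.out.println("sender=" + sender.position() + " spy=" + spy.position()); // prints sender=3 spy=1
    }
}
```

Here the Sender has consumed everything while the spy is still two messages behind, and neither has affected the other's position.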
If the Trading Strategy wanted to replay messages from the Recording, it would create a Subscription to a Channel (let's call it the Replay Market Data Channel, which would have to be a UDP Channel, because Aeron Archive is on Machine 1 and the Trading Strategy is on Machine 2), then instruct Aeron Archive (on the Control Request Channel, not shown) to publish the messages to that Channel. Aeron Archive would read messages from the Recording and publish them. The Trading Strategy would then have a Subscription for the live data and a Subscription for the replay data.