Follower Catchup

FOLLOWER_CATCHUP - catch up to the leader on the Catchup endpoint until the live Log endpoint can be added and merged

On Entry

Enters from FOLLOWER_CATCHUP_AWAIT when it is ready to receive 'catchup' Log messages from the leader's Archive. These will be received in the follower's Log log buffer, where they will be processed like normal live Log messages.

Description

Catchup is controlled by two variables:

  • catchupJoinPosition - set to the logPosition value from the last NewLeadershipTerm message, handled in onNewLeadershipTerm. That message was received before FOLLOWER_REPLAY was entered, so the value is the leader's Log position at that time. This position is ahead of the start of the current leadership term, which is what triggered Catchup in the first place.
  • catchupCommitPosition - updated to the logPosition of each CommitPosition message the follower receives from the leader while it is in this state.

This state starts by doing a catchupPoll() up to catchupCommitPosition, which does the following:

  • it polls the LogAdapter, which polls the Log Subscription, as it does for live Log messages. It polls it up to the minimum of the AppendPosition (how much it's received from the leader's Archive and written to its own Log Recording) and catchupCommitPosition. Polling the Log causes the ConsensusModuleAgent to process the Log messages (it just tracks client sessions)
  • it sends an AppendPosition message to the leader, with an APPEND_POSITION_FLAG_CATCHUP flag set. The leader uses this as a guard for checking whether Catchup has completed
  • it sets its commit-pos Counter, which the Clustered Service will use as a limit when polling the Log Subscription (remember that the Clustered Service will also be processing the Log in its own thread)
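
The three steps above can be sketched as follows. This is a minimal illustration, not the actual ConsensusModuleAgent code: the class, its fields and the string used to record the outgoing message are all hypothetical, and it assumes that the poll consumes everything available up to the limit and that the commit-pos Counter is set to the position the Log has been polled to (the description above does not spell out either detail).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the catchupPoll() steps; names are illustrative only.
final class CatchupPollSketch
{
    // Stand-in for APPEND_POSITION_FLAG_CATCHUP; the real value is not given in the text.
    static final int FLAG_CATCHUP = 1;

    long appendPosition;    // received from the leader's Archive and written to the local Log Recording
    long logPosition;       // how far the Log Subscription has been polled
    long commitPosCounter;  // stand-in for the commit-pos Counter read by the Clustered Service
    final List<String> sentToLeader = new ArrayList<>();

    void catchupPoll(final long catchupCommitPosition)
    {
        // Step 1: poll the Log up to min(AppendPosition, catchupCommitPosition).
        // Assumption: the poll consumes everything available up to that limit.
        final long limit = Math.min(appendPosition, catchupCommitPosition);
        logPosition = Math.max(logPosition, limit);

        // Step 2: report the AppendPosition to the leader with the catchup flag set.
        sentToLeader.add("AppendPosition(position=" + appendPosition + ", flags=" + FLAG_CATCHUP + ")");

        // Step 3: publish the limit the Clustered Service may poll up to.
        // Assumption: the counter follows the polled Log position.
        commitPosCounter = logPosition;
    }
}
```

Taking the minimum in step 1 means the follower never processes Log messages that it has not yet recorded locally, nor messages beyond the last commit position it has heard from the leader.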

After catchupPoll(), it checks whether the position it has read up to in the Log is 'near' the catchupCommitPosition, where 'near' means within 1/4 of a term length of the Log log buffer. If it is, it adds the live Log destination to the Log Subscription, which just makes the Receiver listen on the Log endpoint. The follower does not need to inform the leader: the leader's Log Publication added all followers as destinations when it was created, so it will connect automatically. todo: check this. This must be how it works, but I haven't verified it.
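
As a rough illustration of the 'near' test, the sketch below compares the polled Log position with the catchupCommitPosition using a window of a quarter of the term length. The class and method names are hypothetical, and the real check may apply further limits that are not described here.

```java
// Hypothetical sketch of the 'near' check; not the actual Aeron implementation.
final class LiveJoinSketch
{
    static boolean isNearCommitPosition(
        final long logPosition, final long catchupCommitPosition, final int termBufferLength)
    {
        // 'Near' is defined as within 1/4 of a term length.
        final long window = termBufferLength / 4;
        return logPosition >= catchupCommitPosition - window;
    }
}
```

For example, with a 64 MiB term length the window is 16 MiB, so the live Log destination would be added once the follower has read to within 16 MiB of the catchupCommitPosition.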

Finally, it checks whether its commit-pos Counter has reached or passed both the catchupJoinPosition and the catchupCommitPosition (the latter is a moving target), and that the Catchup endpoint has been removed. The Catchup endpoint is removed when the follower receives a StopCatchup message from the leader, described below (see also onStopCatchup). Once all of this is true, it moves to FOLLOWER_LOG_INIT.
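
The exit condition can be summarised as a single predicate. The sketch below is only a restatement of the paragraph above in code; the class name, method name and the boolean standing in for "the Catchup endpoint has been removed" are hypothetical.

```java
// Hypothetical sketch of the exit condition for FOLLOWER_CATCHUP.
final class CatchupExitSketch
{
    static boolean isCatchupComplete(
        final long commitPos,
        final long catchupJoinPosition,
        final long catchupCommitPosition,
        final boolean catchupEndpointRemoved)
    {
        // All three conditions must hold before moving to FOLLOWER_LOG_INIT.
        return commitPos >= catchupJoinPosition &&
            commitPos >= catchupCommitPosition &&
            catchupEndpointRemoved;
    }
}
```

Because catchupCommitPosition keeps advancing while the leader commits new messages, a follower that has passed catchupJoinPosition may still fail this check until it also catches the latest commit position and has processed StopCatchup.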

Leader

When the leader receives the AppendPosition message described above, it updates the logPosition in the ClusterMember for the follower, then calls trackCatchupCompletion(). That checks whether the follower is performing Catchup and whether the AppendPosition carries the APPEND_POSITION_FLAG_CATCHUP flag. If both are true, it checks whether the follower's logPosition has reached the logPosition written by the LogPublisher. If it has, the leader tells its Archive to stop the Replay session and sends a StopCatchup message to the follower.
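
The leader-side handling might look roughly like the sketch below. It is an illustration under stated assumptions, not the real code: the Follower class stands in for ClusterMember, the flag value is made up, and stopping the Archive Replay and sending StopCatchup are represented by setting booleans.

```java
// Hypothetical sketch of the leader's handling of a catchup AppendPosition.
final class LeaderCatchupSketch
{
    // Stand-in for APPEND_POSITION_FLAG_CATCHUP; the real value is not given in the text.
    static final int FLAG_CATCHUP = 1;

    // Stand-in for the leader's ClusterMember entry for one follower.
    static final class Follower
    {
        long logPosition;
        boolean catchupReplayActive; // true while the Archive Replay for this follower runs
        boolean stopCatchupSent;
    }

    static void onAppendPosition(
        final Follower follower, final long logPosition, final int flags, final long leaderLogPosition)
    {
        follower.logPosition = logPosition;
        trackCatchupCompletion(follower, flags, leaderLogPosition);
    }

    static void trackCatchupCompletion(final Follower follower, final int flags, final long leaderLogPosition)
    {
        final boolean isCatchupMessage = (flags & FLAG_CATCHUP) != 0;

        if (follower.catchupReplayActive && isCatchupMessage && follower.logPosition >= leaderLogPosition)
        {
            follower.catchupReplayActive = false; // stands in for stopping the Archive Replay session
            follower.stopCatchupSent = true;      // stands in for sending StopCatchup to the follower
        }
    }
}
```

The flag acts as the guard mentioned earlier: an AppendPosition without it is treated as a normal live update and cannot end Catchup, even if the reported position has reached the leader's.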

On Exit

Catchup is complete. The follower received enough Log messages via its Catchup endpoint that it reached the live Log position. It has removed the Catchup endpoint from its Log Subscription and is receiving live Log messages via the Log endpoint, as it normally would. The Archive Replay on the leader has been stopped.

The member moves to FOLLOWER_LOG_INIT.