Message Handling
When the ConsensusModuleAgent receives a message on the Consensus channel from another member, if there is an Election, it forwards the message to the Election object.
This section describes how each of those messages is handled. The handling is described separately from the election state descriptions, because some messages can be received in several states, or even when they are not expected.
The handlers are all described below:
- onCanvassPosition
- onRequestVote
- onVote
- onNewLeadershipTerm
- onAppendPosition
- onCommitPosition
- onReplayNewLeadershipTermEvent
- onCatchupPosition
- onStopCatchup
onCanvassPosition
Handled by Election.onCanvassPosition(), which is passed parameters from a CanvassPosition.
When a member receives a CanvassPosition from another member, it:
- records the other member's logLeadershipTermId and logPosition in its ClusterMember object
- if the current member is the leader and the other member's logLeadershipTermId is less than the current member's leadershipTermId, the current member sends a NewLeadershipTerm message to the other member
- if the other member's logLeadershipTermId is greater than the current member's leadershipTermId and the current member is in the LEADER_LOG_REPLICATION or LEADER_READY state, an exception is thrown, as this means a new election has been started during the current election
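The three steps above can be sketched as a small decision function. This is a minimal illustration only: the class, enum and method names are hypothetical and are not taken from Aeron's source, and the real handler does more than is shown here.

```java
// Hypothetical sketch of the CanvassPosition handling described above.
// None of these types are Aeron's; they only mirror the prose.
final class CanvassPositionSketch {
    enum State { CANVASS, LEADER_LOG_REPLICATION, LEADER_READY, FOLLOWER_REPLAY }

    enum Action { RECORD_ONLY, SEND_NEW_LEADERSHIP_TERM }

    /** Minimal stand-in for the per-member bookkeeping object. */
    static final class Member {
        long logLeadershipTermId = -1L;
        long logPosition = -1L;
    }

    static Action onCanvassPosition(
        State state, boolean isLeader, long leadershipTermId,
        Member sender, long logLeadershipTermId, long logPosition) {
        // Step 1: always record what the sender reported.
        sender.logLeadershipTermId = logLeadershipTermId;
        sender.logPosition = logPosition;

        // Step 2: a leader tells a lagging member about the leadership term.
        if (isLeader && logLeadershipTermId < leadershipTermId) {
            return Action.SEND_NEW_LEADERSHIP_TERM;
        }

        // Step 3: a sender that is ahead of a leader-in-waiting means another election started.
        if (logLeadershipTermId > leadershipTermId
            && (state == State.LEADER_LOG_REPLICATION || state == State.LEADER_READY)) {
            throw new IllegalStateException("new election started during the current election");
        }

        return Action.RECORD_ONLY;
    }
}
```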
onRequestVote
Handled by Election.onRequestVote(), which is passed parameters from a RequestVote.
When a member receives a RequestVote message...
If the candidateTermId in the message is less than or equal to its own candidateTermId, then:
- the member returns a negative Vote message
otherwise, if this member has more recent entries in its Log than those in the request (this member's leadership term is higher, or it is equal and its log position is higher):
- the member returns a negative Vote message (it won't vote for a member that doesn't have the latest entries in its Log)
- if this member is the leader, then it publishes a NewLeadershipTerm message to the sender, which should then become a follower
otherwise, if the current election state is CANVASS, NOMINATE, CANDIDATE_BALLOT or FOLLOWER_BALLOT, then:
- it updates the candidateTermId in its node-state.dat file with the one from the message
- returns a positive Vote message to the sender
- moves to FOLLOWER_BALLOT
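The voting rules above form an ordered chain of checks, which the following sketch makes explicit. It is a simplified illustration under assumptions: the type and method names are hypothetical, the result is returned as an enum rather than sent as messages, and the NOT_DESCRIBED outcome marks the case this section does not cover (a valid request arriving in any other state).

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the RequestVote decision order; not Aeron's actual code.
final class RequestVoteSketch {
    enum State { INIT, CANVASS, NOMINATE, CANDIDATE_BALLOT, FOLLOWER_BALLOT, LEADER_READY }

    enum Outcome {
        NEGATIVE_VOTE,
        NEGATIVE_VOTE_AND_NEW_LEADERSHIP_TERM,
        POSITIVE_VOTE_AND_FOLLOWER_BALLOT,
        NOT_DESCRIBED
    }

    private static final Set<State> VOTING_STATES = EnumSet.of(
        State.CANVASS, State.NOMINATE, State.CANDIDATE_BALLOT, State.FOLLOWER_BALLOT);

    static Outcome onRequestVote(
        State state, boolean isLeader,
        long ownCandidateTermId, long ownLogLeadershipTermId, long ownLogPosition,
        long candidateTermId, long logLeadershipTermId, long logPosition) {
        // A stale or repeated candidate term is always rejected.
        if (candidateTermId <= ownCandidateTermId) {
            return Outcome.NEGATIVE_VOTE;
        }

        // Never vote for a candidate whose Log is behind this member's Log.
        boolean ownLogIsNewer = ownLogLeadershipTermId > logLeadershipTermId
            || (ownLogLeadershipTermId == logLeadershipTermId && ownLogPosition > logPosition);
        if (ownLogIsNewer) {
            return isLeader ? Outcome.NEGATIVE_VOTE_AND_NEW_LEADERSHIP_TERM : Outcome.NEGATIVE_VOTE;
        }

        // Persist the candidate term, vote yes and become a follower in the ballot.
        if (VOTING_STATES.contains(state)) {
            return Outcome.POSITIVE_VOTE_AND_FOLLOWER_BALLOT;
        }

        return Outcome.NOT_DESCRIBED;
    }
}
```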
onVote
Handled by Election.onVote(), which is passed parameters from a Vote.
If all the following are true:
- the member is in CANDIDATE_BALLOT (waiting for votes)
- the candidateTermId in the message matches the current member's (the vote is in the same leadership term as this member's RequestVote)
- the candidateMemberId in the message (the member being voted for) is the current member
then it records the vote (positive or negative) in the ClusterMember object for the sender.
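As a rough illustration of that filter, the sketch below records a vote only when all three conditions hold. The class, field and parameter names are hypothetical, and a plain map stands in for the per-sender ClusterMember objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the Vote filter described above; not Aeron's actual code.
final class VoteSketch {
    enum State { CANVASS, CANDIDATE_BALLOT, FOLLOWER_BALLOT }

    final int thisMemberId;
    final long candidateTermId;
    State state;
    // Stand-in for the vote stored on each sender's ClusterMember: senderMemberId -> vote.
    final Map<Integer, Boolean> voteBySender = new HashMap<>();

    VoteSketch(int thisMemberId, long candidateTermId, State state) {
        this.thisMemberId = thisMemberId;
        this.candidateTermId = candidateTermId;
        this.state = state;
    }

    /** Returns true if the vote was recorded, false if it was ignored. */
    boolean onVote(long msgCandidateTermId, int candidateMemberId, int senderMemberId, boolean vote) {
        if (state == State.CANDIDATE_BALLOT
            && msgCandidateTermId == candidateTermId
            && candidateMemberId == thisMemberId) {
            voteBySender.put(senderMemberId, vote); // positive and negative votes are both kept
            return true;
        }
        return false;
    }
}
```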
onNewLeadershipTerm
Handled by Election.onNewLeadershipTerm(), which is passed parameters from a NewLeadershipTerm.
The message is only processed if the member is either:
- in FOLLOWER_BALLOT or CANDIDATE_BALLOT, and the message is for the same leadership term, i.e. the member was waiting on the outcome of a vote, which the leader is now announcing
- in CANVASS, i.e. the member has probably just started, so this message tells it which member is the leader and gives it information about the current leadership term
The NewLeadershipTerm message contains a logLeadershipTermId field. This is the leadership term of the last Log entry in the leader's Log Recording. Some of the other fields in the message relate to that leadership term, so it needs to match the follower's logLeadershipTermId. If it doesn't, the follower returns to the CANVASS state, where it will send another CanvassPosition to the leader. That contains the follower's logLeadershipTermId, which the leader will reply to specifically with another NewLeadershipTerm with a matching logLeadershipTermId. The new message tells the follower about their next leadership term, and the current leadership term.
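The state filter and the logLeadershipTermId check can be summarised in a short sketch. It rests on assumptions: all names are hypothetical, and "the same leadership term" is modelled as the message's leadershipTermId being equal to the member's candidateTermId, which this section does not state explicitly.

```java
// Hypothetical sketch of the initial NewLeadershipTerm checks; not Aeron's actual code.
final class NewLeadershipTermGateSketch {
    enum State { INIT, CANVASS, NOMINATE, CANDIDATE_BALLOT, FOLLOWER_BALLOT, FOLLOWER_REPLAY }

    enum Outcome { IGNORE, RETURN_TO_CANVASS, PROCESS }

    static Outcome gate(
        State state, long ownCandidateTermId, long ownLogLeadershipTermId,
        long msgLeadershipTermId, long msgLogLeadershipTermId) {
        // Assumption: "same leadership term" means the term this member voted in or stood for.
        boolean awaitingBallotResult =
            (state == State.FOLLOWER_BALLOT || state == State.CANDIDATE_BALLOT)
                && msgLeadershipTermId == ownCandidateTermId;
        boolean canvassing = state == State.CANVASS;

        if (!awaitingBallotResult && !canvassing) {
            return Outcome.IGNORE;
        }

        // The other fields only make sense if both sides refer to the same log leadership term.
        if (msgLogLeadershipTermId != ownLogLeadershipTermId) {
            return Outcome.RETURN_TO_CANVASS; // the next CanvassPosition prompts a matching reply
        }

        return Outcome.PROCESS;
    }
}
```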
After performing the above checks (and a few more), the follower proceeds with processing the message...
todo: wall of text needs a diagram
The message contains a nextTermBaseLogPosition, which is the start of the next leadership term after the one referred to by logLeadershipTermId. The next leadership term should be empty on the follower. If it is not, it will contain uncommitted entries (see Raft - Other Log Examples, figures 7d and 7f, for examples of how this might occur). The uncommitted entries conflict with entries in the leader's Log and need removing. The follower goes into onTruncateLogEntry(), which truncates the Log and throws an Exception. This is caught in the ConsensusModuleAgent's duty cycle, which calls handleError() on the Election object. That moves the Election back to the INIT state, where it will start again with a truncated Log.
Next, the follower records a lot of values from the message. An important one is catchupJoinPosition. If the logPosition in the message (on the leader) is higher than the follower's appendPosition, this means the leader has entries in its Log (in the current leadership term, as logLeadershipTermId matches) that are not on the follower. This will be because the leader has already started appending messages to the Log, without this follower's involvement. The follower needs to perform Catchup to back-fill its Log with the missing entries from the current leadership term. catchupJoinPosition is set to logPosition, and will be used later to check whether Catchup is required, and if so, as a test to see whether the follower has nearly caught up.
Next, the follower checks whether its appendPosition is before the start of the current leadership term (termBaseLogPosition in the message). If it is, it needs to perform Replication to back-fill missing Log entries from one or more leadership terms before the current one. It either needs to replicate the rest of its current leadership term, if it stopped part-way through, or it needs to replicate the next leadership term, if it stopped at the end (replication is done one leadership term at a time). The follower records the appropriate log positions for replication, then moves to FOLLOWER_LOG_REPLICATION.
However, if the follower's appendPosition is not before the start of the current leadership term, then Replication is not required. The follower moves to FOLLOWER_REPLAY.
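The position checks in the preceding paragraphs can be condensed into one decision function. The sketch below is illustrative only: the names are hypothetical, the replication log positions are not computed, and the NO_POSITION sentinel (used both for "the message names no next term" and "no Catchup needed") is an assumption of the sketch rather than something stated in this section.

```java
// Hypothetical sketch of the follower's position checks; not Aeron's actual code.
final class NewLeadershipTermPositionsSketch {
    enum Next { TRUNCATE_AND_RESTART, FOLLOWER_LOG_REPLICATION, FOLLOWER_REPLAY }

    // Assumed sentinel meaning "no position".
    static final long NO_POSITION = -1L;

    static final class Decision {
        final Next next;
        final long catchupJoinPosition;

        Decision(Next next, long catchupJoinPosition) {
            this.next = next;
            this.catchupJoinPosition = catchupJoinPosition;
        }
    }

    static Decision decide(
        long appendPosition, long termBaseLogPosition, long logPosition, long nextTermBaseLogPosition) {
        // Entries beyond the start of the next term are uncommitted and conflict with the leader.
        if (nextTermBaseLogPosition != NO_POSITION && appendPosition > nextTermBaseLogPosition) {
            return new Decision(Next.TRUNCATE_AND_RESTART, NO_POSITION);
        }

        // The leader is ahead in the current term, so Catchup will be needed later.
        long catchupJoinPosition = logPosition > appendPosition ? logPosition : NO_POSITION;

        // Missing whole or partial earlier terms are back-filled by Replication first.
        Next next = appendPosition < termBaseLogPosition
            ? Next.FOLLOWER_LOG_REPLICATION
            : Next.FOLLOWER_REPLAY;

        return new Decision(next, catchupJoinPosition);
    }
}
```

For example, a follower whose appendPosition is 100 with a termBaseLogPosition of 100 and a leader logPosition of 250 goes to FOLLOWER_REPLAY with a catchupJoinPosition of 250.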
onAppendPosition
Handled by Election.onAppendPosition(), which is passed parameters from an AppendPosition.
There is an assumption here that only followers will send an AppendPosition message to the new leader.
Ignored if the member is in the INIT state.
If the leadershipTermId in the message is less than or equal to this member's leadershipTermId, then it records the sender's leadershipTermId and appendPosition in its ClusterMember. It then calls consensusModuleAgent.trackCatchupCompletion() to ...
- if the follower has a catchup replay session
- and the follower's logPosition matches the leader's
- and the follower has a catchup replay correlation id
- then the current member (leader) asks its Archive to stop replaying to the follower's catchup replay session
- it then sends a StopCatchup message to the follower
- when the follower receives the StopCatchup message, it removes its catchupLogDestination from the logAdapter
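A compact way to read the handler and the catch-up completion test is as a single predicate over the follower's recorded state. The sketch below is an assumption-laden illustration: every name is hypothetical, -1 is used as an assumed "not set" marker, and "matches" is taken literally as equality of the two positions.

```java
// Hypothetical sketch of AppendPosition handling on the leader; not Aeron's actual code.
final class AppendPositionSketch {
    enum State { INIT, LEADER_LOG_REPLICATION, LEADER_REPLAY, LEADER_READY }

    // Assumed marker for "no session" / "no correlation id" / "unknown".
    static final long NONE = -1L;

    /** Minimal stand-in for the follower's ClusterMember entry held by the leader. */
    static final class Follower {
        long leadershipTermId = NONE;
        long logPosition = NONE;
        long catchupReplaySessionId = NONE;
        long catchupReplayCorrelationId = NONE;
    }

    /** Returns true when the leader should stop the replay and send StopCatchup. */
    static boolean onAppendPosition(
        State state, long ownLeadershipTermId, long leaderLogPosition,
        Follower follower, long msgLeadershipTermId, long msgAppendPosition) {
        if (state == State.INIT || msgLeadershipTermId > ownLeadershipTermId) {
            return false; // ignored
        }

        follower.leadershipTermId = msgLeadershipTermId;
        follower.logPosition = msgAppendPosition;

        // Catch-up is complete once a replaying follower has reached the leader's position.
        return follower.catchupReplaySessionId != NONE
            && follower.catchupReplayCorrelationId != NONE
            && follower.logPosition == leaderLogPosition;
    }
}
```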
onCommitPosition
Handled by Election.onCommitPosition(), which is passed parameters from a CommitPosition.
Sent from the leader to the follower members.
Ignored if the member is in the INIT state.
- if the member is in FOLLOWER_CATCHUP
    - and the message is for the same leadershipTermId
    - and the follower has a catchupJoinPosition
    - and the message is from the expected leader
    - then it updates catchupCommitPosition to the logPosition
- if the member is in FOLLOWER_LOG_REPLICATION
    - and the message is from the expected leader
    - then replicationCommitPosition is set to the logPosition (if it is higher)
    - and the replicationDeadlineNs is reset to now plus leaderHeartbeatTimeoutNs (10 seconds)
- if the member is in LEADER_READY
    - and the leadershipTermId is higher than what the member was expecting
    - then an exception is thrown because a new leader has been detected
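The three state-dependent branches map naturally onto a switch. The following sketch is a simplified illustration with hypothetical names; it assumes -1 marks an unset position and that "what the member was expecting" is the member's own leadershipTermId.

```java
// Hypothetical sketch of CommitPosition handling during an Election; not Aeron's actual code.
final class CommitPositionSketch {
    enum State { INIT, FOLLOWER_CATCHUP, FOLLOWER_LOG_REPLICATION, LEADER_READY }

    static final long NONE = -1L; // assumed marker for "not set"

    State state;
    final long leadershipTermId;
    final int leaderMemberId;
    final long leaderHeartbeatTimeoutNs;
    long catchupJoinPosition = NONE;
    long catchupCommitPosition = NONE;
    long replicationCommitPosition = NONE;
    long replicationDeadlineNs;

    CommitPositionSketch(State state, long leadershipTermId, int leaderMemberId, long leaderHeartbeatTimeoutNs) {
        this.state = state;
        this.leadershipTermId = leadershipTermId;
        this.leaderMemberId = leaderMemberId;
        this.leaderHeartbeatTimeoutNs = leaderHeartbeatTimeoutNs;
    }

    void onCommitPosition(long msgLeadershipTermId, long logPosition, int msgLeaderMemberId, long nowNs) {
        switch (state) {
            case FOLLOWER_CATCHUP:
                if (msgLeadershipTermId == leadershipTermId
                    && catchupJoinPosition != NONE
                    && msgLeaderMemberId == leaderMemberId) {
                    catchupCommitPosition = logPosition;
                }
                break;

            case FOLLOWER_LOG_REPLICATION:
                if (msgLeaderMemberId == leaderMemberId) {
                    // Only ever move the commit position forwards.
                    replicationCommitPosition = Math.max(replicationCommitPosition, logPosition);
                    // Hearing from the leader extends the replication deadline.
                    replicationDeadlineNs = nowNs + leaderHeartbeatTimeoutNs;
                }
                break;

            case LEADER_READY:
                if (msgLeadershipTermId > leadershipTermId) {
                    throw new IllegalStateException("new leader detected");
                }
                break;

            default:
                break; // INIT: ignored
        }
    }
}
```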
onReplayNewLeadershipTermEvent
Handled by Election.onReplayNewLeadershipTermEvent(), which is passed parameters from a NewLeadershipTermEvent.
Note that there are 2 variants of the 'new leadership term' message:
- NewLeadershipTerm - Consensus message sent between members during an Election (or outside an Election if a member starts up while the others are in a leadership term)
- NewLeadershipTermEvent - Log message, written at the end of an Election in LEADER_READY
This message handler is for the Log version. The handler is invoked during an Election when the Log is being replayed. It indicates that there was an Election at that point in the Log, so a new leadership term is starting.
During an Election, the message is only processed in the FOLLOWER_CATCHUP and FOLLOWER_REPLAY states.
- it adds the new leadership term to the recording log
- it remembers the logPosition and sets its logLeadershipTermId to the leadershipTermId in the message
onCatchupPosition
Handled by ConsensusModuleAgent.onCatchupPosition(), which is passed parameters from a CatchupPosition. Note that this message is handled in the ConsensusModuleAgent, rather than the Election object, because it needs to be handled by the leader when it is not in an Election.
When the leader receives the CatchupPosition, it looks up the ClusterMember for the follower and checks that it's not already replaying.
Then it asks its Archive to start a Replay session, replaying the Log Recording to the Catchup endpoint in the message, from the logPosition in the message.
It then stores the Replay session's sessionId in the ClusterMember object for later use.
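The leader-side flow can be sketched as follows. The ReplayStarter interface is a hypothetical stand-in for the Archive call (it is not the Aeron Archive API), the other names are likewise invented for illustration, and -1 is an assumed marker for "not replaying".

```java
// Hypothetical sketch of CatchupPosition handling on the leader; not Aeron's actual code.
final class CatchupPositionSketch {
    /** Hypothetical stand-in for asking the Archive to replay the Log Recording. */
    interface ReplayStarter {
        long startReplay(long fromPosition, String catchupEndpoint);
    }

    static final long NONE = -1L; // assumed marker for "no replay session"

    /** Minimal stand-in for the follower's ClusterMember entry. */
    static final class Follower {
        long catchupReplaySessionId = NONE;
    }

    /** Returns true if a new replay was started for the follower. */
    static boolean onCatchupPosition(
        Follower follower, long logPosition, String catchupEndpoint, ReplayStarter archive) {
        // Unknown follower, or one that is already being replayed to: do nothing.
        if (follower == null || follower.catchupReplaySessionId != NONE) {
            return false;
        }

        // Replay from the follower's position to its Catchup endpoint and remember the session.
        follower.catchupReplaySessionId = archive.startReplay(logPosition, catchupEndpoint);
        return true;
    }
}
```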
onStopCatchup
Handled by ConsensusModuleAgent.onStopCatchup(), which is passed parameters from a StopCatchup. Note that this message is handled in the ConsensusModuleAgent, rather than the Election object. It doesn't need to interact with the Election directly.
StopCatchup is sent from the leader to a follower to tell the follower it has caught up to the live Log position. The follower removes its Catchup endpoint from its Log Subscription.