Canvass
CANVASS - exchange append positions with other members, then assess whether this member is eligible to become a candidate
On Entry
The member enters CANVASS from INIT if it is in a multi-node cluster, or from another state if it needs to restart the Election.
Description
While in this state, the member publishes a CanvassPosition message every 100 ms to the other members, which tells them:
- the member is online
- it's in the CANVASS state
- how much it has in its Log Recording - the leadership term and position of its last entry
The interesting fields in the message are:
- logPosition - the position of the last entry in the Log Recording (the append position)
- logLeadershipTermId - the leadership term of the last entry in the Log Recording
- leadershipTermId - the current leadership term of the sender (which isn't actually used - what matters is the state on disk)
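As a rough illustration of the message contents and the 100 ms publish interval, the following Java sketch models the fields as a plain value class. All class and method names here (`CanvassPositionSketch`, `CanvassPublisherSketch`, `shouldPublish`) are hypothetical, as is the `memberId` field; this is not Aeron's wire format or its sending code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical value class holding the fields described above.
final class CanvassPositionSketch
{
    final long logLeadershipTermId; // leadership term of the last entry in the Log Recording
    final long logPosition;         // position of the last entry (the append position)
    final long leadershipTermId;    // sender's current term (carried, but not used in decisions)
    final int memberId;             // assumption: some identifier for the sender

    CanvassPositionSketch(
        final long logLeadershipTermId,
        final long logPosition,
        final long leadershipTermId,
        final int memberId)
    {
        this.logLeadershipTermId = logLeadershipTermId;
        this.logPosition = logPosition;
        this.leadershipTermId = leadershipTermId;
        this.memberId = memberId;
    }
}

// Hypothetical helper that decides when the next CanvassPosition is due.
final class CanvassPublisherSketch
{
    static final long PUBLISH_INTERVAL_NS = TimeUnit.MILLISECONDS.toNanos(100);

    private long lastPublishNs;
    private boolean hasPublished;

    // Returns true when a message should be sent now, and records the send time.
    boolean shouldPublish(final long nowNs)
    {
        if (!hasPublished || nowNs - lastPublishNs >= PUBLISH_INTERVAL_NS)
        {
            lastPublishNs = nowNs;
            hasPublished = true;
            return true;
        }
        return false;
    }
}
```

A caller would check `shouldPublish(System.nanoTime())` on each duty cycle and send a freshly populated message whenever it returns true.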
From a behavioural point of view, there are two interesting scenarios for when a member is in the CANVASS state. The first is when the whole cluster starts and all members enter the CANVASS state. The second is when a member starts (or restarts) while the other members are already running (they may be in the same leadership term or a newer one). There are other ways of entering the CANVASS state, but these two cover the differences in behaviour.
Scenario 1: the whole cluster starts
If all cluster members start at about the same time, they will all quickly move to the CANVASS state and effectively wait for each other. They exchange CanvassPosition messages with each other and each member records the logLeadershipTermId and logPosition of the others, building a picture of the append positions across the cluster.
Each member evaluates whether it is a candidate to become leader. To be a candidate, it needs the highest / joint highest logPosition, in the highest / joint highest logLeadershipTermId. In short, the next leader needs to be voted in by a majority of the cluster members, and any member on a higher leadership term, or higher log position within that term, won't vote for this member. This is Raft's election restriction, which ensures that no Committed entries are lost. There can be more than one candidate if some members have joint highest values. The evaluation works as follows:
- if the member has heard from all the others, and it has the highest (or joint highest) logLeadershipTermId and logPosition, it is considered a unanimous candidate and proceeds to the next stage immediately (there can be more than one unanimous candidate)
- if it has not heard from all the others (e.g. a member is down), then after a timeout, if it has heard from enough members to reach quorum (a majority), it becomes a quorum candidate.
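The two checks above can be sketched in Java as follows. This is a simplified illustration, not Aeron's implementation: the names `LogState` and `CandidateCheckSketch` are hypothetical, a `null` entry stands for a member that has not yet sent a CanvassPosition, and the quorum check assumes that a member which is ahead simply does not count as a possible vote.

```java
import java.util.List;

// Hypothetical holder for what a member knows about one log.
final class LogState
{
    final long logLeadershipTermId;
    final long logPosition;

    LogState(final long logLeadershipTermId, final long logPosition)
    {
        this.logLeadershipTermId = logLeadershipTermId;
        this.logPosition = logPosition;
    }

    // Orders logs by leadership term first, then by position within the term.
    static int compare(final LogState a, final LogState b)
    {
        final int byTerm = Long.compare(a.logLeadershipTermId, b.logLeadershipTermId);
        return 0 != byTerm ? byTerm : Long.compare(a.logPosition, b.logPosition);
    }
}

final class CandidateCheckSketch
{
    // 'others' has one entry per other member; null means nothing heard yet.
    static boolean isUnanimousCandidate(final LogState self, final List<LogState> others)
    {
        for (final LogState other : others)
        {
            if (null == other || LogState.compare(self, other) < 0)
            {
                return false; // not heard from everyone, or someone is ahead
            }
        }
        return true;
    }

    // Intended to be called only after the canvass timeout has expired.
    static boolean isQuorumCandidate(final LogState self, final List<LogState> others)
    {
        int possibleVotes = 1; // the member would vote for itself
        for (final LogState other : others)
        {
            if (null != other && LogState.compare(self, other) >= 0)
            {
                possibleVotes++;
            }
        }
        final int clusterSize = others.size() + 1;
        return possibleVotes >= (clusterSize / 2) + 1; // strict majority
    }
}
```

In a three-member cluster with one member down, a member whose log is at least as advanced as the one peer it has heard from has two possible votes out of three, so it qualifies as a quorum candidate but not as a unanimous one.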
In the first CANVASS state after startup, the timeout is 60 seconds from entering the CANVASS state (aeron.cluster.startup.canvass.timeout). At all other times, the timeout is 10 seconds from the last message (NewLeadershipTerm or CommitPosition) received from the leader (aeron.cluster.leader.heartbeat.timeout). The additional time in the first CANVASS state gives members time to start up.
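The choice between the two timeouts can be sketched as a small deadline calculation. The class and parameter names below are hypothetical, and the constants simply reuse the 60 second and 10 second values quoted above rather than reading the real configuration properties.

```java
import java.util.concurrent.TimeUnit;

final class CanvassTimeoutSketch
{
    // Values quoted in the text above; in a real cluster these come from configuration.
    static final long STARTUP_CANVASS_TIMEOUT_NS = TimeUnit.SECONDS.toNanos(60);
    static final long LEADER_HEARTBEAT_TIMEOUT_NS = TimeUnit.SECONDS.toNanos(10);

    // Returns the time at which the member stops waiting to hear from everyone.
    static long canvassDeadlineNs(
        final boolean isFirstCanvassAfterStartup,
        final long enteredCanvassNs,
        final long lastLeaderMessageNs)
    {
        return isFirstCanvassAfterStartup ?
            enteredCanvassNs + STARTUP_CANVASS_TIMEOUT_NS :
            lastLeaderMessageNs + LEADER_HEARTBEAT_TIMEOUT_NS;
    }

    static boolean hasTimedOut(final long nowNs, final long deadlineNs)
    {
        return nowNs >= deadlineNs;
    }
}
```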
If a member considers itself a candidate, it generates a random 'nomination deadline', which is up to half of the Election timeout of 1 second (aeron.cluster.election.timeout), so a deadline of up to 500 ms. It then moves immediately to the NOMINATE state. The deadline will be explained within NOMINATE.
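A minimal sketch of how such a deadline could be generated is shown below. The class and method names are hypothetical, the 1 second constant is the value quoted above, and using `ThreadLocalRandom` is an assumption made for illustration rather than a statement about how Aeron draws its random value.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

final class NominationDeadlineSketch
{
    // Election timeout quoted in the text above.
    static final long ELECTION_TIMEOUT_NS = TimeUnit.SECONDS.toNanos(1);

    // Returns a deadline between nowNs (inclusive) and nowNs + 500 ms (exclusive).
    static long nominationDeadlineNs(final long nowNs)
    {
        final long maxDelayNs = ELECTION_TIMEOUT_NS / 2;
        return nowNs + ThreadLocalRandom.current().nextLong(maxDelayNs);
    }
}
```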
Scenario 2: one member restarts
When one member restarts, it doesn't know the state of the other members, so it doesn't know that they are still running. It goes through the motions of creating an Election object and moving to CANVASS as usual. It sends a CanvassPosition message to the others, but this time the leader replies with a NewLeadershipTerm, telling the member that there is a leader in a given leadershipTermId at a current logPosition.
What happens next depends on how the values in the message compare with those on the restarted member. A NewLeadershipTerm message needs to be handled from a number of states, not just CANVASS, so see onNewLeadershipTerm for details of how that is handled.
On Exit
On exit from CANVASS, the member will have interacted with other members, and one of the following will have happened:
- it decided it is a candidate to become leader, so it created a nomination deadline and moved to NOMINATE
- it received a RequestVote from another member, voted for that member and moved to FOLLOWER_BALLOT
- it received a general NewLeadershipTerm from the existing leader (sent to all members), which tells it about the leader's current logLeadershipTermId. However, the member needs information relating to its own logLeadershipTermId, so it moved back to CANVASS, where it sends another CanvassPosition, to which the leader replies directly with another NewLeadershipTerm for this member (see next item)
- it received a direct NewLeadershipTerm from the existing leader, specifically for this member. This tells the member about its next leadership term, so it moved to one of FOLLOWER_LOG_REPLICATION or FOLLOWER_REPLAY
todo: the above transitions are correct, but I'm not sure about my reasoning for the one that moves back to CANVASS
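The exits listed above can be summarised as a small lookup from the triggering event to the possible next states. The enum and class names are hypothetical and only the states mentioned in this section are modelled. The text does not say what decides between FOLLOWER_LOG_REPLICATION and FOLLOWER_REPLAY, so the sketch returns both as possibilities instead of guessing.

```java
import java.util.EnumSet;
import java.util.Set;

// Only the states named in this section; the real Election has more.
enum ElectionStateSketch
{
    CANVASS, NOMINATE, FOLLOWER_BALLOT, FOLLOWER_LOG_REPLICATION, FOLLOWER_REPLAY
}

// Hypothetical labels for the four exit conditions listed above.
enum CanvassExitEvent
{
    BECAME_CANDIDATE,
    RECEIVED_REQUEST_VOTE,
    RECEIVED_GENERAL_NEW_LEADERSHIP_TERM,
    RECEIVED_DIRECT_NEW_LEADERSHIP_TERM
}

final class CanvassExitSketch
{
    static Set<ElectionStateSketch> possibleNextStates(final CanvassExitEvent event)
    {
        switch (event)
        {
            case BECAME_CANDIDATE:
                return EnumSet.of(ElectionStateSketch.NOMINATE);
            case RECEIVED_REQUEST_VOTE:
                return EnumSet.of(ElectionStateSketch.FOLLOWER_BALLOT);
            case RECEIVED_GENERAL_NEW_LEADERSHIP_TERM:
                return EnumSet.of(ElectionStateSketch.CANVASS); // re-enter and canvass again
            case RECEIVED_DIRECT_NEW_LEADERSHIP_TERM:
                return EnumSet.of(
                    ElectionStateSketch.FOLLOWER_LOG_REPLICATION,
                    ElectionStateSketch.FOLLOWER_REPLAY);
            default:
                throw new IllegalArgumentException("unknown event: " + event);
        }
    }
}
```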