Cluster Counters

Creation of Cluster-related Counters is split between the Consensus Module and the Clustered Service.

Consensus Module

Unless stated otherwise, these are created in ConsensusModule.Context.conclude().

| Counter Name | Description |
| --- | --- |
| Cluster Errors | |
| Consensus Module state | |
| Cluster election state | |
| Cluster election count | |
| Cluster leadership term id | The current leadership term that the member is in. Updated in LEADER_INIT, when replaying a NewLeadershipTermEvent, on Election complete (for followers), and after loading a snapshot |
| Cluster node role | |
| Cluster commit-pos | The Raft Commit position, which points to the end of the last committed message |
| Cluster control toggle | Set by ClusterTool to toggle the cluster into a different state, such as SUSPEND, RESUME or SNAPSHOT (this is how snapshots are triggered). States are defined in ToggleState |
| Node control toggle | Currently only used to replicate standby snapshots (an Aeron Premium feature) |
| Cluster snapshot count | |
| Cluster timed out client count | |
| Cluster standby snapshots received (if enabled) | |
| Cluster max cycle time in ns | Updated by the DutyCycleStallTracker |
| Cluster work cycle time exceeded count: threshold=...ns | Updated by the DutyCycleStallTracker |
| Total max snapshot duration in ns | |
| Total max snapshot duration exceeded count: threshold=...ns | |
| Cluster recovery: leadershipTermId=... | Temporary RecoveryState Counter created in ConsensusModuleAgent.onStart() to pass recovery data to the Clustered Service(s) |
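
The Cluster control toggle Counter holds a numeric code rather than a name, so a raw value read from a tool such as AeronStat needs translating. The sketch below is a self-contained helper for doing that. The class `ToggleStateNames` is hypothetical, and the code-to-name mapping is an assumption intended to mirror `ClusterControl.ToggleState`; check it against the Aeron version you are running before relying on it.

```java
final class ToggleStateNames
{
    // Assumed to mirror the codes in io.aeron.cluster.ClusterControl.ToggleState.
    // The array index is the counter value. Verify against your Aeron version.
    private static final String[] NAMES =
    {
        "INACTIVE", "NEUTRAL", "SUSPEND", "RESUME", "SNAPSHOT", "SHUTDOWN", "ABORT", "STANDBY_SNAPSHOT"
    };

    private ToggleStateNames()
    {
    }

    /** Translate a raw control toggle counter value into a readable name. */
    static String describe(final long counterValue)
    {
        if (counterValue >= 0 && counterValue < NAMES.length)
        {
            return NAMES[(int)counterValue];
        }

        // Unknown codes are reported rather than guessed.
        return "UNKNOWN(" + counterValue + ")";
    }
}
```

Under that assumed mapping, `ToggleStateNames.describe(1)` gives `NEUTRAL`, which would be the resting state of a toggle that has no pending request.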

Clustered Service

These are created in ClusteredServiceContainer.Context.conclude().

Counter Name Description
Cluster Container Errors - clusterId=... serviceId=...
Cluster container max cycle time in ns
Updated by the DutyCycleStallTracker
Cluster container work cycle time exceeded count: threshold=...ns
Updated by the DutyCycleStallTracker
Clustered service max snapshot duration in ns
Clustered service max snapshot duration exceeded count: threshold=...ns

Archive

Append Position

An important Counter in Aeron Cluster is the AppendPosition Counter. This is actually created by the Archive and is the RecordingPosition rec-pos Counter.
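
AeronStat displays this Counter with a label of the form `rec-pos: <recordingId> <sessionId> <streamId> <channel>`, optionally followed by a suffix such as ` - archiveId=1`. As a minimal sketch, the helper below splits such a label into its fields. The class `RecPosLabel` and its field names are hypothetical and only illustrate the label layout; it does not use the Aeron API.

```java
final class RecPosLabel
{
    private static final String PREFIX = "rec-pos: ";

    final long recordingId;
    final int sessionId;
    final int streamId;
    // Channel URI plus any trailing suffix (for example " - archiveId=1"), kept as-is.
    final String remainder;

    private RecPosLabel(final long recordingId, final int sessionId, final int streamId, final String remainder)
    {
        this.recordingId = recordingId;
        this.sessionId = sessionId;
        this.streamId = streamId;
        this.remainder = remainder;
    }

    /** Returns null when the label is not a rec-pos label with the expected four fields. */
    static RecPosLabel parse(final String label)
    {
        if (!label.startsWith(PREFIX))
        {
            return null;
        }

        // Limit of 4 keeps spaces inside the channel/suffix part intact.
        final String[] parts = label.substring(PREFIX.length()).split(" ", 4);
        if (parts.length < 4)
        {
            return null;
        }

        return new RecPosLabel(
            Long.parseLong(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2]), parts[3]);
    }
}
```

A streamId of 100 in the parsed label identifies the recording as the Cluster Log, which is how the AppendPosition Counter can be picked out from other recordings.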

AeronStat

AeronStat is the tool for printing Counter values. There's a sample output below, for reference. The output contains the Counters for Aeron Transport, Archive and Cluster, displayed in creation order.

This was for a leader. You can tell because Counters 120 to 125 are Publication and Sender Counters for the Log streamId 100. Counters 126 and 127 are for the Archive receiving the Log.

Command line:

ADD_OPENS="--add-opens java.base/jdk.internal.misc=ALL-UNNAMED --add-opens java.base/java.util.zip=ALL-UNNAMED"
java -cp aeron-all/build/libs/aeron-all-*.jar -Daeron.dir=/Volumes/DevShm/node-0-driver ${ADD_OPENS} io.aeron.samples.AeronStat watch=false

todo: annotate this and check if anything's missing from the tables above

08:47:38 - Aeron Stat (CnC v0.2.0), pid 89674, heartbeat age 801ms
======================================================================
  0:               11,008 - Bytes sent
  1:                8,064 - Bytes received
  2:                    0 - Failed offers to ReceiverProxy
  3:                    0 - Failed offers to SenderProxy
  4:                    0 - Failed offers to DriverConductorProxy
  5:                    0 - NAKs sent
  6:                    0 - NAKs received
  7:                   60 - Status Messages sent
  8:                   92 - Status Messages received
  9:                   99 - Heartbeats sent
 10:                   72 - Heartbeats received
 11:                    0 - Retransmits sent
 12:                    0 - Flow control under runs
 13:                    0 - Flow control over runs
 14:                    0 - Invalid packets
 15:                    0 - Errors: version=1.47.0 commit=4374d82ead+guilty
 16:                    0 - Short sends
 17:                    0 - Failed attempts to free log buffers
 18:                    0 - Sender flow control limits, i.e. back-pressure events
 19:                    0 - Unblocked Publications
 20:                    0 - Unblocked Control Commands
 21:                    0 - Possible TTL Asymmetry
 22:                    0 - ControllableIdleStrategy status
 23:                    0 - Loss gap fills
 24:                    0 - Client liveness timeouts
 25:                    0 - Resolution changes: driverName=null
 26:            7,938,583 - Conductor max cycle time doing its work in ns: SHARED
 27:                    0 - Conductor work cycle exceeded threshold count: threshold=1000000000ns SHARED
 28:            7,834,250 - Sender max cycle time doing its work in ns: SHARED
 29:                    0 - Sender work cycle exceeded threshold count: threshold=1000000000ns SHARED
 30:            7,835,000 - Receiver max cycle time doing its work in ns: SHARED
 31:                    0 - Receiver work cycle exceeded threshold count: threshold=1000000000ns SHARED
 32:            2,310,416 - NameResolver max time in ns
 33:                    0 - NameResolver exceeded threshold count
 34:               77,568 - Aeron software: version=1.47.0 commit=4374d82ead+guilty
 35:           30,334,976 - Bytes currently mapped
 36:                    0 - Retransmitted bytes
 37:                    0 - Retransmit Pool Overflow count
 38:                    0 - Error Frames received
 39:                    0 - Error Frames sent
 40:    1,750,492,058,149 - client-heartbeat: id=1 name=archive version=1.47.0 commit=4374d82ead+guilty
 41:            6,532,875 - archive-conductor max cycle time in ns: SHARED - archiveId=1
 42:                    0 - archive-conductor work cycle time exceeded count: threshold=1000000000ns SHARED - archiveId=1
 43:                    1 - Archive Control Sessions - archiveId=1
 44:                    1 - Archive Recording Sessions - archiveId=1
 45:                    0 - Archive Replay Sessions - archiveId=1
 46:               39,584 - archive-recorder max write time in ns - archiveId=1
 47:                1,408 - archive-recorder total write bytes - archiveId=1
 48:              280,500 - archive-recorder total write time in ns - archiveId=1
 49:                    0 - archive-replayer max read time in ns - archiveId=1
 50:                    0 - archive-replayer total read bytes - archiveId=1
 51:                    0 - archive-replayer total read time in ns - archiveId=1
 52:                    1 - rcv-channel: aeron:udp?term-length=65536|sparse=true|endpoint=localhost:9001 127.0.0.1:9001
 53:                    1 - rcv-local-sockaddr: 52 127.0.0.1:9001
 54:    1,750,492,058,193 - client-heartbeat: id=17 name=consensus-module-0-0 version=1.47.0 commit=4374d82ead+guilty
 55:                    0 - Cluster Errors - clusterId=0 version=1.47.0 commit=4374d82ead+guilty
 56:                    1 - Consensus Module state - clusterId=0
 57:                   17 - Cluster election state - clusterId=0
 58:                    1 - Cluster election count - clusterId=0
 59:                    0 - Cluster leadership term id - clusterId=0
 60:                    2 - Cluster node role - clusterId=0
 61:                1,408 - Cluster commit-pos: - clusterId=0
 62:                    1 - Cluster control toggle - clusterId=0
 63:                    1 - Node control toggle - clusterId=0
 64:                    0 - Cluster snapshot count - clusterId=0
 65:                    0 - Cluster timed out client count - clusterId=0
 66:           31,000,000 - Cluster max cycle time in ns - clusterId=0
 67:                    0 - Cluster work cycle time exceeded count: threshold=1000000000ns - clusterId=0
 68:                    0 - Total max snapshot duration in ns - clusterId=0
 69:                    0 - Total max snapshot duration exceeded count: threshold=1000000000ns - clusterId=0
 70:                    1 - rcv-channel: aeron:udp?endpoint=localhost:9003|term-length=64k 127.0.0.1:9003
 71:                    1 - rcv-local-sockaddr: 70 127.0.0.1:9003
 72:                  160 - pub-pos (sampled): 35 -1167600074 104 aeron:ipc?term-length=128k
 73:               65,536 - pub-lmt: 35 -1167600074 104 aeron:ipc?term-length=128k
 74:    1,750,492,058,301 - client-heartbeat: id=37 name=clustered-service-0-0 version=1.47.0 commit=4374d82ead+guilty
 75:                    0 - Cluster Container Errors - clusterId=0 serviceId=0 version=1.47.0 commit=4374d82ead+guilty
 76:           25,232,833 - Cluster container max cycle time in ns - clusterId=0 serviceId=0
 77:                    0 - Cluster container work cycle time exceeded count: threshold=1000000000ns - clusterId=0 serviceId=0
 78:                    0 - Clustered service max snapshot duration in ns - clusterId=0 serviceId=0
 79:                    0 - Clustered service max snapshot duration exceeded count: threshold=1000000000 - clusterId=0 serviceId=0
 80:                  512 - pub-pos (sampled): 43 1581412165 10 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-req-cluster-0
 81:               32,768 - pub-lmt: 43 1581412165 10 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-req-cluster-0
 82:                  512 - sub-pos: 16 1581412165 10 aeron:ipc?term-length=64k @0
 83:                  192 - pub-pos (sampled): 44 -1167600073 105 aeron:ipc?term-length=128k
 84:               65,536 - pub-lmt: 44 -1167600073 105 aeron:ipc?term-length=128k
 85:                  192 - sub-pos: 34 -1167600073 105 aeron:ipc?term-length=128k @0
 86:                  160 - sub-pos: 46 -1167600074 104 aeron:ipc?term-length=128k @0
 87:                  960 - pub-pos (sampled): 49 1581412165 20 aeron:ipc?mtu=1408|term-length=65536|session-id=1581412165|alias=cm-archive-ctrl-resp-cluster-0|sparse=true
 88:               32,768 - pub-lmt: 49 1581412165 20 aeron:ipc?mtu=1408|term-length=65536|session-id=1581412165|alias=cm-archive-ctrl-resp-cluster-0|sparse=true
 89:                  960 - sub-pos: 36 1581412165 20 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-resp-cluster-0 @0
 95:                  672 - sub-pos: 76 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k @0
 96:                  672 - rcv-hwm: 77 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k
 97:                    1 - snd-channel: aeron:udp?endpoint=localhost:9103|term-length=64k 127.0.0.1:59392
 98:                    1 - snd-local-sockaddr: 97 127.0.0.1:59392
 99:                2,592 - pub-pos (sampled): 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
100:               35,360 - pub-lmt: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
101:                2,592 - snd-pos: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
102:               32,768 - snd-lmt: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
103:                    0 - snd-bpe: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
104:                2,080 - sub-pos: 33 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k @0
105:                2,080 - rcv-hwm: 63 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k
106:                2,080 - rcv-pos: 63 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k
107:                2,080 - sub-pos: 33 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k @0
108:                2,080 - rcv-hwm: 64 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k
109:                2,080 - rcv-pos: 64 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k
110:                    1 - snd-channel: aeron:udp?endpoint=localhost:9203|term-length=64k 127.0.0.1:60072
111:                    1 - snd-local-sockaddr: 110 127.0.0.1:60072
112:                2,592 - pub-pos (sampled): 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
113:               35,360 - pub-lmt: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
114:                2,592 - snd-pos: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
115:               32,768 - snd-lmt: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
116:                    0 - snd-bpe: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
117:                    1 - snd-channel: aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log 0.0.0.0:58020
118:                    1 - snd-local-sockaddr: 117 0.0.0.0:58020
119:                    2 - mdc-num-dest: aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log 
120:                    2 - fc-receivers: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
121:                1,408 - pub-pos (sampled): 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
122:            1,049,984 - pub-lmt: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
123:                1,408 - snd-pos: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
124:              131,072 - snd-lmt: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
125:                    0 - snd-bpe: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
126:                1,408 - sub-pos: 72 -1167600070 100 aeron-spy:aeron:udp?tags=67|session-id=-1167600070|alias=log @0
127:                1,408 - rec-pos: 0 -1167600070 100 aeron:udp?tags=67|session-id=-1167600070|alias=log - archiveId=1
128:                1,408 - sub-pos: 74 -1167600070 100 aeron-spy:aeron:udp?tags=67|session-id=-1167600070|alias=log-sc-0 @0
129:                    1 - rcv-channel: aeron:udp?endpoint=localhost:9002|term-length=64k 127.0.0.1:9002
130:                    1 - rcv-local-sockaddr: 129 127.0.0.1:9002
131:                  672 - rcv-pos: 77 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k
132:                    1 - snd-channel: aeron:udp?endpoint=localhost:50779 127.0.0.1:65441
133:                    1 - snd-local-sockaddr: 132 127.0.0.1:65441
134:                  576 - pub-pos (sampled): 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
135:              524,864 - pub-lmt: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
136:                  576 - snd-pos: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
137:              131,072 - snd-lmt: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
138:                    0 - snd-bpe: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
139:                  768 - sub-pos: 76 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k @0
140:                  768 - rcv-hwm: 79 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k
141:                  768 - rcv-pos: 79 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k
142:                    1 - snd-channel: aeron:udp?endpoint=localhost:55122 127.0.0.1:63602
143:                    1 - snd-local-sockaddr: 142 127.0.0.1:63602
144:                  672 - pub-pos (sampled): 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
145:              524,960 - pub-lmt: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
146:                  672 - snd-pos: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
147:              131,072 - snd-lmt: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
148:                    0 - snd-bpe: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
--
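
Output in this format is easy to post-process, for example to pull out a single Counter value in a script or test. The following is a minimal sketch of a parser for one output line, assuming the layout shown above (`<id>: <value> - <label>`, with commas as thousands separators). The class `AeronStatLine` is hypothetical and not part of Aeron.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AeronStatLine
{
    // "<id>: <value with optional thousands separators> - <label>"
    private static final Pattern LINE = Pattern.compile("^\\s*(\\d+):\\s+(-?[\\d,]+) - (.*)$");

    final int counterId;
    final long value;
    final String label;

    private AeronStatLine(final int counterId, final long value, final String label)
    {
        this.counterId = counterId;
        this.value = value;
        this.label = label;
    }

    /** Returns null for lines that are not Counter lines (header, separator, trailing "--"). */
    static AeronStatLine parse(final String line)
    {
        final Matcher matcher = LINE.matcher(line);
        if (!matcher.matches())
        {
            return null;
        }

        // Strip the thousands separators before converting the value.
        final long value = Long.parseLong(matcher.group(2).replace(",", ""));

        return new AeronStatLine(Integer.parseInt(matcher.group(1)), value, matcher.group(3));
    }
}
```

For instance, parsing the `Cluster commit-pos` line from the sample yields counterId 61 and value 1408, while the timestamp header and the `====` separator are skipped because `parse` returns null for them.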