Cluster Counters
Creation of Cluster related Counters is split between the Consensus Module and Clustered Service.
Consensus Module
Unless stated otherwise, these are created in ConsensusModule.Context.conclude().
Counter Name | Description |
---|---|
Cluster Errors | |
Consensus Module state | |
Cluster election state | |
Cluster election count | |
Cluster leadership term id | The current leadership term that the member is in. Updated in LEADER_INIT, when replaying a NewLeadershipTermEvent, on Election complete (for followers), and after loading a snapshot |
Cluster node role | |
Cluster commit-pos | The Raft Commit position, which points to the end of the last committed message |
Cluster control toggle | Set by ClusterTool to toggle the cluster into a different state, such as SUSPEND, RESUME or SNAPSHOT (this is how snapshots are triggered). States are defined in ToggleState |
Node control toggle | Currently only used to replicate standby snapshots (an Aeron Premium feature) |
Cluster snapshot count | |
Cluster timed out client count | |
Cluster standby snapshots received (if enabled) | |
Cluster max cycle time in ns | Updated by the DutyCycleStallTracker |
Cluster work cycle time exceeded count: threshold=...ns | Updated by the DutyCycleStallTracker |
Total max snapshot duration in ns | |
Total max snapshot duration exceeded count: threshold=...ns | |
Cluster recovery: leadershipTermId=... | Temporary RecoveryState Counter created in ConsensusModuleAgent.onStart() to pass recovery data to the Clustered Service(s) |
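
Several of these Counters hold a numeric code rather than a measurement, so the raw value needs translating before it means anything. The following is a minimal sketch of such a translation using only the Java standard library. The class `ClusterCounterNames` is hypothetical, and the code-to-name mappings are assumptions based on the `Cluster.Role` and `ClusterControl.ToggleState` enums as of Aeron 1.47; check them against the version you run.

```java
/** Hypothetical helper: maps raw counter values to the enum names assumed below. */
final class ClusterCounterNames
{
    // Assumed codes of Cluster.Role; verify against your Aeron version.
    private static final String[] ROLES = {"FOLLOWER", "CANDIDATE", "LEADER"};

    // Assumed codes of ClusterControl.ToggleState; verify against your Aeron version.
    private static final String[] TOGGLE_STATES =
    {
        "INACTIVE", "NEUTRAL", "SUSPEND", "RESUME", "SNAPSHOT", "SHUTDOWN", "ABORT", "STANDBY_SNAPSHOT"
    };

    /** Name for a "Cluster node role" counter value. */
    static String roleName(final long value)
    {
        return lookup(ROLES, value);
    }

    /** Name for a "Cluster control toggle" counter value. */
    static String toggleStateName(final long value)
    {
        return lookup(TOGGLE_STATES, value);
    }

    private static String lookup(final String[] names, final long value)
    {
        // Unknown codes are reported as such rather than guessed.
        return value >= 0 && value < names.length ? names[(int)value] : "UNKNOWN(" + value + ")";
    }
}
```

Under these assumptions, a "Cluster node role" value of 2 translates to LEADER and a "Cluster control toggle" value of 1 translates to NEUTRAL, which matches the leader sample output in the AeronStat section.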
Clustered Service
These are created in ClusteredServiceContainer.Context.conclude().
Counter Name | Description |
---|---|
Cluster Container Errors - clusterId=... serviceId=... | |
Cluster container max cycle time in ns | Updated by the DutyCycleStallTracker |
Cluster container work cycle time exceeded count: threshold=...ns | Updated by the DutyCycleStallTracker |
Clustered service max snapshot duration in ns | |
Clustered service max snapshot duration exceeded count: threshold=...ns | |
Archive
Append Position
An important Counter in Aeron Cluster is the AppendPosition Counter. It is actually created by the Archive and is the RecordingPosition (rec-pos) Counter.
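
To pick the append position out of a Counter listing, it helps to split the rec-pos label into its parts. The sketch below assumes the label layout seen in the sample output in the AeronStat section (`rec-pos: <recordingId> <sessionId> <streamId> <channel>`, optionally followed by ` - archiveId=...`) and that the channel URI contains no spaces. The class `RecPosLabel` is hypothetical and is not part of Aeron.

```java
/** Hypothetical parser for a rec-pos counter label; not part of the Aeron API. */
final class RecPosLabel
{
    private static final String PREFIX = "rec-pos: ";

    final long recordingId;
    final int sessionId;
    final int streamId;
    final String channel;

    private RecPosLabel(final long recordingId, final int sessionId, final int streamId, final String channel)
    {
        this.recordingId = recordingId;
        this.sessionId = sessionId;
        this.streamId = streamId;
        this.channel = channel;
    }

    static RecPosLabel parse(final String label)
    {
        if (!label.startsWith(PREFIX))
        {
            throw new IllegalArgumentException("not a rec-pos label: " + label);
        }

        // Split into at most four parts: three numeric fields, then the rest of the label.
        final String[] parts = label.substring(PREFIX.length()).split(" ", 4);
        if (parts.length < 4)
        {
            throw new IllegalArgumentException("unexpected rec-pos layout: " + label);
        }

        // Assumption: the channel has no spaces, so anything after the first space is a suffix.
        final int end = parts[3].indexOf(' ');
        final String channel = end < 0 ? parts[3] : parts[3].substring(0, end);

        return new RecPosLabel(
            Long.parseLong(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2]), channel);
    }
}
```

A label whose streamId is 100 and whose channel carries `alias=log` is then the recording of the Log, and the value of that Counter is the append position.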
AeronStat
AeronStat is the tool for printing Counter values. There's a sample output below, for reference. The output contains the Counters for Aeron Transport, Archive and Cluster, displayed in creation order.
This output was taken from a leader. You can tell because Counters 120 to 125 are Publication and Sender Counters for the Log (streamId 100). Counters 126 and 127 are for the Archive receiving the Log.
Command line:
ADD_OPENS="--add-opens java.base/jdk.internal.misc=ALL-UNNAMED --add-opens java.base/java.util.zip=ALL-UNNAMED"
java -cp aeron-all/build/libs/aeron-all-*.jar -Daeron.dir=/Volumes/DevShm/node-0-driver ${ADD_OPENS} io.aeron.samples.AeronStat watch=false
todo: annotate this and check if anything's missing from the tables above
08:47:38 - Aeron Stat (CnC v0.2.0), pid 89674, heartbeat age 801ms
======================================================================
0: 11,008 - Bytes sent
1: 8,064 - Bytes received
2: 0 - Failed offers to ReceiverProxy
3: 0 - Failed offers to SenderProxy
4: 0 - Failed offers to DriverConductorProxy
5: 0 - NAKs sent
6: 0 - NAKs received
7: 60 - Status Messages sent
8: 92 - Status Messages received
9: 99 - Heartbeats sent
10: 72 - Heartbeats received
11: 0 - Retransmits sent
12: 0 - Flow control under runs
13: 0 - Flow control over runs
14: 0 - Invalid packets
15: 0 - Errors: version=1.47.0 commit=4374d82ead+guilty
16: 0 - Short sends
17: 0 - Failed attempts to free log buffers
18: 0 - Sender flow control limits, i.e. back-pressure events
19: 0 - Unblocked Publications
20: 0 - Unblocked Control Commands
21: 0 - Possible TTL Asymmetry
22: 0 - ControllableIdleStrategy status
23: 0 - Loss gap fills
24: 0 - Client liveness timeouts
25: 0 - Resolution changes: driverName=null
26: 7,938,583 - Conductor max cycle time doing its work in ns: SHARED
27: 0 - Conductor work cycle exceeded threshold count: threshold=1000000000ns SHARED
28: 7,834,250 - Sender max cycle time doing its work in ns: SHARED
29: 0 - Sender work cycle exceeded threshold count: threshold=1000000000ns SHARED
30: 7,835,000 - Receiver max cycle time doing its work in ns: SHARED
31: 0 - Receiver work cycle exceeded threshold count: threshold=1000000000ns SHARED
32: 2,310,416 - NameResolver max time in ns
33: 0 - NameResolver exceeded threshold count
34: 77,568 - Aeron software: version=1.47.0 commit=4374d82ead+guilty
35: 30,334,976 - Bytes currently mapped
36: 0 - Retransmitted bytes
37: 0 - Retransmit Pool Overflow count
38: 0 - Error Frames received
39: 0 - Error Frames sent
40: 1,750,492,058,149 - client-heartbeat: id=1 name=archive version=1.47.0 commit=4374d82ead+guilty
41: 6,532,875 - archive-conductor max cycle time in ns: SHARED - archiveId=1
42: 0 - archive-conductor work cycle time exceeded count: threshold=1000000000ns SHARED - archiveId=1
43: 1 - Archive Control Sessions - archiveId=1
44: 1 - Archive Recording Sessions - archiveId=1
45: 0 - Archive Replay Sessions - archiveId=1
46: 39,584 - archive-recorder max write time in ns - archiveId=1
47: 1,408 - archive-recorder total write bytes - archiveId=1
48: 280,500 - archive-recorder total write time in ns - archiveId=1
49: 0 - archive-replayer max read time in ns - archiveId=1
50: 0 - archive-replayer total read bytes - archiveId=1
51: 0 - archive-replayer total read time in ns - archiveId=1
52: 1 - rcv-channel: aeron:udp?term-length=65536|sparse=true|endpoint=localhost:9001 127.0.0.1:9001
53: 1 - rcv-local-sockaddr: 52 127.0.0.1:9001
54: 1,750,492,058,193 - client-heartbeat: id=17 name=consensus-module-0-0 version=1.47.0 commit=4374d82ead+guilty
55: 0 - Cluster Errors - clusterId=0 version=1.47.0 commit=4374d82ead+guilty
56: 1 - Consensus Module state - clusterId=0
57: 17 - Cluster election state - clusterId=0
58: 1 - Cluster election count - clusterId=0
59: 0 - Cluster leadership term id - clusterId=0
60: 2 - Cluster node role - clusterId=0
61: 1,408 - Cluster commit-pos: - clusterId=0
62: 1 - Cluster control toggle - clusterId=0
63: 1 - Node control toggle - clusterId=0
64: 0 - Cluster snapshot count - clusterId=0
65: 0 - Cluster timed out client count - clusterId=0
66: 31,000,000 - Cluster max cycle time in ns - clusterId=0
67: 0 - Cluster work cycle time exceeded count: threshold=1000000000ns - clusterId=0
68: 0 - Total max snapshot duration in ns - clusterId=0
69: 0 - Total max snapshot duration exceeded count: threshold=1000000000ns - clusterId=0
70: 1 - rcv-channel: aeron:udp?endpoint=localhost:9003|term-length=64k 127.0.0.1:9003
71: 1 - rcv-local-sockaddr: 70 127.0.0.1:9003
72: 160 - pub-pos (sampled): 35 -1167600074 104 aeron:ipc?term-length=128k
73: 65,536 - pub-lmt: 35 -1167600074 104 aeron:ipc?term-length=128k
74: 1,750,492,058,301 - client-heartbeat: id=37 name=clustered-service-0-0 version=1.47.0 commit=4374d82ead+guilty
75: 0 - Cluster Container Errors - clusterId=0 serviceId=0 version=1.47.0 commit=4374d82ead+guilty
76: 25,232,833 - Cluster container max cycle time in ns - clusterId=0 serviceId=0
77: 0 - Cluster container work cycle time exceeded count: threshold=1000000000ns - clusterId=0 serviceId=0
78: 0 - Clustered service max snapshot duration in ns - clusterId=0 serviceId=0
79: 0 - Clustered service max snapshot duration exceeded count: threshold=1000000000 - clusterId=0 serviceId=0
80: 512 - pub-pos (sampled): 43 1581412165 10 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-req-cluster-0
81: 32,768 - pub-lmt: 43 1581412165 10 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-req-cluster-0
82: 512 - sub-pos: 16 1581412165 10 aeron:ipc?term-length=64k @0
83: 192 - pub-pos (sampled): 44 -1167600073 105 aeron:ipc?term-length=128k
84: 65,536 - pub-lmt: 44 -1167600073 105 aeron:ipc?term-length=128k
85: 192 - sub-pos: 34 -1167600073 105 aeron:ipc?term-length=128k @0
86: 160 - sub-pos: 46 -1167600074 104 aeron:ipc?term-length=128k @0
87: 960 - pub-pos (sampled): 49 1581412165 20 aeron:ipc?mtu=1408|term-length=65536|session-id=1581412165|alias=cm-archive-ctrl-resp-cluster-0|sparse=true
88: 32,768 - pub-lmt: 49 1581412165 20 aeron:ipc?mtu=1408|term-length=65536|session-id=1581412165|alias=cm-archive-ctrl-resp-cluster-0|sparse=true
89: 960 - sub-pos: 36 1581412165 20 aeron:ipc?session-id=1581412165|term-length=64k|sparse=true|mtu=1408|alias=cm-archive-ctrl-resp-cluster-0 @0
95: 672 - sub-pos: 76 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k @0
96: 672 - rcv-hwm: 77 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k
97: 1 - snd-channel: aeron:udp?endpoint=localhost:9103|term-length=64k 127.0.0.1:59392
98: 1 - snd-local-sockaddr: 97 127.0.0.1:59392
99: 2,592 - pub-pos (sampled): 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
100: 35,360 - pub-lmt: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
101: 2,592 - snd-pos: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
102: 32,768 - snd-lmt: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
103: 0 - snd-bpe: 61 -1167600072 108 aeron:udp?endpoint=localhost:9103|term-length=64k
104: 2,080 - sub-pos: 33 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k @0
105: 2,080 - rcv-hwm: 63 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k
106: 2,080 - rcv-pos: 63 659118749 108 aeron:udp?endpoint=localhost:9003|term-length=64k
107: 2,080 - sub-pos: 33 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k @0
108: 2,080 - rcv-hwm: 64 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k
109: 2,080 - rcv-pos: 64 -1307509005 108 aeron:udp?endpoint=localhost:9003|term-length=64k
110: 1 - snd-channel: aeron:udp?endpoint=localhost:9203|term-length=64k 127.0.0.1:60072
111: 1 - snd-local-sockaddr: 110 127.0.0.1:60072
112: 2,592 - pub-pos (sampled): 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
113: 35,360 - pub-lmt: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
114: 2,592 - snd-pos: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
115: 32,768 - snd-lmt: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
116: 0 - snd-bpe: 62 -1167600071 108 aeron:udp?endpoint=localhost:9203|term-length=64k
117: 1 - snd-channel: aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log 0.0.0.0:58020
118: 1 - snd-local-sockaddr: 117 0.0.0.0:58020
119: 2 - mdc-num-dest: aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
120: 2 - fc-receivers: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
121: 1,408 - pub-pos (sampled): 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
122: 1,049,984 - pub-lmt: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
123: 1,408 - snd-pos: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
124: 131,072 - snd-lmt: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
125: 0 - snd-bpe: 68 -1167600070 100 aeron:udp?term-id=0|term-length=2m|tags=67,66|term-offset=0|control-mode=manual|ssc=false|init-term-id=0|alias=log
126: 1,408 - sub-pos: 72 -1167600070 100 aeron-spy:aeron:udp?tags=67|session-id=-1167600070|alias=log @0
127: 1,408 - rec-pos: 0 -1167600070 100 aeron:udp?tags=67|session-id=-1167600070|alias=log - archiveId=1
128: 1,408 - sub-pos: 74 -1167600070 100 aeron-spy:aeron:udp?tags=67|session-id=-1167600070|alias=log-sc-0 @0
129: 1 - rcv-channel: aeron:udp?endpoint=localhost:9002|term-length=64k 127.0.0.1:9002
130: 1 - rcv-local-sockaddr: 129 127.0.0.1:9002
131: 672 - rcv-pos: 77 -2066571664 101 aeron:udp?endpoint=localhost:9002|term-length=64k
132: 1 - snd-channel: aeron:udp?endpoint=localhost:50779 127.0.0.1:65441
133: 1 - snd-local-sockaddr: 132 127.0.0.1:65441
134: 576 - pub-pos (sampled): 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
135: 524,864 - pub-lmt: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
136: 576 - snd-pos: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
137: 131,072 - snd-lmt: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
138: 0 - snd-bpe: 78 -1167600069 102 aeron:udp?endpoint=localhost:50779
139: 768 - sub-pos: 76 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k @0
140: 768 - rcv-hwm: 79 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k
141: 768 - rcv-pos: 79 1927135809 101 aeron:udp?endpoint=localhost:9002|term-length=64k
142: 1 - snd-channel: aeron:udp?endpoint=localhost:55122 127.0.0.1:63602
143: 1 - snd-local-sockaddr: 142 127.0.0.1:63602
144: 672 - pub-pos (sampled): 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
145: 524,960 - pub-lmt: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
146: 672 - snd-pos: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
147: 131,072 - snd-lmt: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
148: 0 - snd-bpe: 81 -1167600067 102 aeron:udp?endpoint=localhost:55122
--
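
For scripted checks it can be handy to turn each line of this output into an id, a value and a label. The following is a minimal sketch that assumes the `<id>: <value> - <label>` layout shown above, with thousands separators in the value. The class `AeronStatLine` is hypothetical. It only parses the text that AeronStat prints and does not read the CnC file itself.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for one line of AeronStat text output. */
final class AeronStatLine
{
    // id, colon, whitespace, value with optional sign and commas, " - ", then the label.
    private static final Pattern LINE = Pattern.compile("\\s*(\\d+):\\s+(-?[\\d,]+) - (.*)");

    final int id;
    final long value;
    final String label;

    private AeronStatLine(final int id, final long value, final String label)
    {
        this.id = id;
        this.value = value;
        this.label = label;
    }

    /** Returns the parsed counter, or empty for header, separator and other non-counter lines. */
    static Optional<AeronStatLine> parse(final String line)
    {
        final Matcher matcher = LINE.matcher(line);
        if (!matcher.matches())
        {
            return Optional.empty();
        }

        // Strip the thousands separators before parsing the value.
        final long value = Long.parseLong(matcher.group(2).replace(",", ""));

        return Optional.of(new AeronStatLine(Integer.parseInt(matcher.group(1)), value, matcher.group(3)));
    }
}
```

The timestamp header line is skipped because no whitespace follows its first colon, and the `====` and `--` separator lines do not match the pattern at all.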