Metrics

OpenZiti provides a wide range of metrics for monitoring network services, endpoints, and processes. Some of these metrics are visualized below to show where they fall and what they measure in a network instance. Most of the remaining metrics measure processes within the control plane rather than network operation.

Available Metrics

Metrics are reported to log files, located in /var/log/ziti by default. There are two primary log files for metrics: utilization-metrics.log and utilization-usage.log. These logs can be shipped to various reporting systems for easier visibility and monitoring.
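As a rough illustration of preparing these logs for shipping, the sketch below filters a metrics log down to the lines that mention selected metric families. It assumes only that the log is line-oriented text and that metric names appear literally in each line; the exact record format is not described here, and the function names `select_metric_lines` and `read_log` are hypothetical helpers, not part of OpenZiti.

```python
from pathlib import Path
from typing import Iterable, Iterator

# Default directory and file name as given in the text above.
DEFAULT_LOG = Path("/var/log/ziti/utilization-metrics.log")


def select_metric_lines(lines: Iterable[str], fragments: Iterable[str]) -> Iterator[str]:
    """Yield each line (without its trailing newline) that contains any fragment.

    Assumption: metric names such as "xgress.retransmissions" appear
    verbatim in the log line, so a substring match is enough.
    """
    wanted = tuple(fragments)
    for line in lines:
        text = line.rstrip("\n")
        if any(fragment in text for fragment in wanted):
            yield text


def read_log(path: Path = DEFAULT_LOG) -> Iterator[str]:
    """Stream a log file line by line, tolerating undecodable bytes."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        yield from handle


# Example use (requires the log file to exist on this host):
#   for line in select_metric_lines(read_log(), ["xgress.", "link."]):
#       print(line)
```

Matching on substrings keeps the sketch independent of the log's record layout; a real shipper would normally parse each record instead.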

| Metric | Type | Source | Description |
| --- | --- | --- | --- |
| api-session.create | Histogram | controller | Time to create api sessions |
| api.session.enforcer.run | Timer | controller | How long it takes the api session policy enforcer to run |
| bolt.open_read_txs | Gauge | controller | Current number of open bbolt read transactions |
| ctrl.latency | Histogram | controller | Per control channel latency |
| ctrl.queue_time | Histogram | controller | Per control channel queue time (between send and write to wire) |
| ctrl.rx.bytesrate | Meter | controller | Per control channel receive data rate |
| ctrl.rx.msgrate | Meter | controller | Per control channel receive message rate |
| ctrl.rx.msgsize | Histogram | controller | Per control channel receive message size distribution |
| ctrl.tx.bytesrate | Meter | controller | Per control channel send data rate |
| ctrl.tx.msgrate | Meter | controller | Per control channel send message rate |
| ctrl.tx.msgsize | Histogram | controller | Per control channel send message size distribution |
| edge.invalid_api_tokens | Meter | router | Number of invalid api session tokens encountered |
| edge.invalid_api_tokens_during_sync | Meter | router | Number of invalid api session tokens encountered while a sync is in progress |
| egress.rx.bytesrate | Meter | router | Data rate of data received via xgress, originating from terminators. Per router. |
| egress.rx.msgrate | Meter | router | Message rate of data received via xgress, originating from terminators. Per router. |
| egress.rx.msgsize | Histogram | router | Message size distribution of data received via xgress, originating from terminators. Per router. |
| egress.tx.bytesrate | Meter | router | Data rate of data sent via xgress, originating from terminators. Per router. |
| egress.tx.msgrate | Meter | router | Message rate of data sent via xgress, originating from terminators. Per router. |
| egress.tx.msgsize | Histogram | router | Message size distribution of data sent via xgress, originating from terminators. Per router. |
| eventual.events | Gauge | controller | Number of background events pending processing |
| fabric.rx.bytesrate | Meter | router | Data rate of data received from fabric links |
| fabric.rx.msgrate | Meter | router | Message rate of data received from fabric links |
| fabric.rx.msgsize | Histogram | router | Message size distribution of data received from fabric links |
| fabric.tx.bytesrate | Meter | router | Data rate of data sent on fabric links |
| fabric.tx.msgrate | Meter | router | Message rate of data sent on fabric links |
| fabric.tx.msgsize | Histogram | router | Message size distribution of data sent on fabric links |
| identity.refresh | Meter | controller | How often an identity is marked, indicating that it needs a full refresh of its service list |
| identity.update-sdk-info | Histogram | controller | Time to update identity sdk info |
| ingress.rx.bytesrate | Meter | router | Data rate of data received via xgress, originating from initiators. Per router. |
| ingress.rx.msgrate | Meter | router | Message rate of data received via xgress, originating from initiators. Per router. |
| ingress.rx.msgsize | Histogram | router | Message size distribution of data received via xgress, originating from initiators. Per router. |
| ingress.tx.bytesrate | Meter | router | Data rate of data sent via xgress, originating from initiators. Per router. |
| ingress.tx.msgrate | Meter | router | Message rate of data sent via xgress, originating from initiators. Per router. |
| ingress.tx.msgsize | Histogram | router | Message size distribution of data sent via xgress, originating from initiators. Per router. |
| link.latency | Histogram | controller | Per link latency in nanoseconds |
| link.queue_time | Histogram | controller | Per link queue time (between send and write to wire) |
| link.rx.bytesrate | Meter | controller | Per link receive data rate |
| link.rx.msgrate | Meter | controller | Per link receive message rate |
| link.rx.msgsize | Histogram | controller | Per link receive message size distribution |
| link.tx.bytesrate | Meter | controller | Per link send data rate |
| link.tx.msgrate | Meter | controller | Per link send message rate |
| link.tx.msgsize | Histogram | controller | Per link send message size distribution |
| service.policy.enforcer.run | Timer | controller | How long it takes the service policy enforcer to run |
| service.policy.enforcer.run.deletes | Meter | controller | How many sessions are deleted by the service policy enforcer |
| services.list | Histogram | controller | Time to list services |
| session.create | Histogram | controller | Time to create a session |
| xgress.ack_duplicates | Meter | router | Number of duplicate acks received. Indicates over-eager retransmission |
| xgress.ack_failures | Meter | router | Number of failures sending acks |
| xgress.acks.queue_size | Gauge | router | Number of acks queued to send |
| xgress.blocked_by_local_window | Gauge | router | Number of xgress instances blocked because the windowing threshold has been exceeded locally |
| xgress.blocked_by_remote_window | Gauge | router | Number of xgress instances blocked because the windowing threshold has been exceeded remotely |
| xgress.dropped_payloads | Meter | router | Number of payloads dropped because the xgress receiver side couldn't keep up |
| xgress.retransmission_failures | Meter | router | Number of retransmission send failures |
| xgress.retransmissions | Meter | router | Number of payloads retransmitted |
| xgress.retransmits.queue_size | Gauge | router | Number of payloads queued for retransmission |
| xgress.rx.acks | Meter | router | Number of acks received |
| xgress.tx.acks | Meter | router | Number of acks sent |
| xgress.tx_unacked_payload_bytes | Gauge | router | Total payload data size that has been buffered but not acked yet |
| xgress.tx_unacked_payloads | Gauge | router | Number of payload messages that have been buffered but not yet acked |
| xgress.tx_write_time | Timer | router | Time to write payloads to the xgress receiver |
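
Several of the xgress metrics listed above describe flow-control trouble, so they lend themselves to a simple health check once their values have been collected. The sketch below is one possible way to do that. The metric names and reasons come from the descriptions above, but the `xgress_warnings` and `ns_to_ms` helpers, the dictionary-of-samples input, and the choice to flag any value above zero are assumptions for illustration, not OpenZiti recommendations.

```python
from typing import Dict, List

# Metric names are copied from the list above; the reasons paraphrase
# its descriptions. Flagging any value above zero is an assumption.
XGRESS_WARNINGS = {
    "xgress.dropped_payloads": "receiver side could not keep up",
    "xgress.ack_duplicates": "over-eager retransmission",
    "xgress.blocked_by_local_window": "local windowing threshold exceeded",
    "xgress.blocked_by_remote_window": "remote windowing threshold exceeded",
    "xgress.retransmission_failures": "retransmission sends are failing",
    "xgress.ack_failures": "ack sends are failing",
}


def xgress_warnings(sample: Dict[str, float]) -> List[str]:
    """Return one warning per watched metric whose sampled value is above zero."""
    return [
        f"{name}={sample[name]:g}: {reason}"
        for name, reason in XGRESS_WARNINGS.items()
        if sample.get(name, 0) > 0
    ]


def ns_to_ms(nanoseconds: float) -> float:
    """Convert a nanosecond reading, such as link.latency, to milliseconds."""
    return nanoseconds / 1_000_000


# Hypothetical sample values, made up for this example.
sample = {"xgress.dropped_payloads": 2, "xgress.tx.acks": 100}
print(xgress_warnings(sample))  # ['xgress.dropped_payloads=2: receiver side could not keep up']
print(ns_to_ms(2_500_000))      # 2.5
```

In practice the thresholds would be tuned per network, since a small number of retransmissions or duplicate acks may be normal under load.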