Apache Kafka

The following alerts are available for Apache Kafka. Default settings for warning and alarm thresholds, duration and whether the alert is enabled (true/false) are shown.
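As a rough illustration of how the four settings might interact, the sketch below evaluates a "High" alert from a series of samples. It is not the product's implementation. The names `AlertSetting`, `severity`, and `evaluate` are hypothetical, the duration is assumed to be in seconds, and the rule that a value must stay above a threshold for the full duration before the alert fires is an assumption. "Low" alerts would reverse the comparison, and alerts whose thresholds are NaN are treated here as having no threshold to compare against.

```python
from dataclasses import dataclass
from math import isnan
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class AlertSetting:
    """One row of the table (hypothetical container, not a product type)."""
    name: str
    warn_level: float   # NaN means no warning threshold
    alarm_level: float  # NaN means no alarm threshold
    duration: float     # assumed to be seconds
    enabled: bool


_RANK = {None: 0, "warning": 1, "alarm": 2}


def severity(setting: AlertSetting, value: float) -> Optional[str]:
    """Classify a single sample for a 'High' style alert."""
    if not isnan(setting.alarm_level) and value > setting.alarm_level:
        return "alarm"
    if not isnan(setting.warn_level) and value > setting.warn_level:
        return "warning"
    return None


def evaluate(setting: AlertSetting,
             samples: Sequence[Tuple[float, float]]) -> Optional[str]:
    """Return the highest severity sustained for `duration` up to the latest sample.

    `samples` holds (timestamp, value) pairs in ascending time order.
    """
    if not setting.enabled or not samples:
        return None
    latest_ts = samples[-1][0]
    for level in ("alarm", "warning"):
        run_start = None
        # Walk backwards to find where the trailing run at this level began.
        for ts, value in reversed(samples):
            if _RANK[severity(setting, value)] < _RANK[level]:
                break
            run_start = ts
        if run_start is not None and latest_ts - run_start >= setting.duration:
            return level
    return None
```

For example, with the defaults listed for KafkaBrokerMsgsInPerSecHigh (1600, 2000, 30, TRUE), samples of 1700, 1800, and 2100 spread over 30 seconds would yield a warning under these assumptions, because only the last sample is above the alarm level.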

 

Alert Name

WARN. LEVEL

ALARM LEVEL

DURATION

ENABLED

KafkaBrokerBytesInPerSecHigh

The number of incoming bytes per second exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerBytesOutPerSecHigh

The number of outgoing bytes per second exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerCpuPercentHigh

The CPU percentage reported by the JVM is above the limits defined for that broker.

Index Type(s): PerKafkaServer

50

75

30

FALSE

KafkaBrokerExpired

The Kafka Broker is not responding.

Index Type(s): PerKafkaServer

NaN

NaN

30

FALSE

KafkaBrokerFetchRequestsPerSecHigh

The number of fetch requests per second exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerLogFlushLatency95PHigh

The 95th percentile log flush latency exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerMemoryUsedPercentHigh

The percentage of heap memory used relative to the maximum heap available is above the limits defined for that broker.

Index Type(s): PerKafkaServer

50

75

30

FALSE

KafkaBrokerMsgsInPerSecHigh

The number of incoming messages per second exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

TRUE

KafkaBrokerNetProcAvgIdlePctHigh

The average percent idle for the network processor exceeds the threshold.

Index Type(s): PerKafkaServer

1

.3

30

FALSE

KafkaBrokerNetProcAvgIdlePctLow

The average percent idle for the network processor is below the threshold.

Index Type(s): PerKafkaServer

.05

.3

30

FALSE

KafkaBrokerOfflinePartitionCountHigh

The number of partitions without an active leader is not zero.

Index Type(s): PerKafkaServer

NaN

1

30

TRUE

KafkaBrokerProduceRequestsPerSecHigh

The number of produce requests per second exceeds the defined threshold for the broker.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerUncInLderElecsPerSecHigh

The available replicas were not in sync during leader election. Data loss has probably occurred.

Index Type(s): PerKafkaServer

1600

2000

30

FALSE

KafkaBrokerUnderReplicatedPartnsHigh

The number of under-replicated partitions is not zero.

Index Type(s): PerKafkaServer

NaN

1

30

FALSE

KafkaClusterLeadersUnbalancedHigh

The partition leaders for the cluster are not evenly distributed across the available brokers.

Index Type(s): PerKafkaCluster

10

10

30

FALSE

KafkaClusterNoActiveController

There is more than one active controller per cluster, which could indicate a split-brain error.

Index Type(s): PerKafkaCluster

NaN

NaN

30

FALSE

KafkaClusterPartitionsUnbalancedHigh

Partitions supported by the cluster are not evenly distributed across the available brokers.

Index Type(s): PerKafkaCluster

10

1

30

FALSE

KafkaClusterSplitBrain

One or more zookeepers or brokers are not acting as part of the main cluster.

Index Type(s): PerKafkaCluster

NaN

NaN

30

FALSE

KafkaClusterTopicReplicasOutOfSync

Topic partition replicas are out of sync.

Index Type(s): PerKafkaCluster

5

10

30

FALSE

KafkaConsumerBytesPerSecHigh

The consumer message load (bytes per second) exceeds the threshold.

Index Type(s): PerKafkaConsumer

1600

2000

30

FALSE

KafkaConsumerCpuPercentHigh

The CPU percentage reported by the JVM is above the limits defined for that consumer.

Index Type(s): PerKafkaConsumer

50

75

30

FALSE

KafkaConsumerExpired

The consumer is not responding.

Index Type(s): PerKafkaConsumer

NaN

NaN

30

FALSE

KafkaConsumerFetchLatencyHigh

The consumer fetch latency exceeds the threshold.

Index Type(s): PerKafkaConsumer

1600

2000

30

TRUE

KafkaConsumerFetchRateHigh

The consumer is pulling records from Kafka at a slower than expected rate.

Index Type(s): PerKafkaConsumer

1600

2000

30

FALSE

KafkaConsumerLagHigh

The consumer is falling too far behind the producer.

Index Type(s): PerKafkaConsumer

1600

2000

30

TRUE

KafkaConsumerLagIncreasing

The consumer lag rate of change is greater than zero for the specified duration, which could mean that lag is steadily increasing.

Index Type(s): PerKafkaConsumer

NaN

NaN

300

FALSE

KafkaConsumerMemoryUsedPercentHigh

The percentage of heap memory used relative to the maximum heap available is above the limits defined for that consumer.

Index Type(s): PerKafkaConsumer

50

75

300

FALSE

KafkaConsumerPartitionStalled

The consumer lag delta is not negative and the current offset delta is positive for the defined duration for a topic on a partition, which could mean that new messages are being added to the partition but the consumer is not reading them.

Index Type(s): PerKafkaConsumer

NaN

NaN

300

FALSE

KafkaConsumerRecordsConsumedRateHigh

The consumer message load (messages per second) exceeds the threshold.

Index Type(s): PerKafkaConsumer

1600

2000

30

TRUE

KafkaConsumerSlow

This alert is triggered for a topic when consumer lag delta is not negative and the current offset delta is positive for the specified duration, which could mean that the consumer is slow in reading messages.

Index Type(s): PerKafkaConsumer

NaN

NaN

300

FALSE

KafkaProducerCpuPercentHigh

The CPU percentage reported by the JVM is above the limits defined for that producer.

Index Type(s): PerKafkaProducer

50

75

30

FALSE

KafkaProducerExpired

The producer is not responding.

Index Type(s): PerKafkaProducer

NaN

NaN

30

FALSE

KafkaProducerIncomingByteRateHigh

The producer’s incoming byte rate exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaProducerIoWaitTimeMSHigh

The producer is waiting for IO longer than expected (on average).

Index Type(s): PerKafkaProducer

1600

2000

30

FALSE

KafkaProducerMemoryUsedPercentHigh

The percentage of heap memory used relative to the maximum heap available is above the limits defined for that producer.

Index Type(s): PerKafkaProducer

50

75

30

FALSE

KafkaProducerOutgoingByteRateHigh

The producer output byte rate exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaProducerRecordSendRateHigh

The producer record send rate exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaProducerRequestLatencyHigh

The producer request latency exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaProducerRequestRateHigh

The producer request rate exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaProducerResponseRateHigh

The producer response rate exceeds the threshold.

Index Type(s): PerKafkaProducer

1600

2000

30

TRUE

KafkaZookeeperAvgLatencyHigh

The average time for the zookeeper to respond to a request exceeds the threshold.

Index Type(s): PerKafkaZookeeper

1600

2000

30

TRUE

KafkaZookeeperCpuPercentHigh

The CPU percentage reported by the JVM is above the limits defined for that zookeeper.

Index Type(s): PerKafkaZookeeper

50

75

30

TRUE

KafkaZookeeperExpired

The zookeeper is not responding.

Index Type(s): PerKafkaZookeeper

NaN

NaN

30

FALSE

KafkaZookeeperMemoryUsedPercentHigh

The percentage of heap memory used relative to the maximum heap available is above the limits defined for that zookeeper.

Index Type(s): PerKafkaZookeeper

50

75

30

FALSE

KafkaZookeeperNumAliveConnsHigh

The total number of connections to a given zookeeper exceeds the threshold.

Index Type(s): PerKafkaZookeeper

1600

2000

30

TRUE

KafkaZookeeperOutstandingReqsHigh

Clients are making requests faster than the zookeeper can process them.

Index Type(s): PerKafkaZookeeper

1600

2000

30

FALSE

KafkaZookeeperRatePktsRcvdHigh

The rate at which the zookeeper is receiving packets exceeds the threshold.

Index Type(s): PerKafkaZookeeper

1600

2000

30

TRUE

KafkaZookeeperRatePktsSentHigh

The rate at which the zookeeper is sending packets exceeds the threshold.

Index Type(s): PerKafkaZookeeper

1600

2000

30

TRUE
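
The delta-based consumer alerts in the table (KafkaConsumerLagIncreasing, KafkaConsumerPartitionStalled, and KafkaConsumerSlow) are described in terms of lag and offset changes sustained over a duration. The sketch below shows one way such conditions could be checked. It is only an illustration: `LagSample`, its field names, and both functions are hypothetical, timestamps and the 300 duration are assumed to be in seconds, and the deltas are assumed to be computed between consecutive samples.

```python
from typing import Callable, NamedTuple, Sequence


class LagSample(NamedTuple):
    """One observation for a consumer on a topic partition (hypothetical shape)."""
    timestamp: float      # assumed to be seconds
    lag: int              # consumer lag at this time
    current_offset: int   # current offset at this time


def _held_for(samples: Sequence[LagSample], duration: float,
              predicate: Callable[[LagSample, LagSample], bool]) -> bool:
    """True if predicate(prev, cur) holds for every consecutive pair in a
    trailing run of samples that spans at least `duration`."""
    if len(samples) < 2:
        return False
    latest = samples[-1].timestamp
    # Walk backwards from the newest pair until the run breaks or is long enough.
    for i in range(len(samples) - 1, 0, -1):
        prev, cur = samples[i - 1], samples[i]
        if not predicate(prev, cur):
            return False
        if latest - prev.timestamp >= duration:
            return True
    return False


def lag_increasing(samples: Sequence[LagSample], duration: float = 300) -> bool:
    """Lag rate of change stays greater than zero for the duration."""
    return _held_for(samples, duration, lambda p, c: c.lag - p.lag > 0)


def lag_not_shrinking_while_offset_advances(samples: Sequence[LagSample],
                                            duration: float = 300) -> bool:
    """Lag delta is not negative and offset delta is positive for the duration."""
    return _held_for(
        samples, duration,
        lambda p, c: c.lag - p.lag >= 0 and c.current_offset - p.current_offset > 0,
    )
```

Under these assumptions, a partition whose lag stays flat while its offset keeps advancing satisfies the second condition but not the first, which matches the distinction the descriptions draw between a steadily growing lag and a consumer that is merely not catching up.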