Monitoring Kafka

By Naveen | Last updated on October 6, 2023

How to monitor Kafka?

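Kafka uses Yammer Metrics for metrics reporting in the brokers, while the Java clients use Kafka Metrics, the clients' built-in metrics registry. Both expose their metrics over JMX. The table below lists the broker metrics worth watching, with the value each one normally has.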

Description	Normal value
Number of under-replicated partitions	0
Whether the controller is active on the broker	Exactly one broker in the cluster should report 1
Leader election rate	Non-zero only when brokers fail
Unclean leader election rate	0
Partition count	Mostly even across brokers
Leader replica count	Mostly even across brokers
ISR shrink rate	Normally 0, as is the ISR expansion rate. When a broker goes down, the ISR of some partitions shrinks; once that broker is back up and its replicas have caught up, the ISR expands again.
ISR expansion rate	Same as above
Max lag in messages between follower and leader replicas	Lag should be proportional to the maximum batch size of a produce request
Lag in messages per follower replica	Lag should be proportional to the maximum batch size of a produce request
Requests waiting in the producer purgatory	Non-zero if acks=-1 is used
Time a request waits for the followers	Non-zero for produce requests when acks=-1
Average idle fraction of the network processor threads	Between 0 and 1, ideally greater than 0.3
Average idle fraction of the request handler threads	Between 0 and 1, ideally greater than 0.3
Quota metrics per client-id	throttle-time is the time for which the client was throttled (ideally 0); byte-rate is the rate at which the client produces or consumes data, in bytes/sec
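
The normal values in the table above can be turned into simple automated checks. The following is a minimal Python sketch, not a definitive implementation: it assumes the broker metrics have already been collected (for example over JMX) into plain dictionaries, and the key names such as `under_replicated_partitions` and `network_processor_idle` are hypothetical labels chosen for this example, not Kafka's actual MBean names.

```python
from typing import Dict, List


def check_broker(metrics: Dict[str, float]) -> List[str]:
    """Return warnings for one broker's metric snapshot.

    The dictionary keys are hypothetical names used only in this sketch.
    """
    warnings = []
    if metrics.get("under_replicated_partitions", 0) != 0:
        warnings.append("under-replicated partitions present")
    if metrics.get("unclean_leader_election_rate", 0) != 0:
        warnings.append("unclean leader elections occurring")
    # Idle fractions lie between 0 and 1 and should ideally exceed 0.3.
    for key in ("network_processor_idle", "request_handler_idle"):
        idle = metrics.get(key)
        if idle is not None and idle <= 0.3:
            warnings.append(f"{key} is {idle:.2f}, expected above 0.3")
    return warnings


def check_cluster(brokers: Dict[str, Dict[str, float]]) -> List[str]:
    """Run the per-broker checks and verify that exactly one controller is active."""
    warnings = []
    for name, snapshot in brokers.items():
        warnings.extend(f"{name}: {w}" for w in check_broker(snapshot))
    active = sum(
        1 for s in brokers.values() if s.get("active_controller_count", 0) == 1
    )
    if active != 1:
        warnings.append(f"expected exactly one active controller, found {active}")
    return warnings


if __name__ == "__main__":
    # Illustrative readings, not real measurements.
    cluster = {
        "broker-1": {"active_controller_count": 1, "network_processor_idle": 0.8},
        "broker-2": {"active_controller_count": 0, "request_handler_idle": 0.1},
    }
    for line in check_cluster(cluster):
        print(line)
```

Metrics whose normal value depends on the workload, such as replica lag or the leader election rate during a broker failure, are left out here because they need a site-specific threshold.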

New producer Monitoring

Description	Metric name
Number of user threads blocked waiting for buffer memory to enqueue their records	waiting-threads
Maximum amount of buffer memory the client can use	buffer-total-bytes
Total amount of buffer memory that is currently unused	buffer-available-bytes
Fraction of time an appender waits for space allocation	bufferpool-wait-time
Average number of bytes sent per partition per request	batch-size-avg
Maximum number of bytes sent per partition per request	batch-size-max
Average compression rate of record batches	compression-rate-avg
Average time record batches spend in the record accumulator, in ms	record-queue-time-avg
Maximum time record batches spend in the record accumulator, in ms	record-queue-time-max
Average number of record sends retried per second	record-retry-rate
Average number of record sends per second that resulted in errors	record-error-rate
Maximum record size	record-size-max
Average record size	record-size-avg
Age of the current producer metadata, in seconds	metadata-age
Connections closed per second	connection-close-rate
New connections established per second	connection-creation-rate
Average number of network operations (reads or writes) per second on all connections	network-io-rate
Average number of outgoing bytes sent per second to all servers	outgoing-byte-rate
Average number of requests sent per second	request-rate
Average size of all requests sent	request-size-avg
Maximum size of any request sent	request-size-max
Bytes read per second from all sockets	incoming-byte-rate
Responses received per second	response-rate
Number of times per second the I/O layer checked for new I/O to perform	select-rate
Fraction of time the I/O thread spent waiting	io-wait-ratio
Average length of time for I/O per select call, in nanoseconds	io-time-ns-avg
Fraction of time the I/O thread spent doing I/O	io-ratio
Current number of active connections	connection-count
Average number of outgoing bytes sent per second to a node	outgoing-byte-rate
Average number of requests sent per second to a node	request-rate
Average size of the requests sent to a node	request-size-avg
Maximum size of any request sent to a node	request-size-max
Average number of bytes received per second from a node	incoming-byte-rate
Average request latency for a node, in ms	request-latency-avg
Maximum request latency for a node, in ms	request-latency-max
Responses received per second from a node	response-rate
Average number of records sent per second for a topic	record-send-rate
Average number of bytes sent per second for a topic	byte-rate
Average compression rate of record batches for a topic	compression-rate
Average number of retried record sends per second for a topic	record-retry-rate
Average number of record sends per second that resulted in errors for a topic	record-error-rate
Maximum time a request was throttled by a broker, in ms	produce-throttle-time-max
Average time a request was throttled by a broker, in ms	produce-throttle-time-avg
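
A few of the producer metrics above combine naturally into a quick health summary. The sketch below is an illustration under stated assumptions: it takes a snapshot of producer metrics as a plain dictionary keyed by the metric names from the table, and the 0.9 buffer utilization limit is an arbitrary value picked for the example, not a Kafka recommendation. How the snapshot is collected (JMX, a metrics reporter, or the client's own metrics API) is left out.

```python
from typing import Dict, List


def buffer_utilization(metrics: Dict[str, float]) -> float:
    """Fraction of the producer buffer currently in use (0.0 to 1.0)."""
    total = metrics["buffer-total-bytes"]
    available = metrics["buffer-available-bytes"]
    if total <= 0:
        raise ValueError("buffer-total-bytes must be positive")
    return (total - available) / total


def producer_warnings(
    metrics: Dict[str, float], max_utilization: float = 0.9
) -> List[str]:
    """Flag common signs of producer back-pressure or send failures.

    max_utilization is an assumed limit for this sketch; tune it per workload.
    """
    warnings = []
    used = buffer_utilization(metrics)
    if used > max_utilization:
        warnings.append(f"buffer is {used:.0%} full")
    if metrics.get("waiting-threads", 0) > 0:
        warnings.append("threads are blocked waiting for buffer memory")
    if metrics.get("record-error-rate", 0) > 0:
        warnings.append("record sends are failing")
    if metrics.get("produce-throttle-time-avg", 0) > 0:
        warnings.append("produce requests are being throttled by the broker")
    return warnings


if __name__ == "__main__":
    # Illustrative values only.
    snapshot = {
        "buffer-total-bytes": 33554432,
        "buffer-available-bytes": 16777216,
        "waiting-threads": 0,
        "record-error-rate": 0.0,
    }
    print(buffer_utilization(snapshot))
    print(producer_warnings(snapshot))
```

Deriving utilization from `buffer-total-bytes` and `buffer-available-bytes` gives an earlier signal than `waiting-threads`, which only becomes non-zero once the buffer is already exhausted.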

On the consumer side, keep the max lag below a threshold and make sure the minimum fetch rate stays above 0.
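
The consumer rule can be written as a single predicate. This is a minimal sketch: the function name `consumer_is_healthy` and its parameters are hypothetical, and the lag threshold is something you choose for your own workload, since the article does not prescribe a value.

```python
def consumer_is_healthy(
    max_lag: float, min_fetch_rate: float, lag_threshold: float
) -> bool:
    """Return True when lag is under the threshold and the consumer is still fetching.

    max_lag and min_fetch_rate are readings taken from the consumer's metrics;
    lag_threshold is an operator-chosen limit (an assumption of this sketch).
    """
    return max_lag < lag_threshold and min_fetch_rate > 0


if __name__ == "__main__":
    # Illustrative readings: 120 messages behind, fetching 4.5 times per second.
    print(consumer_is_healthy(max_lag=120, min_fetch_rate=4.5, lag_threshold=1000))
```

A fetch rate of 0 fails the check even when the lag is small, because a consumer that has stopped fetching will fall behind as soon as new messages arrive.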