apache-kafka
Kafka 0.10.1 heartbeat.interval.ms, session.timeout.ms and max.poll.interval.ms
Assuming we are talking about Kafka 0.10.1.0 or later, where each consumer instance uses two threads. One is the user thread, from which poll is called; the other is the heartbeat thread, which is dedicated to sending heartbeats. session.timeout.ms applies to the heartbeat thread. If the coordinator fails to get any heartbeat from a consumer before …
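To make the relationship between these settings concrete, here is a minimal sketch of a sanity check. The helper name `check_consumer_timeouts` and the example values are assumptions for illustration only. The rule it encodes (heartbeat.interval.ms lower than session.timeout.ms, and typically no more than a third of it) follows the usual consumer configuration guidance. max.poll.interval.ms is listed only to show that it is a separate setting, bounding the time between poll calls on the user thread.

```python
# Hypothetical helper, not part of any Kafka client library.
# All values are in milliseconds and are illustrative, not recommendations.

def check_consumer_timeouts(config):
    """Return a list of warnings for a dict of consumer timeout settings."""
    heartbeat = config["heartbeat.interval.ms"]
    session = config["session.timeout.ms"]
    warnings = []
    if heartbeat >= session:
        # The heartbeat thread must send several heartbeats per session window.
        warnings.append("heartbeat.interval.ms must be lower than session.timeout.ms")
    elif heartbeat * 3 > session:
        warnings.append("heartbeat.interval.ms is usually at most 1/3 of session.timeout.ms")
    return warnings


example = {
    "heartbeat.interval.ms": 3000,   # used by the heartbeat thread
    "session.timeout.ms": 10000,     # coordinator waits this long for a heartbeat
    "max.poll.interval.ms": 300000,  # user thread: max gap between poll calls (not checked here)
}
print(check_consumer_timeouts(example))
```

A dict like `example` could then be passed to whichever client library is in use, once the check returns no warnings.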
How to delete multiple topics in Apache Kafka
Yes, you can use regex-like expressions when deleting topics with the kafka-topics.sh tool. For example, to delete all topics starting with giorgos-:
./bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic 'giorgos-.*'
Using the Admin APIs, you can also delete several topics at once; see AdminClient.deleteTopics.
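Before running a pattern-based delete, it can help to preview which topic names the pattern would select. The sketch below does this with Python's `re` module. It assumes the pattern is matched against the whole topic name, and Python's regex dialect is not guaranteed to behave identically to the tool's. The helper `topics_matching` and the topic names are made up for illustration.

```python
import re


def topics_matching(pattern, topics):
    """Return the topic names that the pattern matches in full (assumed semantics)."""
    return [name for name in topics if re.fullmatch(pattern, name)]


# Hypothetical topic list, e.g. copied from the output of a topic listing.
topics = ["giorgos-orders", "giorgos-payments", "other-giorgos-topic", "orders"]
print(topics_matching(r"giorgos-.*", topics))
```

Only names that start with giorgos- are selected here; a name that merely contains it is left alone under the whole-name assumption.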
How many Kafka controllers are there in a cluster and what is the purpose of a controller?
The controller is one of the Kafka brokers, which is also responsible for electing partition leaders (in addition to the usual broker functionality). Is the controller just one broker? There is only one controller at a time. Internally, each broker tries to create an ephemeral node in ZooKeeper (/controller). The …
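The election idea can be illustrated with a toy model. This is a sketch under stated assumptions, not ZooKeeper's API: `ToyZooKeeper`, `create_ephemeral`, `session_expired` and `elect_controller` are hypothetical names. It models the behaviour that only the first broker to create /controller succeeds, and that an ephemeral node disappears when its owner's session ends.

```python
class ToyZooKeeper:
    """In-memory stand-in for ephemeral nodes; not a real ZooKeeper client."""

    def __init__(self):
        self.nodes = {}  # path -> id of the broker that owns the node

    def create_ephemeral(self, path, owner):
        """Create the node if it does not exist; return True on success."""
        if path in self.nodes:
            return False
        self.nodes[path] = owner
        return True

    def session_expired(self, owner):
        """Drop every ephemeral node owned by a broker whose session ended."""
        self.nodes = {p: o for p, o in self.nodes.items() if o != owner}


def elect_controller(zk, broker_ids):
    """Every broker tries to create /controller; the first attempt wins."""
    for broker_id in broker_ids:
        zk.create_ephemeral("/controller", broker_id)
    return zk.nodes["/controller"]


zk = ToyZooKeeper()
first = elect_controller(zk, [2, 1, 3])  # broker 2 tries first in this run
zk.session_expired(first)                # the controller's session ends
second = elect_controller(zk, [1, 3])    # remaining brokers try again
print(first, second)
```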
How to minimize the latency involved in the Kafka messaging framework?
I am not trying to dodge the question, but I think that Kafka is a poor choice for this use case. While I think Kafka is great (I have been a huge proponent of its use at my workplace), its strength is not low latency. Its strengths are high producer throughput and support for both fast …
Kafka: how to read from the __consumer_offsets topic
I came across this question when also trying to consume from the __consumer_offsets topic. I managed to figure it out for different Kafka versions and thought I'd share what I found.
For Kafka 0.8.2.x
Note: This uses a ZooKeeper connection.
#Create consumer config
echo "exclude.internal.topics=false" > /tmp/consumer.config
#Consume all offsets
./kafka-console-consumer.sh --consumer.config /tmp/consumer.config \
--formatter "kafka.server.OffsetManager\$OffsetsMessageFormatter" …
How to get data from old offset point in Kafka?
Consumers always belong to a group and, for each partition, ZooKeeper keeps track of the progress of that consumer group in the partition. To fetch from the beginning, you can delete all the data associated with that progress, as Hussain mentioned: ZkUtils.maybeDeletePath("${zkhost:zkport}", "/consumers/${group.id}"); You can also specify the offset of the partition you want, as …
Kafka: difference between log end offset (LEO) and high watermark (HW)
The high watermark marks the offset up to which messages are fully replicated, while the log end offset may be larger if records newly appended to the leader partition have not been replicated yet. Consumers can only consume messages up to the high watermark. See this blog post for more details: http://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/
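A small worked example may make the distinction clearer. The sketch below assumes that each replica's LEO is the offset the next record would be written at, and that the high watermark is the smallest LEO among the in-sync replicas. The replica names and offsets are made up, and `high_watermark` is a hypothetical helper, not a Kafka API.

```python
def high_watermark(in_sync_replica_leos):
    """Smallest log end offset across the in-sync replicas (assumed definition)."""
    return min(in_sync_replica_leos.values())


# Hypothetical log end offsets for one partition.
leos = {"leader": 12, "follower-1": 10, "follower-2": 11}

hw = high_watermark(leos)
leader_leo = leos["leader"]

# Records at offsets hw .. leader_leo - 1 exist on the leader
# but are not yet on every in-sync replica.
not_fully_replicated = list(range(hw, leader_leo))
print(hw, leader_leo, not_fully_replicated)
```

In this example the leader has two records that consumers cannot yet see, because follower-1 has not fetched them.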
Kafka Consumer default Group Id
If I don't set any group id in the consumer properties, what group id will the Kafka consumer be given? The Kafka consumer will not have any consumer group. Instead, you will get this error: The configured groupId is invalid. Is there a single default value? Yes, you can see the consumer.properties file of …
Kafka consumer fetching metadata for topics failed
The broker tells the client which hostname should be used to produce/consume messages. By default, Kafka uses the hostname of the system it runs on. If this hostname cannot be resolved on the client side, you get this exception. You can try setting advertised.host.name in the Kafka configuration to a hostname/address which the clients …