Counting the number of messages stored in a Kafka topic

I'm using version 0.9.0.0 of Kafka and I want to count the number of messages in a topic without using the admin script kafka-console-consumer.sh.

I have tried all the commands in the answers to "Java, How to get number of messages in a topic in apache kafka", but none of them yields a result. Can anyone help me out here?

You could try to execute the command below:

bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092,localhost:9093,localhost:9094 --topic test-topic --time -1

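Then sum the offsets reported for each partition. With --time -1 these are the latest offsets, so the sum matches the message count only while every partition still starts at offset 0 and the topic is not compacted; otherwise subtract the earliest offsets (--time -2) per partition.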
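The per-partition arithmetic can be sketched in Java as follows. This is only an illustration, assuming the tool prints one "topic:partition:offset" line per partition (the same layout the awk one-liners on this page rely on). The class name OffsetDiff, its methods, and the sample offsets are made up for the example; it takes the output of one run with --time -1 and one with --time -2 and sums the differences.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OffsetDiff {

    // Parses "topic:partition:offset" lines into a partition -> offset map.
    static Map<Integer, Long> parse(List<String> lines) {
        Map<Integer, Long> offsets = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines in the tool output
            }
            String[] fields = trimmed.split(":");
            // The last two fields are the partition id and the offset.
            int partition = Integer.parseInt(fields[fields.length - 2]);
            long offset = Long.parseLong(fields[fields.length - 1]);
            offsets.put(partition, offset);
        }
        return offsets;
    }

    // Sums (latest - earliest) over all partitions found in the latest output.
    static long totalMessages(List<String> latestLines, List<String> earliestLines) {
        Map<Integer, Long> latest = parse(latestLines);
        Map<Integer, Long> earliest = parse(earliestLines);
        long total = 0;
        for (Map.Entry<Integer, Long> entry : latest.entrySet()) {
            // Assumption: a partition missing from the earliest output starts at 0.
            long begin = earliest.getOrDefault(entry.getKey(), 0L);
            total += entry.getValue() - begin;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical output of --time -1 (latest) and --time -2 (earliest).
        List<String> latest = Arrays.asList(
                "test-topic:0:150", "test-topic:1:200", "test-topic:2:120");
        List<String> earliest = Arrays.asList(
                "test-topic:0:50", "test-topic:1:0", "test-topic:2:20");
        System.out.println(totalMessages(latest, earliest));
    }
}
```

With the sample values this prints 400 (100 + 200 + 100). For a compacted topic the result is still only an upper bound on the number of messages.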

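Update: Java implementation (this needs a newer client than 0.9.0.0: beginningOffsets/endOffsets were added in 0.10.1.0 and poll(Duration) in 2.0.0)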

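import java.time.Duration;
import java.util.Arrays;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

Properties props = new Properties();
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
......
try (final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Arrays.asList("your_topic"));
    // Poll until the group coordinator has assigned partitions to this consumer.
    Set<TopicPartition> assignment;
    while ((assignment = consumer.assignment()).isEmpty()) {
        consumer.poll(Duration.ofMillis(100));
    }
    // Fetch the latest and earliest offsets of every assigned partition.
    final Map<TopicPartition, Long> endOffsets = consumer.endOffsets(assignment);
    final Map<TopicPartition, Long> beginningOffsets = consumer.beginningOffsets(assignment);
    assert (endOffsets.size() == beginningOffsets.size());
    assert (endOffsets.keySet().equals(beginningOffsets.keySet()));

    // Sum (end - beginning) over all partitions.
    long totalCount = beginningOffsets.entrySet().stream().mapToLong(entry -> {
            TopicPartition tp = entry.getKey();
            long beginningOffset = entry.getValue();
            long endOffset = endOffsets.get(tp);
            return endOffset - beginningOffset;
        }).sum();
    System.out.println(totalCount);
}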

You can sum up all the counts with this:

.../bin/kafka-run-class kafka.tools.GetOffsetShell --broker-list <<broker_1>>:9092,<<broker_2>>:9092... --topic <<your_topic_name>> --time -1 | while IFS=: read topic_name partition_id number; do echo "$number"; done | paste -sd+ - | bc

Technically speaking, you can simply consume all messages from the topic and count them:

Example:

kafka-run-class.sh kafka.tools.SimpleConsumerShell --broker-list localhost:9092 --topic XYZ --partition 0

However, the kafka.tools.GetOffsetShell approach gives you offsets, not the actual number of messages in the topic. If the topic gets compacted, you will get two different numbers depending on whether you count messages by consuming them or by reading offsets.
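To see why the two numbers can differ, here is a small, self-contained Java sketch that simulates compaction in memory. It is a toy model, not Kafka's actual cleaner: it assumes compaction keeps only the most recent record for each key and that the beginning offset stays at 0. The class name CompactionDemo and the sample keys are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompactionDemo {

    // Toy compaction: for every key, keep only the offset of its latest record.
    static Map<String, Long> compact(String[] keysInOffsetOrder) {
        Map<String, Long> latestOffsetByKey = new LinkedHashMap<>();
        for (int offset = 0; offset < keysInOffsetOrder.length; offset++) {
            latestOffsetByKey.put(keysInOffsetOrder[offset], (long) offset);
        }
        return latestOffsetByKey;
    }

    // Count derived from offsets only: end offset minus beginning offset.
    static long offsetBasedCount(long beginningOffset, long endOffset) {
        return endOffset - beginningOffset;
    }

    public static void main(String[] args) {
        // Six records were written to one partition, at offsets 0..5, with these keys.
        String[] keys = {"a", "b", "a", "c", "b", "a"};

        long beginningOffset = 0;
        long endOffset = keys.length; // the end offset is the offset of the next record

        Map<String, Long> survivors = compact(keys);

        System.out.println("offset-based count: "
                + offsetBasedCount(beginningOffset, endOffset));
        System.out.println("records left to consume: " + survivors.size());
    }
}
```

In this model the offset-based count is 6, while a consumer reading the compacted partition would see only 3 records, so the offset difference is the larger (or equal) number.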

Topic compaction: https://kafka.apache.org/documentation.html#design_compactionbasics

You can also do this using awk and a simple loop:

sum=0; for i in $(kafka-run-class kafka.tools.GetOffsetShell --broker-list broker:9092 --time -1 --topic topic_name | awk -F : '{print $3}'); do sum=$((sum+i)); done; echo "$sum"

Comments
  • Do you want it to work for compacted topics as well? That eliminates a bunch of options, like comparing the beginning and latest offsets.
  • See my answer here for a solution using the Java client.
  • You should rather compute the sum of the differences between the latest and earliest offsets. The --time -2 parameter gives the earliest ones.
  • Can you please provide a java implementation of the same thing?
  • Thanks! A bit simpler summing: kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list $KAFKA_CLUSTER_HOSTS --topic $TOPIC_NAME --time -1 | tr ":" " " | awk '{ sum += $3 } END { print sum }'
  • @ozma instead of tr you can also use awk -F: :D
  • Reading potentially untold numbers (millions?) of messages off a Kafka topic (where messages persist until purged, unlike JMS, where they persist only until read) is not viable unless time is not a concern.
  • Which count could potentially be higher: the offset number or the number of messages consumed? I guess the first?