Original post

https://media.giphy.com/media/tJMVcTfzDdL1pOGxlk/giphy.gif

Hi there,

We had a bug in our production system, which is backed by Kafka. All of our topics are compacted, so we rely on key-based partitioning to ensure that all messages with the same key end up in the same partition. Because of the bug, we accidentally produced messages with the round-robin balancer, so we lost both the per-key compaction and the ordering guarantee.
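
To make the difference concrete, here is a minimal sketch of the two balancing strategies. It runs in memory without a Kafka client. The function names are hypothetical, and CRC32 is only a stand-in hash (the Java client's default partitioner uses murmur2, and other clients may differ).

```python
import zlib
from itertools import count


def key_partition(key: bytes, num_partitions: int) -> int:
    """Pick a partition from the key, so equal keys always map to the same one.

    Assumption: CRC32 is a stand-in here, not the hash a real client uses.
    """
    return zlib.crc32(key) % num_partitions


def round_robin_partitioner(num_partitions: int):
    """Return a function that ignores the key and cycles through partitions."""
    counter = count()

    def assign(_key: bytes) -> int:
        return next(counter) % num_partitions

    return assign


keys = [b"user-1", b"user-2", b"user-1", b"user-1"]

# Key-based: every b"user-1" message lands in the same partition.
keyed = [key_partition(k, 3) for k in keys]

# Round robin: b"user-1" is spread over several partitions.
assign = round_robin_partitioner(3)
scattered = [assign(k) for k in keys]

print("keyed:    ", keyed)
print("scattered:", scattered)
```

Since compaction and ordering both work per partition, a key that is spread over several partitions keeps one "latest" value in each of them, and a consumer can no longer tell which one is the newest without something like a timestamp.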

Luckily, our messages carry a timestamp, so we wrote a tool that reads all messages from a topic, orders them by timestamp, and republishes them to a new topic. The tool might also be handy for changing the number of partitions. What do you think?
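
The idea can be sketched roughly like this. It is an in-memory illustration only, not the code of the tool itself: the `Message` type, the `repartition` function, and the CRC32 hash are assumptions made for the example, and a real run would consume from and produce to Kafka.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    key: bytes
    value: bytes
    timestamp_ms: int  # assumed to be embedded in the message by the producer


def repartition(messages, num_partitions):
    """Order messages by timestamp and redistribute them by key.

    Returns one list per target partition. Because the input is sorted first,
    each partition ends up in timestamp order, and because the partition is
    derived from the key, every key lives in exactly one partition.
    """
    partitions = [[] for _ in range(num_partitions)]
    # sorted() is stable, so messages with equal timestamps keep their read order.
    for msg in sorted(messages, key=lambda m: m.timestamp_ms):
        # Assumption: CRC32 stands in for whatever hash the producer uses.
        target = zlib.crc32(msg.key) % num_partitions
        partitions[target].append(msg)
    return partitions


# Messages as they might be read from the broken topic, out of order.
messages = [
    Message(b"a", b"a-v2", 200),
    Message(b"b", b"b-v1", 150),
    Message(b"a", b"a-v1", 100),
]

result = repartition(messages, num_partitions=3)
for index, partition in enumerate(result):
    print(index, [m.value for m in partition])
```

Changing `num_partitions` is all it takes to target a topic with a different partition count, which is why the same approach could serve for resizing as well. Note that this sketch holds the whole topic in memory, which would not scale to large topics.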

https://github.com/tarent/kafka-rebalancer

This is a very early alpha version with very low test coverage, so I’m asking for early feedback on the API design, the use cases, and any missing features. There may be common cases that our tool doesn’t cover, since we developed it for our own problem.

Cheers