inputs/redpanda: add fetch_max_wait option #3100

Open: wants to merge 1 commit into main


@birdayz birdayz commented Dec 26, 2024

kgo.FetchMaxWait is a config option supported by franz-go. Combined with kgo.FetchMinBytes, it makes it possible to force the broker to send big batches when enough data is available, while waiting only a short time to fill the batch when there isn't.

Why add this option now?
This is especially important with the redpanda input, as it uses ordered consumption with franz-go: it only sends the next batch for a partition once the previous batch for that partition has been consumed. If the broker keeps sending very small batches, e.g. size 1, this is likely to stall batched outputs. I could reproduce this locally with a producer that sends lots of batches of size 1.
I tried to overcome this ordering limitation by batching in my output, but that doesn't work in this specific case. More records are only added to the output batch once the previous batch for the partition has been consumed, so in the extreme case of one record per Kafka batch on a single partition, I can't work around it: rpcn processes only one record at a time.

Using kgo.FetchMinBytes in combination with kgo.FetchMaxWait can solve this problem.
In any case, it is a useful tuning knob, offered not only by franz-go but also by the standard Java client.

@birdayz birdayz requested a review from Jeffail December 26, 2024 18:20
@birdayz birdayz force-pushed the jb/redpanda-input-fetch-max-wait branch from e66b830 to 416ada1 Compare December 26, 2024 18:33