Found when the Kafka performance test started failing after #24879 went in. That change removed shuffle=appliance, which made reads from shuffle faster. As a result, the producer cannot digest the data within its timeout, as shown by the flood of send-failure warnings: 'Expiring 148 record(s) for beam-sdf-0:130675 ms has passed since batch creation'. However, the message itself does not contain a timestamp, so the write should be tolerant to throttling.
The error happens because producer.send is asynchronous: it attaches a system timestamp when .send is called and returns immediately. We need a mechanism to keep the pipeline from overwhelming the Kafka producer.
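One possible shape for such a mechanism is to cap the number of sends that are still in flight, so the caller blocks once the cap is reached instead of queueing more batches that then expire. The sketch below is only an illustration of that idea and is not Beam or Kafka code: `BoundedSender`, `slow_send`, `run_demo`, and `max_in_flight` are hypothetical names, and the asynchronous producer is stood in for by a small thread pool.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedSender:
    """Hypothetical wrapper that caps in-flight asynchronous sends."""

    def __init__(self, send_async, max_in_flight):
        # send_async is an assumed callable: record -> concurrent.futures.Future
        self._send_async = send_async
        self._permits = threading.BoundedSemaphore(max_in_flight)

    def send(self, record):
        # Blocks the caller while max_in_flight sends are still pending.
        self._permits.acquire()
        try:
            future = self._send_async(record)
        except Exception:
            self._permits.release()
            raise
        # Free the permit once the send completes, whether it succeeded or failed.
        future.add_done_callback(lambda _: self._permits.release())
        return future


def run_demo(num_records=50, max_in_flight=4):
    """Pushes records through a slow stand-in producer and reports peak backlog."""
    lock = threading.Lock()
    state = {"pending": 0, "peak": 0}
    pool = ThreadPoolExecutor(max_workers=2)

    def slow_send(record):
        # Stand-in for an asynchronous producer: returns at once, finishes later.
        with lock:
            state["pending"] += 1
            state["peak"] = max(state["peak"], state["pending"])

        def work():
            time.sleep(0.001)  # simulated broker latency
            with lock:
                state["pending"] -= 1
            return record

        return pool.submit(work)

    sender = BoundedSender(slow_send, max_in_flight)
    futures = [sender.send(i) for i in range(num_records)]
    results = [f.result() for f in futures]
    pool.shutdown()
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = run_demo()
    print(f"sent {len(results)} records, peak in-flight = {peak}")
```

With a cap like this, a fast upstream read slows down to the producer's pace rather than filling its buffer. Whether blocking in the write path is acceptable for the connector is a separate design question that this sketch does not answer.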
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner
@aromanenko-dev unfortunately I do not have a good idea for a short-term fix. Throttling detection is considered the long-term solution: the IO connector detects throttling and makes the runner aware of it, so the runner can avoid scaling up and possibly scale down.