Even with exactly-once delivery enabled, the state store is not kept in sync with the records being processed.
Consider the following:
- Record N is received
- The record's identity is saved to the state store (for business-level deduplication)
- Processing throws an exception, killing the node and ensuring no outgoing records are sent

Expected:
- Node is restarted
- Offset was never updated, so record N is delivered again
- State store is reset to position N-1
- Record N is reprocessed

Actual:
- Node is restarted
- Record N is reprocessed (good)
- The state store still has the state from the previous processing attempt
- sad :(
In this case, the state store makes the app treat the reprocessed record as a duplicate, when it obviously is not: the first attempt never completed.
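
The failure sequence described above can be reproduced without Kafka at all. The following is a minimal sketch in plain Python, not the app's actual code: `Broker`, `run_node` and `ProcessingCrash` are hypothetical names invented for illustration, the committed offset is modelled as a simple counter, and the state store is a `set` that outlives the crashed node in the same way a local RocksDB directory would.

```python
class ProcessingCrash(Exception):
    """Stands in for the exception that kills the node."""


class Broker:
    """Hypothetical stand-in for one topic partition plus its committed offset."""

    def __init__(self, records):
        self.records = list(records)
        self.committed_offset = 0  # index of the next record to hand out


def run_node(broker, state_store, crash_on=None):
    """Process records from the committed offset; return the keys treated as new.

    state_store is a set that survives node restarts.
    """
    processed = []
    while broker.committed_offset < len(broker.records):
        key = broker.records[broker.committed_offset]
        if key in state_store:
            # Business-level deduplication: skip keys that were already seen.
            broker.committed_offset += 1
            continue
        state_store.add(key)  # identity is saved before processing finishes
        if key == crash_on:
            raise ProcessingCrash(key)  # the offset is never committed
        processed.append(key)
        broker.committed_offset += 1
    return processed


broker = Broker(["record-N"])
store = set()

try:
    run_node(broker, store, crash_on="record-N")
except ProcessingCrash:
    pass  # node dies; the store keeps its write, the offset does not move

offset_after_crash = broker.committed_offset
result_after_restart = run_node(broker, store)  # restart, no crash this time
print(offset_after_crash, result_after_restart)  # prints: 0 []
```

The offset is still 0 after the crash, so the record is handed out again, but the restarted node finds the key already in the store and processes nothing. If the store were rolled back together with the offset (for example by clearing `store` before the restart), the second run would return `["record-N"]`.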
N.B. To clear down the RocksDB storage, run make reset.
In at least 3 consoles (in this order):
- (Console 1)
make kafka
Spins up a docker-compose with Kafka and Zookeeper, configured for exactly-once
- (Console 2)
make topology
The app will initialise its topics and await records
- (Console 3)
make client
Posts a record to Kafka - just a Guid/Guid pair - the deduplication key
- The topology will crash.
- (Console 2)
make topology
The app will re-process the message, but will wrongly recognise it as a duplicate from last time.