[Bug]: Subscription may not be cleaned by datacoord under corner cases, causing quota exceeded in milvus #15371
Closed
1 task done
Labels
kind/bug
Issues or changes related to a bug
stale
Indicates no updates for 30 days
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
Milestone
Is there an existing issue for this?
Environment
Current Behavior
Once #15353 is in, subscriptions will be cleaned up by datacoord.
However, if a node crashes after a balance task, the record of that balance task's source node may already have been deleted, so there is nowhere to find the previous nodeId. As a result, the subscription created by the crashed node cannot be unsubscribed and leaks.
When this corner case occurs, the Pulsar backlog is never consumed because of the leaked subscription, and producing to the system fails with a quota exceeded exception.
Expected Behavior
Multiple tasks handled in parallel should be handled gracefully.
I would suggest adding more states in etcd. The life cycle of a task in channel_manager should be as follows:
Steps To Reproduce
No response
Anything else?
You can work around it by using Pulsar admin to clean up the existing subscriptions.