Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete tasks on stanbdy cluster for workflows that don't exist anymore #6549

Conversation

fimanishi
Copy link
Member

What changed?
Allow for completion of tasks on standby cluster if the workflow doesn't exist anymore

Why?
If the retention period after workflow completion has passed, the workflow will be deleted from the database. If tasks from these workflows were not already completed (because they were already there before the implementation of the feature) they will block completion of future tasks.

How did you test it?
Tested locally and will validate again in a staging environment.

Potential risks
Before tasks are pushed to the matching, it's checked to see if the workflow execution is present. Because we are checking for a specific error here, I don't think there is risk added.

Release notes

Documentation Changes

@fimanishi fimanishi marked this pull request as ready for review December 9, 2024 21:33
Copy link
Member

@davidporter-id-au davidporter-id-au left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. Suggested we roll it out carefully. Forseeably a problem with querying the executions table without consistency ALL might be a little unsafe to remove tasks on, but the risk seems fairly slight.

@fimanishi fimanishi requested a review from Groxx as a code owner December 13, 2024 17:46
errTaskNotStarted = errors.New("task not started")
errDomainIsActive = &DomainIsActiveInThisClusterError{Message: "domain is active"}
historyServiceOperationRetryPolicy = common.CreateTaskCompleterRetryPolicy()
errWorkflowExecutionInfoIsNil = errors.New("workflow execution info is nil")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urgh, not for this PR, but we probably want to move away from this pattern:

  • the unexported errors we can't check against
  • I think, generally, exporting a type error is preferable than a static var error (ie
type ErrDomainNotActive struct {
  Message string 
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will refactor in a later PR

Copy link
Member

@davidporter-id-au davidporter-id-au left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@fimanishi fimanishi merged commit c2a8701 into cadence-workflow:master Dec 13, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants