Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flow=all in a restarted completed workflow #5079

Closed
hjoliver opened this issue Aug 16, 2022 · 7 comments · Fixed by #5084
Closed

flow=all in a restarted completed workflow #5079

hjoliver opened this issue Aug 16, 2022 · 7 comments · Fixed by #5084
Assignees
Labels
Milestone

Comments

@hjoliver
Copy link
Member

hjoliver commented Aug 16, 2022

https://cylc.discourse.group/t/adding-new-task-to-succeeded-workflow-explicit-insertion/520/3

[UPDATE: supersedes #5072 ]

Scenario:

Result:

  • the default flow=all trigger results in flow=none because there are no existing flows to belong to (the workflow completed already under the original graph)

So to make the new tasks flow on you have to explicitly set a flow number: cylc trigger --flow=1.

This is correct behaviour according to our flow triggering rules, but in this particular case it won't be exactly obvious to users.

Question: should new task(s) run automatically - with the default trigger - in this scenario?

If yes, we'll have to treat triggering differently in a restarted completed workflow:

  • "all existing flows" -> "all previous flows"?
  • "all existing flows" -> "a new flow"? (this would result in a new flow even if triggering a past task)
@hjoliver hjoliver added the question Flag this as a question for the next Cylc project meeting. label Aug 16, 2022
@hjoliver hjoliver added this to the cylc-8.1.0 milestone Aug 16, 2022
@hjoliver
Copy link
Member Author

hjoliver commented Aug 17, 2022

Proposal: if there are no existing flows, triggered tasks should default to [UPDATE after discussion below]all the most recent previous flows.

This can only happen:

  • on restarting a completed workflow
  • and/or if there is nothing left but active no-flow tasks

(Note we can't use the most recent previous flow of the triggered task, because the triggered task could be newly added to the graph in which case it never ran before).

@dpmatthews
Copy link
Contributor

Proposal: if there are no existing flows, triggered tasks should default to all previous flows.

I don't think this is safe. When triggering historical tasks I think it would sometimes act like a new flow and sometimes not (depending on whether you had previously triggered a flow on a branch that didn't merge back with the main flow).
Needs further thought.
Treating this a new flow may be the best we can do?

@hjoliver
Copy link
Member Author

hjoliver commented Oct 9, 2022

When triggering historical tasks I think it would sometimes act like a new flow and sometimes not (depending on whether you had previously triggered a flow on a branch that didn't merge back with the main flow).

Not sure I follow that. If we give the triggered task ALL previous flow numbers, and it is a historical task (i.e. it already ran at least once before), then it will not flow on unless the previous run of it did not spawn children (e.g. because it failed). But at least the reason it won't flow on will be clearer (as opposed to getting --flow=none without asking for that).

Treating this a new flow may be the best we can do?

It seems to me this violates your stipulation (which I now agree with, given what we've ended up with as default behaviour!) that users should not be exposed to "new flows" unless they deliberately and explicitly opt in.

Needs further thought.

OK, perhaps we should aim for consistency across scheduler shutdown/restart. On restart, interrogate the DB to discover the final task that ran just before shutdown, and give the newly triggered task the flow numbers of that task. That way, the new task behaves the same whether triggered just before shutdown (getting all current flow numbers) or just after restart.

(Not sure why I didn't suggest this in the first place ... )

@dpmatthews
Copy link
Contributor

I'd interpreted "all previous flow numbers" to mean all flow numbers from the entire workflow.
Does it actually mean all previous flow numbers from the triggered task?
If so then I think that's OK.

On restart, interrogate the DB to discover the final task that ran just before shutdown, and give the newly triggered task the flow numbers of that task.

I think this suffers the same issue I was concerned about. The final task to run might have been a new flow on an isolated branch.

@hjoliver
Copy link
Member Author

The final task to run might have been a new flow on an isolated branch

Yes, but if you manually triggered the next task before shutdown - while that final off-piste task was still running - instead of after restart, it would get that same flow number anyway. Why is that a problem? Flows don't have to be contiguous (in the graph) when manual triggering is involved.

@dpmatthews
Copy link
Contributor

Yes, but if you manually triggered the next task before shutdown - while that final off-piste task was still running - instead of after restart, it would get that same flow number anyway.

Good point - you've convinced me!

@hjoliver
Copy link
Member Author

OK, removing the "question" label, we'll go with "most recent previous flow" for the reasons discussed.

@hjoliver hjoliver removed the question Flag this as a question for the next Cylc project meeting. label Oct 13, 2022
@oliver-sanders oliver-sanders modified the milestones: cylc-8.1.0, cylc-8.2.0 Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants