Reload broken in compatibility mode #5102

Closed
dpmatthews opened this issue Aug 25, 2022 · 3 comments · Fixed by #5104

Comments

@dpmatthews
Contributor

With this suite.rc:

[scheduling]
    [[graph]]
        R1 = "foo => bar"
[runtime]
    [[foo]]
        script = "echo Hello; false"
    [[bar]]

Play the workflow and let it stall.
Then reload.
Then trigger foo.
I get the following in the scheduler log:

2022-08-25T18:57:43+01:00 INFO - Processing 1 queued command(s)
2022-08-25T18:57:43+01:00 INFO - [1/foo failed job:01 flows:1] => waiting
2022-08-25T18:57:43+01:00 INFO - Command succeeded: force_trigger_tasks(['1/foo'], flow=['all'], flow_wait=False, flow_descr=None)
2022-08-25T18:57:43+01:00 INFO - [1/foo waiting job:02 flows:1] => preparing
2022-08-25T18:57:43+01:00 INFO - [1/foo preparing job:02 flows:1] host=XXX
2022-08-25T18:57:43+01:00 WARNING - stall timer stopped
2022-08-25T18:57:44+01:00 WARNING - Unhandled jobs-submit output: 2022-08-25T18:57:44+01:00|1/foo/02|0|78545
2022-08-25T18:57:44+01:00 WARNING - ('1', 'foo', '02')
2022-08-25T18:57:44+01:00 WARNING - Unhandled jobs-submit output: 2022-08-25T18:57:44+01:00|1/foo/02|[STDOUT] 78545
2022-08-25T18:57:44+01:00 WARNING - ('1', 'foo', '02')
2022-08-25T18:57:44+01:00 ERROR - [jobs-submit cmd] cylc jobs-submit --clean-env --path=/bin --path=/usr/bin --path=/usr/local/bin --path=/sbin --path=/usr/sbin --path=/usr/local/sbin -- '$HOME/cylc-run/test/reload/run1/log/job' 1/foo/02
    [jobs-submit ret_code] 1
    [jobs-submit out] 2022-08-25T18:57:44+01:00|1/foo/01|1
2022-08-25T18:57:44+01:00 INFO - [1/foo failed job:01 flows:1] (internal)submission failed at 2022-08-25T18:57:44+01:00
2022-08-25T18:57:44+01:00 CRITICAL - [1/foo failed job:01 flows:1] submission failed
2022-08-25T18:57:44+01:00 INFO - [1/foo failed job:01 flows:1] => submit-failed
2022-08-25T18:57:46+01:00 INFO - [1/foo preparing job:02 flows:1] (received)started at 2022-08-25T18:57:45+01:00

Something is going badly wrong with the job submission.
Despite the warnings, job 02 does get submitted and runs.
However, if I modify the task config and reinstall before doing the reload, job 02 does not use the updated config, so the reload is effectively broken.

The problem does not occur if I:

  1. Rename suite.rc to flow.cylc
  2. Change the graph to R1 = "foo"
@dpmatthews dpmatthews added the bug Something is wrong :( label Aug 25, 2022
@dpmatthews dpmatthews added this to the 8.0.2 milestone Aug 25, 2022
@hjoliver
Member

hjoliver commented Aug 26, 2022

It looks like we're ending up with two instances of the task proxy - job:01 and job:02 ... [sort of]

@hjoliver hjoliver self-assigned this Aug 26, 2022
@hjoliver
Member

Tentative fix coming.

@hjoliver hjoliver mentioned this issue Aug 26, 2022
@oliver-sanders
Member

oliver-sanders commented Aug 26, 2022

> It looks like we're ending up with two instances of the task proxy - job:01 and job:02 ... [sort of]

That's exactly the same cause as #4974.

This reload mechanism is quite problematic. Longer term I've suggested not replacing the TaskProxy instance, just the TaskDef it uses.
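
As a rough illustration of that idea (not Cylc's actual code), the toy sketch below keeps one proxy object per task across a reload and swaps only the definition it points at. `ToyTaskDef`, `ToyTaskProxy`, `reload_in_place` and the `pool` dict are all hypothetical names invented for this example; the real TaskProxy and TaskDef classes carry far more state and may not work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTaskDef:
    """Hypothetical stand-in for a task definition (NOT Cylc's TaskDef)."""
    name: str
    script: str


@dataclass
class ToyTaskProxy:
    """Hypothetical stand-in for a task proxy that owns the runtime state."""
    point: str
    tdef: ToyTaskDef
    status: str = "waiting"
    submit_num: int = 0

    def submit(self) -> str:
        # Each submission bumps the job number on the *same* proxy object.
        self.submit_num += 1
        self.status = "preparing"
        return f"{self.point}/{self.tdef.name}/{self.submit_num:02d}: {self.tdef.script}"


def reload_in_place(pool: dict, new_defs: dict) -> None:
    """Swap each proxy's definition; keep the proxy object and its state."""
    for proxy in pool.values():
        new_def = new_defs.get(proxy.tdef.name)
        if new_def is not None:
            proxy.tdef = new_def


# Assumed scenario mirroring the report: job 01 fails, config is edited,
# the workflow is reloaded, then the task is triggered again.
pool = {"1/foo": ToyTaskProxy("1", ToyTaskDef("foo", "echo Hello; false"))}
proxy = pool["1/foo"]
first_job = proxy.submit()
proxy.status = "failed"

reload_in_place(pool, {"foo": ToyTaskDef("foo", "echo Hello")})
second_job = pool["1/foo"].submit()
```

Because the pool never holds a second object for `1/foo`, there is only one place where the job number and status live, and the second submission picks up the edited script. Whether this maps cleanly onto the real scheduler is an open question.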

@MetRonnie MetRonnie linked a pull request Aug 30, 2022 that will close this issue
@oliver-sanders oliver-sanders modified the milestones: cylc-8.0.2, cylc-8.0.3 Sep 12, 2022