ihs: catch NoHostsError #5195
- Read code
- Checked out, ran new tests
- Proof of the pudding test: played with the code.
caplog.set_level(logging.WARN, 'cylc')
host_stats, data = _get_metrics(['not-a-host'], None)
# a warning should be logged
assert len(caplog.records) == 1
Any warning at all?
Yup.
Note: This warning could only come from _get_metrics
which is a small simple function.
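If a stricter check were ever wanted here, the assertion could filter on log level and message text instead of counting every record. A minimal sketch, assuming the warning text mentions the host name (an assumption: the actual message is not shown in this thread). `matching_warnings` is a hypothetical helper, not part of cylc-flow:

```python
import logging


def matching_warnings(records, fragment):
    """Return the WARNING-level records whose message contains ``fragment``.

    ``records`` is any iterable of ``logging.LogRecord`` objects, for
    example pytest's ``caplog.records``.
    """
    return [
        record
        for record in records
        if record.levelno == logging.WARNING
        and fragment in record.getMessage()
    ]


# Inside the test, the bare length check could then become (assuming the
# message names the uncontactable host):
#     assert len(matching_warnings(caplog.records, 'not-a-host')) == 1
```

Given how small `_get_metrics` is, the plain `len(caplog.records) == 1` check is arguably sufficient; the filter only matters if other warnings could appear on the same logger.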
I have found that this does not address Dave's NoHostsError during shutdown:
Traceback (most recent call last):
  File "~/cylc-flow/cylc/flow/scheduler.py", line 1742, in _shutdown
    self.proc_pool.terminate()
  File "~/cylc-flow/cylc/flow/subprocpool.py", line 345, in terminate
    self.process()
  File "~/cylc-flow/cylc/flow/subprocpool.py", line 224, in process
    self._proc_exit(
  File "~/cylc-flow/cylc/flow/subprocpool.py", line 206, in _proc_exit
    self._run_command_exit(
  File "~/cylc-flow/cylc/flow/subprocpool.py", line 529, in _run_command_exit
    res = _run_callback(callback_255, callback_255_args)
  File "~/cylc-flow/cylc/flow/subprocpool.py", line 481, in _run_callback
    callback(ctx, *args_)
  File "~/cylc-flow/cylc/flow/task_job_mgr.py", line 820, in _poll_task_jobs_callback_255
    self._manip_task_jobs_callback(
  File "~/cylc-flow/cylc/flow/task_job_mgr.py", line 806, in _manip_task_jobs_callback
    summary_callback(workflow, itask, ctx, line)
  File "~/cylc-flow/cylc/flow/task_job_mgr.py", line 829, in _poll_task_job_callback_255
    self.poll_task_jobs(workflow, [itask])
  File "~/cylc-flow/cylc/flow/task_job_mgr.py", line 210, in poll_task_jobs
    self._run_job_cmd(
  File "~/cylc-flow/cylc/flow/task_job_mgr.py", line 948, in _run_job_cmd
    host = get_host_from_platform(
  File "~/cylc-flow/cylc/flow/platforms.py", line 516, in get_host_from_platform
    raise NoHostsError(platform)
(Line numbers here are as on master, not this branch.)
Rebased, added a try/except for kill & poll to address the reported traceback.
Hmm, with this I now get a new traceback for an invalid ssh config
and the workflow seems to hang. Might be why
* Catch NoHostsError in the code it can occur in.
* Attempt to handle the error in these contexts.
* Mark the functions where the error can crop up for future warning.
* Fix job log retrieval retries.
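
As a rough illustration of the "catch it where it can occur" approach described in the commit message, the pattern looks something like the sketch below. Everything here is a simplified stand-in written for this example: the `NoHostsError` class, the `get_host_from_platform` signature, the dict-shaped platform, and the `poll_jobs` function are assumptions and do not reproduce the actual cylc-flow code.

```python
import logging

LOG = logging.getLogger('sketch')


class NoHostsError(Exception):
    """Stand-in for the exception in the traceback (hypothetical body)."""

    def __init__(self, platform):
        super().__init__(
            f"no hosts available for platform {platform['name']}"
        )
        self.platform = platform


def get_host_from_platform(platform, bad_hosts):
    """Stand-in selector: return the first host not known to be bad."""
    for host in platform['hosts']:
        if host not in bad_hosts:
            return host
    raise NoHostsError(platform)


def poll_jobs(platform, bad_hosts, run_command):
    """Poll jobs on ``platform``; log a warning and skip if no host is usable.

    Returns True if the command was issued, False if it was skipped.
    """
    try:
        host = get_host_from_platform(platform, bad_hosts)
    except NoHostsError as exc:
        # Handle the error at the call site rather than letting it
        # propagate up through the callback chain (e.g. during shutdown).
        LOG.warning('Skipping poll: %s', exc)
        return False
    run_command(host)
    return True
```

Whether skipping, retrying, or marking the job as failed is the right response depends on the calling context; the sketch only shows the skip-and-warn case.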
Sorted. I had made a mistake with the error handling in one place; all tests should now be passing.
Co-authored-by: Ronnie Dutta <[email protected]>
Looks good, tested, no problems found 👍
def test_get_metrics_no_hosts_error(caplog):
    """It should handle SSH errors.

    If a host is not contactable then it should be shipped.
"shipped/skipped"? (Obvious enough, I guess!)
Tests didn't run on last commit? Kicking...
Passed. Patch-diff coverage only 75%, but it'll do for now.
Check List
- CONTRIBUTING.md and added my name as a Code Contributor.
- setup.cfg and conda-environment.yml.
- CHANGES.md entry included if this is a change that can affect users