Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB workflow_params table: store None instead of deleting/not storing params #5618

Merged
merged 5 commits into from
Jul 11, 2023

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Jul 5, 2023

Closes #5593

Discovered in #5592 where pause_workflow and resume_workflow were happening in the same operation (reload) so were getting committed in the same transaction. The workflow ended up paused in the DB, but unpaused in the workflow state.

So instead of deleting or not storing params, store them as None (which is an empty string for the TEXT type in sqlite). In the case of is_paused store it as a boolean ('0' or '1').

Full compatibility with pre-8.2.0 workflow databases is maintained. However, 8.2.0 workflow databases will not be compatible with earlier versions of Cylc.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included
  • CHANGES.md entry included if this is a change that can affect users
  • No docs change needed

@MetRonnie MetRonnie added the bug Something is wrong :( label Jul 5, 2023
@MetRonnie MetRonnie added this to the cylc-8.2.0 milestone Jul 5, 2023
@MetRonnie MetRonnie self-assigned this Jul 5, 2023
@MetRonnie MetRonnie force-pushed the workflow-params-db branch 2 times, most recently from d8de966 to d9ab35a Compare July 5, 2023 14:27
@@ -545,22 +545,18 @@ def select_broadcast_states(self, callback, sort=None):
for row_idx, row in enumerate(self.connect().execute(stmt)):
callback(row_idx, list(row))

def select_workflow_params(self, callback):
"""Select from workflow_params.
def select_workflow_params(self) -> Iterable[Tuple[str, str]]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

cylc/flow/scheduler.py Outdated Show resolved Hide resolved
if self.options.holdcp is None:
elif key == self.workflow_db_mgr.KEY_UUID_STR:
self.uuid_str = value
LOG.info(f"+ workflow UUID = {value}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this is a negative change.

The way it was before, the string formatting only happened if the message was actually logged. With this change the string formatting happens, even if the message is not needed.

This is why the variables are provided as arguments to the logging function rather than '+ workflow UUID = %s' % value.

% formatting is still the documented advice.

https://docs.python.org/3/howto/logging.html#optimization

Copy link
Member Author

@MetRonnie MetRonnie Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting to know 👍 However this method is only called once per restart and reload so I don't think it is significant in this case. Also most of these are INFO level so will be logged in the vast majority of cases

Copy link
Member

@oliver-sanders oliver-sanders Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not asking for changes, but please leave these out of future tidying (there's no gain to converting them). Profiling is showing logging to be a surprisingly large CPU sink.

For a more modern syntax I think positional {} formatting should work the same e.g:

LOG.debug('{0} something', itask)

@MetRonnie MetRonnie force-pushed the workflow-params-db branch from d9ab35a to 8ab9f7f Compare July 5, 2023 15:31
@MetRonnie
Copy link
Member Author

MetRonnie commented Jul 6, 2023

Ah, note if you try to restart a workflow ran with this version of Cylc in an earlier version, you're going to get

CRITICAL - Workflow shutting down - int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

plus traceback. This is unavoidable

@MetRonnie MetRonnie added the db change Change to the workflow database structure label Jul 6, 2023
@oliver-sanders
Copy link
Member

This is ok for a minor release.

We aim to keep the DB forward compatible (with upgraders) within reason, but backward compatibility is not a concern, cylc play makes it hard to do this and marks the option as NOT RECOMMENDED.

@oliver-sanders oliver-sanders requested review from wxtim and removed request for hjoliver July 11, 2023 10:06
@wxtim wxtim merged commit e758cdc into cylc:master Jul 11, 2023
@MetRonnie MetRonnie deleted the workflow-params-db branch July 11, 2023 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is wrong :( db change Change to the workflow database structure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

workflow_params: potential bug when rapidly toggling parameters
3 participants