✨ NEW: Allow user to filter/skip certain submissions #4

Open
mbercx opened this issue Sep 14, 2021 · 1 comment
Labels
enhancement New feature or request

Comments


mbercx commented Sep 14, 2021

Use cases

Currently, there is no way to skip certain extras or filter nodes from the parent_group when submitting work chains. Here are two use cases for this feature:

  • For the 3DCD runs, we typically only run structures up to a certain system size (i.e. number of sites in the unit cell).
  • Imagine that the work chains you want to submit depend on the outputs of a previous work chain. In this case you most likely only want to submit work chains whose parent work chain finished with exit status 0.

Possible approaches

Using skip_extras

Initially, the solution I had in mind was to add a skip_extras input argument: a function that takes a set of extras and returns True if that set should be skipped and False otherwise. It would first be added as an input argument to the .submit_new_batch() method and then passed on to the get_all_extras_to_submit() method:

    def submit_new_batch(self, dry_run=False, sort=True, sleep=1, skip_extras=None):
        """Submit a new batch of calculations, ensuring less than self.max_concurrent active at the same time.
        
        :param dry_run: simply return the extras that would be submitted.
        :param sort: sort the work chains by the extras before submissions.
        :param skip_extras: function that returns True in case a set of extras should be skipped, False otherwise.
        """
        to_submit = []
        extras_to_run = set(self.get_all_extras_to_submit(skip_extras)).difference(self._check_submitted_extras())
[...]

In the FromGroupSubmissionController.get_all_extras_to_submit(), for example, the function would be used to filter out the extras that didn't pass the test:

        if skip_extras is not None:
            results = [tuple(_) for _ in qbuild.all() if not skip_extras(_)]
        else:
            results = [tuple(_) for _ in qbuild.all()]

This means we have to add the extras that are required for this filtering, of course. Typically you can use the ones that uniquely define the work chain, though. The above implementation is flawed in that the skip_extras function has to rely on the index of the extra it is interested in. But this can probably be fixed.
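One possible way around the index problem is to pair each row of extras with the corresponding keys before calling the function, so that skip_extras can look up values by name. The sketch below is plain Python and does not use AiiDA; filter_extras, extra_keys, and the example keys "formula" and "number_of_sites" are hypothetical names chosen for illustration, not part of the package.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def filter_extras(
    rows: Sequence[Sequence],
    extra_keys: Sequence[str],
    skip_extras: Optional[Callable[[Dict[str, object]], bool]] = None,
) -> List[Tuple]:
    """Return the rows of extras for which skip_extras does not return True.

    Hypothetical helper: each row is zipped with `extra_keys` so that the
    user-supplied function receives a dict instead of a positional list.
    """
    kept = []
    for row in rows:
        named = dict(zip(extra_keys, row))
        if skip_extras is not None and skip_extras(named):
            continue  # the user asked to skip this set of extras
        kept.append(tuple(row))
    return kept


# Stand-in for the rows a query would return (assumed data).
rows = [["Si", 2], ["SiO2", 9], ["MgO", 2]]
keys = ("formula", "number_of_sites")

# First use case: skip structures above a certain number of sites.
kept = filter_extras(rows, keys, lambda extras: extras["number_of_sites"] > 4)
```

With this shape, the function no longer needs to know the order in which the extras are projected, only their names.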

Using filters

Another straightforward approach in the case of the FromGroupSubmissionController (where both use cases stem from) is to have a filters input that is applied to the query used to obtain the extras to submit:

    qbuild = orm.QueryBuilder()
    qbuild.append(orm.Group,
                  filters={'id': self.parent_group.pk},
                  tag='group')
    qbuild.append(orm.Node,
                  project=extras_projections,
                  tag='process',
                  with_group='group')
    results = qbuild.all()

This one doesn't require any specific extras to be present, and can deal with the second use case described above. It's a bit less general though, since these filters do not make sense for the BaseSubmissionController. Hence, adding filters as an input argument to the submit_new_batch() method is not preferable (unless we override this method in the FromGroupSubmissionController class, but that does introduce some code duplication). Perhaps it would be best to simply add these (optional) filters as an input argument to the constructor (e.g. parent_group_filters). We can even add a method to adjust these filters if needed, but typically a new submission controller is instantiated anyway.
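To make the constructor-argument idea concrete, here is a minimal sketch in plain Python. The two classes are hypothetical stand-ins, not the real BaseSubmissionController or FromGroupSubmissionController, and the equality check on dictionaries only mimics what a query with filters would do. The point is that submit_new_batch() keeps a filter-free signature in the base class, while the group-based subclass carries the optional parent_group_filters itself.

```python
from typing import Any, Dict, List, Optional, Tuple


class BaseControllerSketch:
    """Stand-in base class: knows nothing about groups or filters."""

    def get_all_extras_to_submit(self) -> List[Tuple]:
        raise NotImplementedError

    def submit_new_batch(self, dry_run: bool = True) -> List[Tuple]:
        # No `filters` argument here, so subclasses need not override this.
        return sorted(self.get_all_extras_to_submit())


class FromGroupControllerSketch(BaseControllerSketch):
    """Stand-in subclass that takes optional filters in its constructor."""

    def __init__(
        self,
        parent_nodes: List[Dict[str, Any]],
        parent_group_filters: Optional[Dict[str, Any]] = None,
    ) -> None:
        self._parent_nodes = parent_nodes  # assumed stand-in for the parent group
        self.parent_group_filters = dict(parent_group_filters or {})

    def get_all_extras_to_submit(self) -> List[Tuple]:
        # A real controller would pass the filters to the query; here simple
        # equality on dict entries imitates that behaviour.
        return [
            (node["formula"],)
            for node in self._parent_nodes
            if all(node.get(k) == v for k, v in self.parent_group_filters.items())
        ]


# Second use case: only consider parents that finished with exit status 0.
nodes = [
    {"formula": "Si", "exit_status": 0},
    {"formula": "MgO", "exit_status": 305},
]
controller = FromGroupControllerSketch(nodes, parent_group_filters={"exit_status": 0})
to_submit = controller.submit_new_batch()
```

Because parent_group_filters is a plain attribute in this sketch, it can be reassigned on an existing instance, although creating a new controller is usually just as easy.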

Both

The two approaches have their use cases, so maybe we can just implement both of them?

@mbercx mbercx added the enhancement New feature or request label Sep 14, 2021

mbercx commented Apr 22, 2023

Partially addressed by 76eb0b0: the filters field of the FromGroupSubmissionController now allows the user to filter the nodes in the parent group that are considered for submission.
