Feature Request: Persistent file storage from Hermes #683

Open
cubrink opened this issue May 29, 2024 · 2 comments

@cubrink
cubrink commented May 29, 2024

Requested feature:

Persistent file storage from Hermes

Problem:

We are using Hermes as the IO engine for ADIOS2 in scientific workflows.
Whenever our workflow finishes, Hermes has no means to store the collected data to disk and therefore the data is lost.
Because the raw data from the workflow is not saved, post-hoc analysis is not possible.

These workflows can be computationally expensive, so it is not practical to re-run experiments to regenerate the data.
Further, if the workflow is of a stochastic or chaotic nature, it may not be possible to replicate previous runs.

Proposed solution:

When Hermes is preparing to clear data, add an option so that this data can be written to disk.

For example, the default engine for ADIOS2 is the bp5 engine.
Hermes could be configured to write data it is no longer using to disk in the bp5 format.
When Hermes exits at the end of the workflow, the end user would then have access to the raw data in bp5 format.
This way the user gets the benefit of the Hermes engine while the workflow is running and can still access the results of the experiment after it has finished.
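
To make the request concrete, here is a minimal sketch of the "persist before clearing" behaviour in plain Python. It is not Hermes code: `EvictingBuffer`, `persist_dir`, and the one-file-per-entry layout are all hypothetical names and choices made for illustration, and the payload is written as raw bytes rather than in bp5 format.

```python
from collections import OrderedDict
from pathlib import Path
from typing import Optional


class EvictingBuffer:
    """Toy in-memory buffer that can persist entries before dropping them.

    Hypothetical stand-in for the buffering layer; not the Hermes API.
    """

    def __init__(self, capacity: int, persist_dir: Optional[Path] = None):
        self.capacity = capacity
        # None reproduces the current behaviour: evicted data is lost.
        self.persist_dir = persist_dir
        self._data: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, name: str, payload: bytes) -> None:
        self._data[name] = payload
        self._data.move_to_end(name)
        # Evict the oldest entries once the buffer is over capacity.
        while len(self._data) > self.capacity:
            old_name, old_payload = self._data.popitem(last=False)
            self._persist(old_name, old_payload)

    def close(self) -> None:
        """Flush everything still buffered, as at the end of the workflow."""
        while self._data:
            name, payload = self._data.popitem(last=False)
            self._persist(name, payload)

    def _persist(self, name: str, payload: bytes) -> None:
        if self.persist_dir is None:
            return
        self.persist_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: raw bytes, one file per entry. A real implementation
        # would hand the data to an ADIOS2 writer instead.
        (self.persist_dir / name).write_bytes(payload)
```

With `persist_dir` set, every entry ends up on disk either when it is evicted or when `close()` is called, which is the behaviour requested above.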

@lukemartinlogan
Collaborator

This will require adding ADIOS2 to the data stager in Hermes. I expect it to take me 2 days to implement and debug, since I'll have to revisit how ADIOS2 works. I can have a version of this for the Monday meeting.

@lukemartinlogan
Collaborator

This is a more involved process than I originally anticipated. ADIOS2 requires BeginStep and EndStep, which is difficult to combine with asynchronous data staging. By the time the stager is activated, multiple steps could be present, which makes this more complicated. There are two main questions I'm experimenting with:

  1. Can BeginStep/EndStep be called across processes when different processes are at different steps?
  2. Can BeginStep/EndStep be called out of order, e.g., BeginStep(16), BeginStep(15), BeginStep(17)?

If the answer to either of these questions is no, it will require some changes to the Hermes staging system, which I suspect will take at least a week.
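
For illustration only, one shape such a change could take if steps must be written in order is a reorder buffer that holds staged steps until every earlier step has arrived. This is a sketch under that assumption, not the Hermes stager: `StepReorderBuffer` and `write_step` are hypothetical names, and `write_step` marks where the BeginStep/EndStep pair for a step would be issued.

```python
from typing import Callable, Dict, List


class StepReorderBuffer:
    """Releases steps to a writer strictly in increasing step order.

    Hypothetical helper; `write_step` is where BeginStep/EndStep would go.
    """

    def __init__(self, write_step: Callable[[int, bytes], None], first_step: int = 0):
        self._write_step = write_step
        self._next = first_step  # next step the writer is allowed to emit
        self._pending: Dict[int, bytes] = {}

    def submit(self, step: int, payload: bytes) -> None:
        """Accept a staged step in any order; flush whatever is now contiguous."""
        if step < self._next or step in self._pending:
            raise ValueError(f"step {step} was already submitted")
        self._pending[step] = payload
        while self._next in self._pending:
            self._write_step(self._next, self._pending.pop(self._next))
            self._next += 1

    @property
    def pending_steps(self) -> List[int]:
        """Steps still waiting for an earlier step to arrive."""
        return sorted(self._pending)


if __name__ == "__main__":
    written = []
    buf = StepReorderBuffer(lambda s, p: written.append(s), first_step=15)
    for step in (16, 15, 17):  # arrival order from the example above
        buf.submit(step, b"")
    print(written)
```

The trade-off is memory: a late step keeps every later step buffered until it arrives, so a real design would also need a bound or a timeout.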
