Document checkpoints without make_checkpoint() #2292
Comments
This is just using a regular output as a checkpoint. I guess the distinction is that in order for the "resume" feature in
workflow (with the results stored in an experiment ref instead of a regular git branch). Maybe we should just consider having an explicit flag to extend ("resume") an existing experiment branch (that may or may not have edit: although in this case |
Exactly, which makes it easier to grasp for existing DVC users and more consistent with the rest of DVC (no need to inject DVC API functions into the user code). Not that |
So If
to run the experiments continuously. BTW, yes, this works as I expected but quitting from the loop is very difficult. I had to close tmux pane. 😅 |
Yes, DVC will always generate a final checkpoint commit after running the command (assuming the workspace state has actually changed and there are changes to commit) |
Yeah I didn't even realize this was possible. Sure!
But the experiments (checkpoints) are added into a branch this way, which is quite different from regular experiments. Kind of the main difference with checkpoints!
BTW that connects with iterative/dvc/issues/5608 |
We also probably need an example in the docs and/or in https://github.com/iterative/dvc-checkpoints-mnist that shows how to use signal files to generate checkpoints. |
Meaning without |
I mean manually doing the steps in https://dvc.org/doc/api-reference/make_checkpoint#description. Torn on whether to do it in Python for consistency or in another language to show how that works. Maybe we can start with Python and possibly translate into another language later. |
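For reference, the manual steps could look roughly like the Python sketch below. It is not DVC's own implementation. It assumes the protocol is the one described at the link above: DVC sets a DVC_CHECKPOINT environment variable for checkpoint runs, DVC_ROOT points at the project root, and the signal file lives at .dvc/tmp/DVC_CHECKPOINT until DVC deletes it. The function name signal_checkpoint and its root, poll_seconds, and timeout parameters are hypothetical and exist only to make the sketch easy to try.

```python
import os
import time
from pathlib import Path


def signal_checkpoint(root=None, poll_seconds=0.1, timeout=None):
    """Ask DVC to record a checkpoint; return False when not run by DVC."""
    # Assumption: DVC sets DVC_CHECKPOINT when the stage runs as a checkpoint experiment.
    if os.getenv("DVC_CHECKPOINT") is None:
        return False

    # Assumption: DVC_ROOT holds the project root; fall back to the current directory.
    root_dir = Path(root or os.getenv("DVC_ROOT") or os.getcwd())
    signal_file = root_dir / ".dvc" / "tmp" / "DVC_CHECKPOINT"

    # Create the empty signal file that tells DVC a checkpoint is ready.
    signal_file.parent.mkdir(parents=True, exist_ok=True)
    signal_file.touch()

    # Wait for DVC to delete the file before continuing with the next iteration.
    deadline = None if timeout is None else time.monotonic() + timeout
    while signal_file.exists():
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError(f"{signal_file} was not removed in time")
        time.sleep(poll_seconds)
    return True
```

A training loop would call signal_checkpoint() after writing its checkpoint outputs for each iteration. Any other language could follow the same three steps with its own file and environment APIs.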
I put an example in https://github.com/iterative/dvc-checkpoints-mnist/tree/signal_file. Next step is probably to reference this repo in the docs or develop a more robust tutorial out of this or another scenario. Seems like checkpoints in general are not featured that prominently yet, and the different ways to implement them are not laid out clearly. Any thoughts on an approach here? |
Agree. Just not sure what the priority is. We can wait and see if people seem to need this guidance (those who don't find the |
IMHO a first step would be to reference https://github.com/iterative/dvc-checkpoints-mnist in the docs (I think it's only mentioned in |
I'm worried that users aren't very aware of checkpoints or how to use them until we better highlight and document them. As a first step, what about referencing the mnist examples in https://dvc.org/doc/user-guide/experiment-management#checkpoints-in-source-code? |
Makes sense to wait a little before assigning a priority.
Sure, we can add a paragraph. Or would it be more useful in #2373? (I can contribute to the branch). |
UPDATE: for now please see #2381 and iterative/dvc-checkpoints-mnist#3 |
#2381 seems sufficient for now. Closing this one. |
Our documentation so far in https://dvc.org/doc/command-reference/exp/run (and maybe elsewhere?) assumes that checkpoints only work with make_checkpoint() or a signal file. However, checkpoints still work without make_checkpoint() if the stage is designed to make a single checkpoint instead of multiple checkpoints. The final output will still be saved for the following iteration as long as it is marked as checkpoint: true in the dvc.yaml.
This would be great to document because:
See https://github.com/iterative/dvc-checkpoints-mnist/tree/python_agnostic for an example.
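To make the idea concrete, here is a minimal sketch of a stage script that produces one checkpoint per run and never calls make_checkpoint(). It is not taken from the linked repo. The file name model.json, the stage name train, the script name train.py, and the placeholder "training" arithmetic are all hypothetical. The only assumption about DVC is the one stated above: an output marked checkpoint: true is still present when the next iteration starts.

```python
# Hypothetical dvc.yaml for this script (shown as a comment for context):
#
#   stages:
#     train:
#       cmd: python train.py
#       outs:
#         - model.json:
#             checkpoint: true

import json
from pathlib import Path


def run_once(state_path="model.json"):
    """Load the previous checkpoint output if present, advance it, save it."""
    path = Path(state_path)
    if path.exists():
        # Resume from the output that the previous iteration left behind.
        state = json.loads(path.read_text())
    else:
        # First iteration: start from scratch.
        state = {"epoch": 0, "loss": None}

    state["epoch"] += 1
    state["loss"] = 1.0 / state["epoch"]  # placeholder for a real training step

    # Writing the output is all the script does; it makes no DVC API calls.
    path.write_text(json.dumps(state))
    return state


if __name__ == "__main__":
    print(run_once())
```

Each execution of the stage command advances the state by one step, so the script itself needs no DVC-specific code.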
@pmrowla @jorgeorpinel @dmpetrov