Incorrect handling of modified files when creating a child dataset with external links on a zipped parent #1323

Open
Phanty133 opened this issue Sep 10, 2024 · 0 comments
Labels
bug Something isn't working

Describe the bug

I've found a bug when creating a new child dataset that overwrites files from its parent dataset. The new files in the child are added as external files, while the files in the parent are zipped. As a result, the modified files appear in both the link entries and the file entries, and because of how get_local_copy() is currently implemented, it always overwrites the newer external files with symlinks to the parent dataset files. The same file/link inconsistency shows up in the dataset dashboard's file/link and files-changed counters. I would have expected either 1 added and 1 modified, or 2 added and 0 modified, but it ends up showing 2 added and 1 modified.

To reproduce

Code to reproduce: Pastebin

Expected behaviour

I expect parent symlinking to be handled correctly for the zipped files that aren't overwritten by external files, and an external file that overwrites a zipped file should not itself be replaced with a symlink to the older parent dataset file.
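
To make the expected precedence concrete, here is a minimal sketch in plain Python. It is not ClearML's implementation and does not call the ClearML SDK; `Entry`, `plan_local_copy`, and the example paths and URLs are all hypothetical names made up for illustration. The only rule it models is that a child link entry should win over a parent file entry with the same relative path.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a dataset entry (not a ClearML class)."""
    relative_path: str
    source: str    # "parent_zip" or "child_link" (illustrative labels)
    location: str  # where the content would come from


def plan_local_copy(
    parent_files: Iterable[Entry], child_links: Iterable[Entry]
) -> Tuple[List[str], List[str]]:
    """Return (paths to symlink from the parent, paths to fetch as external).

    Parent entries are registered first; child link entries are applied
    afterwards, so a child entry replaces a parent entry with the same path.
    """
    resolved: Dict[str, Entry] = {e.relative_path: e for e in parent_files}
    for link in child_links:
        resolved[link.relative_path] = link  # the newer child entry wins

    symlink = sorted(p for p, e in resolved.items() if e.source == "parent_zip")
    fetch = sorted(p for p, e in resolved.items() if e.source == "child_link")
    return symlink, fetch


# Assumed scenario: the parent holds two zipped files; the child adds two
# external files, one of which overwrites a parent file.
parent = [
    Entry("a.txt", "parent_zip", "parent.zip"),
    Entry("b.txt", "parent_zip", "parent.zip"),
]
child = [
    Entry("b.txt", "child_link", "file:///external/b.txt"),  # modified
    Entry("c.txt", "child_link", "file:///external/c.txt"),  # new
]

symlink, fetch = plan_local_copy(parent, child)
print("symlink from parent:", symlink)
print("fetch as external:", fetch)
```

In this scenario only `a.txt` should end up as a symlink to the parent copy, while `b.txt` and `c.txt` should both come from their external links.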

Environment

  • Server type: Self-hosted
  • ClearML SDK Version: 1.16.4
  • ClearML Server Version: 1.16.0-494
  • Python Version: 3.10.12
  • OS: Linux

Related Discussion

Original slack message
