You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In some instances, it is necessary to store large text (eg. CSV) or binary (eg. HDF5) files which are appended to over time. This means that versioning quickly becomes impractical as DVC treats each small addition as a completely new file and stores them independently.
A better system would be to store larger files as a collection of blobs (and therefore dvc metadata would save a collection of md5 hashes corresponding to each blob). By using a collection rather than a single blob, the majority of the file is not duplicated every time a new change is made, thus saving a significant amount of space.
The text was updated successfully, but these errors were encountered:
In some instances, it is necessary to store large text (eg. CSV) or binary (eg. HDF5) files which are appended to over time. This means that versioning quickly becomes impractical as DVC treats each small addition as a completely new file and stores them independently.
A better system would be to store larger files as a collection of blobs (and therefore dvc metadata would save a collection of md5 hashes corresponding to each blob). By using a collection rather than a single blob, the majority of the file is not duplicated every time a new change is made, thus saving a significant amount of space.
The text was updated successfully, but these errors were encountered: