Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduces flushes during commit/checkout #1381

Merged
merged 2 commits into from
Dec 9, 2021
Merged

Reduces flushes during commit/checkout #1381

merged 2 commits into from
Dec 9, 2021

Conversation

AbhinavTuli
Copy link
Contributor

🚀 🚀 Pull Request

Checklist:

  • My code follows the style guidelines of this project and the Contributing document
  • I have commented my code, particularly in hard-to-understand areas
  • I have kept the coverage-rate up
  • I have performed a self-review of my own code and resolved any problems
  • I have checked to ensure there aren't any other open Pull Requests for the same change
  • I have described and made corresponding changes to the relevant documentation
  • New and existing unit tests pass locally with my changes

Changes

Makes commits and checkout to new branches faster

@@ -403,7 +403,8 @@ def commit(self, message: Optional[str] = None) -> None:
str: the commit id of the stored commit that can be used to access the snapshot.
"""
commit_id = self.version_state["commit_id"]
try_flushing(self)
Copy link
Contributor Author

@AbhinavTuli AbhinavTuli Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already flush after copying metas inside commit function in version_control.py so this wasn't needed

@@ -403,7 +403,8 @@ def commit(self, message: Optional[str] = None) -> None:
str: the commit id of the stored commit that can be used to access the snapshot.
"""
commit_id = self.version_state["commit_id"]
try_flushing(self)
initial_autoflush = self.storage.autoflush
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

always disable autoflush during commit and enable it back at the end if required

@@ -426,7 +428,8 @@ def checkout(self, address: str, create: bool = False) -> str:
Returns:
str: The commit_id of the dataset after checkout.
"""
try_flushing(self)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove this call as we flush during copy_metas during checkout in version_control.py if a new branch is being created, otherwise if we're checking out to an existing branch/commit, explicit flushes have been added now.

@codecov
Copy link

codecov bot commented Dec 8, 2021

Codecov Report

Merging #1381 (8cc0a57) into main (8ca1a13) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1381      +/-   ##
==========================================
- Coverage   92.83%   92.79%   -0.05%     
==========================================
  Files         174      174              
  Lines       12749    12757       +8     
==========================================
+ Hits        11836    11838       +2     
- Misses        913      919       +6     
Flag Coverage Δ
unittests 92.79% <100.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
hub/core/dataset/dataset.py 94.02% <100.00%> (+0.05%) ⬆️
hub/util/version_control.py 96.31% <100.00%> (+0.07%) ⬆️
hub/integrations/tests/test_pytorch_dataloader.py 95.69% <0.00%> (-2.16%) ⬇️
hub/integrations/pytorch/dataset.py 91.49% <0.00%> (-1.62%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ca1a13...8cc0a57. Read the comment docs.

@aliubimov aliubimov changed the base branch from main to fix/fast-query December 8, 2021 18:51
@AbhinavTuli AbhinavTuli changed the base branch from fix/fast-query to main December 8, 2021 19:34
@AbhinavTuli AbhinavTuli merged commit efc75f7 into main Dec 9, 2021
@AbhinavTuli AbhinavTuli deleted the vc/speedup branch December 9, 2021 04:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants