fix: Make caching work correctly in async context #320
Codecov Report
@@            Coverage Diff             @@
##             main     #320      +/-   ##
==========================================
+ Coverage   81.23%   81.31%   +0.07%
==========================================
  Files          13       14       +1
  Lines        1503     1509       +6
  Branches      553      554       +1
==========================================
+ Hits         1221     1227       +6
  Misses        115      115
  Partials      167      167
Is there a way to add a test to ensure this fixes the behavior?
I tried to make some changes, but it looks like I can't push to your branch, so I left a few comments.
Force-pushed from 4b8eccb to 9b7fc6a
Okay, I revisited everything. Instead of what I did before, I extracted all file-IO-related functionality into a new class. The change in analyze was also reverted as unnecessary, because there is already a check at the beginning. I think this is a lot cleaner now. With this, the only breaking change is that the values in the file IO caches are now Promises instead of the actual results.
PS: I do not know why the GitHub UI does not work on this branch. I also cannot merge the base branch into this one via the UI.
BREAKING CHANGE: The file IO caches now contain promises instead of the actual results. Previously, the cache was only populated after the function call succeeded, which in an async context led to multiple file IO or analyze tasks being started for the same file at once. Caching the Promise instead of the result makes this problem go away.
Co-authored-by: Steven <[email protected]>
@@ -137,15 +128,9 @@ export class Job {
}, analysis === true ? {} : analysis);
}

this.fileCache = cache && cache.fileCache || new Map();
this.statCache = cache && cache.statCache || new Map();
this.symlinkCache = cache && cache.symlinkCache || new Map();
this.analysisCache = cache && cache.analysisCache || new Map();
Should analysisCache also go in the CachedFileSystem class?
It is not used there; right now it is only used in the Job class.
If we moved the cache, we would need to move the analysis functionality there too, and I'm not sure that fits into CachedFileSystem.
Great work, thanks 🎉
It seems that because the fork I created is in an organization, GitHub does not allow edits from maintainers. 😒 https://github.com/orgs/community/discussions/5634
I merged the main branch; hopefully that allows merging the PR :)
🎉 This PR is included in version 0.22.6 🎉 The release is available on: Your semantic-release bot 📦🚀
Right now the cache is only populated after the call to the fs API succeeds, which in an async context leads to multiple file IO or analyze tasks being started for the same file at once. Both file IO and analysis are async, so until the first fs call finishes, other calls for the same file can still come in and miss the cache.
To fix this, I extracted all file IO calls into a separate class that wraps them correctly and caches a Promise instead of the actual result. As a consequence, values read from the cache now always need to be awaited.
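To illustrate the difference between caching results and caching promises, here is a minimal sketch. It is not the code from this PR: `ResultCache`, `PromiseCache`, `countReads`, and the fake reader are hypothetical names, and the reader only simulates asynchronous file IO.

```typescript
// Hypothetical names throughout; this is not the code from the PR.
type Reader = (path: string) => Promise<string>;

// Caches the resolved value, so the cache is only filled after the read finishes.
class ResultCache {
  private readonly cache = new Map<string, string>();
  private readonly read: Reader;

  constructor(read: Reader) {
    this.read = read;
  }

  async get(path: string): Promise<string> {
    const cached = this.cache.get(path);
    if (cached !== undefined) return cached;
    const result = await this.read(path); // other callers can arrive while this is pending
    this.cache.set(path, result);
    return result;
  }
}

// Caches the promise itself, so the entry exists before the read finishes.
class PromiseCache {
  private readonly cache = new Map<string, Promise<string>>();
  private readonly read: Reader;

  constructor(read: Reader) {
    this.read = read;
  }

  get(path: string): Promise<string> {
    const cached = this.cache.get(path);
    if (cached !== undefined) return cached;
    const pending = this.read(path);
    this.cache.set(path, pending); // stored synchronously, before any await
    return pending;
  }
}

// Issues three concurrent lookups for one path and reports how many reads ran.
async function countReads(
  makeCache: (read: Reader) => { get(path: string): Promise<string> }
): Promise<number> {
  let reads = 0;
  const fakeRead: Reader = async (path) => {
    reads++;
    await Promise.resolve(); // stand-in for real asynchronous file IO
    return `contents of ${path}`;
  };
  const cache = makeCache(fakeRead);
  await Promise.all([cache.get("a.js"), cache.get("a.js"), cache.get("a.js")]);
  return reads;
}
```

With three concurrent lookups for the same path, `countReads` reports three reads for `ResultCache` and one for `PromiseCache`. Note that in this sketch a rejected promise stays in the cache; whether failed reads should be evicted is a separate design decision that the sketch does not address.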
As the values in the cache change, this is definitely a breaking change.
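As a rough illustration of what the breaking change could mean for code that reads from or pre-populates a shared cache, consider the sketch below. The cache value types and the helper names (`sourceLengthOld`, `sourceLengthNew`, `seed`) are assumptions made for the example; the real caches may hold other value types.

```typescript
// Assumed cache shapes for illustration; the real value types may differ.
type OldFileCache = Map<string, string | null>;
type NewFileCache = Map<string, Promise<string | null>>;

// Before: the cached value could be used directly.
function sourceLengthOld(cache: OldFileCache, path: string): number {
  const source = cache.get(path);
  return source ? source.length : 0;
}

// After: the cached value is a promise and has to be awaited first.
async function sourceLengthNew(cache: NewFileCache, path: string): Promise<number> {
  const source = await cache.get(path);
  return source ? source.length : 0;
}

// Pre-populating a cache by hand now means wrapping values in a promise.
function seed(cache: NewFileCache, path: string, source: string): void {
  cache.set(path, Promise.resolve(source));
}
```

Any caller that previously treated cache entries as plain values would need the same kind of adjustment.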
I know this is quite a big change, just let me know what you think and if I should change anything. :)
In my personal example, the number of calls to analyze went from ~2400 to ~1600 and the runtime from ~3s to ~2.7s (which includes work other than nft).