Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Make caching work correctly in async context #320

Merged
merged 4 commits into from
Dec 14, 2022

Conversation

danez
Copy link
Contributor

@danez danez commented Nov 1, 2022

Right now the cache was only populated after the call to the fs API succeed, which in an async context was leading to multiple file IO or analyze tasks started for the same file at once. This is because both file IO are async and so until the fs call finished other calls for the same file can still come in.

To fix this I extracted all file IO calls into a separate class that wraps everything correctly and instead of caching the actual result, now a Promise is cached. So the cache now always needs to be awaited.

As the values in the cache change this is definitely a breaking change. Something like this:

const cache = {};
await nodeFileTrace([file], {
    cache,
);

-const cacheValue = cache.fileCache.get(file);
+const cacheValue = await cache.fileCache.get(file);

I know this is quite a big change, just let me know what you think and if I should change anything. :)

In my personal example the number of calls to analyze went from ~2400 to ~1600 and the runtime from ~3s to ~2.7s (which includes other work than nft)

@danez danez requested review from ijjk and styfle as code owners November 1, 2022 17:42
src/node-file-trace.ts Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Nov 3, 2022

Codecov Report

Merging #320 (fa4b781) into main (047f030) will increase coverage by 0.07%.
The diff coverage is 89.65%.

@@            Coverage Diff             @@
##             main     #320      +/-   ##
==========================================
+ Coverage   81.23%   81.31%   +0.07%     
==========================================
  Files          13       14       +1     
  Lines        1503     1509       +6     
  Branches      553      554       +1     
==========================================
+ Hits         1221     1227       +6     
  Misses        115      115              
  Partials      167      167              
Impacted Files Coverage Δ
src/fs.ts 88.46% <88.46%> (ø)
src/node-file-trace.ts 91.16% <100.00%> (+0.85%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@styfle styfle changed the title fix!: Make caching work correctly in async context fix: Make caching work correctly in async context Nov 3, 2022
Copy link
Member

@styfle styfle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to add a test to ensure this fixes the behavior?

@danez
Copy link
Contributor Author

danez commented Nov 7, 2022

I added tests, which fail on the main branch.
The tests ensure, that all the cached functions are called only once for each file.

On main:
Screenshot 2022-11-07 at 16 19 52

On this branch:
Screenshot 2022-11-07 at 16 20 07

test/unit.test.js Outdated Show resolved Hide resolved
src/node-file-trace.ts Outdated Show resolved Hide resolved
@danez danez requested review from styfle and removed request for ijjk November 23, 2022 10:51
src/node-file-trace.ts Outdated Show resolved Hide resolved
src/node-file-trace.ts Outdated Show resolved Hide resolved
Copy link
Member

@styfle styfle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make some changes but it looks like I can't push to your branch so I left a few comments

@danez danez force-pushed the cache-fix branch 2 times, most recently from 4b8eccb to 9b7fc6a Compare December 6, 2022 14:10
@danez
Copy link
Contributor Author

danez commented Dec 6, 2022

Okay, I revisited everything and instead of what I did before I extracted all file IO relevant functionality into a new class CachedFileSystem.
This makes the diff in node-file-trace a lot smaller, and also removes the need for the _internal variables there.

The change in analyze was also reverted as unnecessary because there is a check at the beginning of emitDependency that ensures it is only called once per file. The new tests also ensure it stays that way.

I think this is a lot cleaner now.

With this now the only breaking change is the values in the file IO caches changed to Promises instead of the actual results.

PS: I do not know why the github UI does not work on this branch. I can also not merge the base branch into this one via the UI.

@danez danez requested a review from styfle December 6, 2022 14:14
BREAKING CHANGE: The file-IO caches now contains promises instead of the actual result.

Previously the cache was only populated after the function call succeed which in a async
context was leading to multiple file IO or analyze tasks started for the same file
at once. Caching the Promise instead of the result makes this problem go away.
src/fs.ts Outdated Show resolved Hide resolved
src/fs.ts Outdated Show resolved Hide resolved
danez and others added 2 commits December 14, 2022 17:09
Co-authored-by: Steven <[email protected]>
Co-authored-by: Steven <[email protected]>
@@ -137,15 +128,9 @@ export class Job {
}, analysis === true ? {} : analysis);
}

this.fileCache = cache && cache.fileCache || new Map();
this.statCache = cache && cache.statCache || new Map();
this.symlinkCache = cache && cache.symlinkCache || new Map();
this.analysisCache = cache && cache.analysisCache || new Map();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should analysisCache also go in the CachedFileSystem class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not used there and right now it is only used in the job class.

If we move the cache we would need to move the analysis functionality there too, which I'm not sure fits into CachedFileSystem

Copy link
Member

@styfle styfle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thanks 🎉

@styfle styfle enabled auto-merge (squash) December 14, 2022 16:30
@styfle styfle requested a review from ijjk December 14, 2022 18:59
@danez
Copy link
Contributor Author

danez commented Dec 14, 2022

It seems because the fork I create is in an organization, GitHub does not allow edits from maintainers. 😒https://github.com/orgs/community/discussions/5634

I merged the main branch, hopefully that allows merging the PR :)

@styfle styfle merged commit 1a7f083 into vercel:main Dec 14, 2022
@github-actions
Copy link

🎉 This PR is included in version 0.22.6 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants