Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: allow monkey-patching internal fs binding #39711

Closed
wants to merge 1 commit into from

Conversation

zcbenz
Copy link
Contributor

@zcbenz zcbenz commented Aug 9, 2021

For third party embedders that implement virtual filesystems, they must override methods of internalBinding('fs') to monkey-patch the module system, and to achieve so they must patch Node. Here are a few of them:

Pkg:
https://github.com/vercel/pkg-fetch/blob/main/patches/node.v16.4.1.cpp.patch#L214-L234
Electron:
https://github.com/electron/electron/blob/063ac1971244e9d0411a6046e4840c975e38f726/patches/node/refactor_allow_embedder_overriding_of_internal_fs_calls.patch
Yode:
yue/node@7240685

Since every embedder is doing this, I think it makes sense to have the change in Node.

/cc @nodejs/embedders

@nodejs-github-bot nodejs-github-bot added the needs-ci PRs that need a full CI run. label Aug 9, 2021
@codebytere codebytere added the embedding Issues and PRs related to embedding Node.js in another project. label Aug 9, 2021
@aduh95 aduh95 requested a review from bmeck August 9, 2021 08:51
@addaleax
Copy link
Member

addaleax commented Aug 9, 2021

Two things:

  1. If this lands, a PR to revert this would also be approved, since it only affects internals. You need to at least add a test to prevent that.
  2. This is not a great long-term solution, since it still means that embedders rely on internals that can break at any second. It’s probably worth spending some time on thinking about what a proper API for this would look like.

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

@zcbenz
Copy link
Contributor Author

zcbenz commented Aug 9, 2021

I think I can write a test like this:

const fs = require('fs')
const fsBinding = process.binding('fs')

override(fs.readFileSync)
override(fsBinding.internalModuleStat)
override(fsBinding.internalModuleReadJSON)

assert(require('/virtual/path'), {...})

But it would involve some internals and I can see it might cause headaches when someone wants to refactor related code in future. Would a test like this be fine?

I had tried to find a better solution and the virtual filesystem code had been refactored for a few times, but I could not figure out a better API other than monkey-patching fs.

I understand this change is fragile and might be broken at anytime. But I still create a pull request because it hurts nothing and will make some people's work easier, I don't really plan to put the burden of maintaining a virtual filesystem API on Node.js.

@addaleax
Copy link
Member

addaleax commented Aug 9, 2021

Would a test like this be fine?

I can’t speak for everyone here, but it’s about what I had in mind, yeah.

I don't really plan to put the burden of maintaining a virtual filesystem API on Node.js.

Well … you specifically pointed out that this is actually a common use case for embedders :) I do think it’s worth thinking about what this API could look like and how Node.js could maintain it (e.g.: Allow hooking out all libuv fs calls? All libuv calls altogether?).

@merceyz
Copy link
Member

merceyz commented Aug 9, 2021

Something similar was attempted for the ESM to CJS code path in #39513 but it seems monkey-patching isn't something Node wants to happen but rather have APIs for use cases like this.

Ref #33423 which is about these specific methods

@addaleax
Copy link
Member

addaleax commented Aug 9, 2021

Yeah, but I could imagine that providing lower-level hooks isn’t something that’s quite out of the question – that is, not JS monkey-patching, but an actual C++ embedder API for this. The libuv API is very stable, so that’s an obvious candiate; instead of Node.js calling the methods directly, it could call functions from a user-provided table of functions that implement the libuv interface (and which would default to the actual libuv methods), for example.

@bmeck
Copy link
Member

bmeck commented Aug 9, 2021

I think a libuv indirection layer might be nice. we do have talks about problems monkey patching in Diagnostics, Loaders, Modules, and other working groups so having an official API / replacement via replacement instead seems good.

@zcbenz
Copy link
Contributor Author

zcbenz commented Aug 10, 2021

So reading from #39513 and #33423 the internals of Node are not supposed to be affected by monkey-patching fs, and I think it makes sense.

On the road to C++ embedder API, how do you think about something like this?

// node.h
class UvDelegate {
 public:
  explicit UvDelegate(node::Environment* env);

  virtual int uv_fs_open(...);
  virtual int uv_fs_close(...);
};

// node.cc
void InternalModuleReadJSON() {
  ...
  int fd = env->delegate()->uv_fs_open();
  ...
}

// user.cc
class EmbedderUvDelegate : node::UvDelegate {
  int uv_fs_open(...) override;
  int uv_fs_close(...) override;
};

node::CreateEnvironment(..., delegate, ...);

or something simpler:

// node.h
struct UvAPIs {
  UvFsOpenFunc uv_fs_open = uv_fs_open;
  UvFsCloseFunc uv_fs_close = uv_fs_close;
  ...
};

// user.cc
UvAPIs uv_apis;
uv_apis.uv_fs_open = ...;
uv_apis.uv_fs_close = ...;

node::CreateEnvironment(..., uv_apis, ...);

It should be enough for Electron to implement virtual filesystem, but I'm not sure whether it would be useful for other embedders, for example I don't know if the use case of #39513 could be meet with this API. /cc @merceyz

@zcbenz
Copy link
Contributor Author

zcbenz commented Aug 10, 2021

Also @jesec since pkg would also be a potential user of the API.

@addaleax
Copy link
Member

On the road to C++ embedder API, how do you think about something like this?

Yeah, that’s about what I had in mind. There are a few tricky things here from an implementation perspective, e.g. one the one hand, uv_fs_t is a fixed-size, fixed-layout structure – should we give up one of the reserved fields for embedder use? and on the other hand, how do we do this in a sufficiently ABI-stable way? But the general idea is 👍

@zcbenz
Copy link
Contributor Author

zcbenz commented Aug 11, 2021

I'm closing this PR and will work on a C++ API later.

@zcbenz zcbenz closed this Aug 11, 2021
@merceyz
Copy link
Member

merceyz commented Aug 11, 2021

for example I don't know if the use case of #39513 could be meet with this API. /cc @merceyz

If it can be registered at runtime from JS then it could work for us, you can look at it like jest mocking the filesystem, it doesn't ship it's own version of Node but changes the current environment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embedding Issues and PRs related to embedding Node.js in another project. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants