Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReadDirectory searches deep in your filesystem #254

Closed
matthewmueller opened this issue Jul 13, 2020 · 2 comments
Closed

ReadDirectory searches deep in your filesystem #254

matthewmueller opened this issue Jul 13, 2020 · 2 comments

Comments

@matthewmueller
Copy link

matthewmueller commented Jul 13, 2020

Problem

I noticed that func (fs *realFS) ReadDirectory looks quite deep into the file system. I see this as a problem in two ways:

  • Performance will suffer if you have a large enough filesystem
  • Some folks may wonder why esbuild is searching so deep

How to Reproduce

1. If you do add fmt.Println(entryPath) after entryPath:

esbuild/internal/fs/fs.go

Lines 224 to 225 in 5c6b3ab

entryPath := filepath.Join(dir, name)

2. Create an index.js:

console.log("a");

3. go install ./cmd/esbuild
4. Run esbuild index.js.

You should see something like:

/home
/usr
/tmp
...
/Users
/Users/m/Go/src/github.com/golang
/Users/m/Go/src/github.com/matthewmueller/project
console.log("a");
@evanw
Copy link
Owner

evanw commented Jul 13, 2020

This is a consequence of some performance optimizations I've added. Unfortunately node's module resolution algorithm is quite heavy in terms of file system calls because it requires many "does this file/directory exist" checks to find each path (which was kind of a mistake but we're stuck with it).

This is even worse for bundlers than it is for node because while node just checks a few things (node_modules and index with the .js, .json, and .node extensions), bundlers have a lot more implicit extensions they need to check for including .jsx, .ts, .tsx, and any additional user-configured implicit extensions.

Querying for each potential file location using stat() was really expensive for esbuild, especially in certain environments such as WSL (Windows Subsystem for Linux). I don't have the specific performance numbers anymore, but reading the entries of each directory into memory once and then querying that map instead of doing a syscall was a big speedup.

And the reason esbuild reads all the way up to the root is that each cached directory info query depends on its parent directory. This is because esbuild does path resolution in parallel and it was convenient for having access to the info of a directory to also implicitly grant you access to the info of the parent directories without further locking and coordination overhead. In practice this was very little overhead because the parent directories of the project you're in are only read once and then cached for all queries.

It's possible that all of this could be improved. It would be good to know if people are encountering any performance issues with this approach. The approach was taken because it was a big performance boost, so any change to this approach will need to be carefully analyzed to avoid causing performance regressions.

@matthewmueller
Copy link
Author

Gotcha. Thanks for sharing the details!

For some reason I thought Node fixed this, but I just double-checked and it looks like this is following Node's require algorithm, "if you don't find it in node_modules, look in the parent".

I get it now, you're doing this resolution eagerly to avoid needing to do it every time.

require-resolve → NODE_DEBUG=module node index.js
MODULE 89198: looking for "/Users/m/Go/src/github.com/matthewmueller/hack/require-resolve/index.js" in ["/Users/m/.node_modules","/Users/m/.node_libraries","/usr/local/lib/node"]
MODULE 89198: load "/Users/m/Go/src/github.com/matthewmueller/hack/require-resolve/index.js" for module "."
MODULE 89198: Module._load REQUEST babel parent: .
MODULE 89198: looking for "babel" in ["/Users/m/Go/src/github.com/matthewmueller/hack/require-resolve/node_modules","/Users/m/Go/src/github.com/matthewmueller/hack/node_modules","/Users/m/Go/src/github.com/matthewmueller/node_modules","/Users/m/Go/src/github.com/node_modules","/Users/m/Go/src/node_modules","/Users/m/Go/node_modules","/Users/m/node_modules","/Users/node_modules","/node_modules","/Users/m/.node_modules","/Users/m/.node_libraries","/usr/local/lib/node"]

Bummer!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants