Skip to content
This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Conditionally run npm install in tmp directory #8

Closed
wants to merge 8 commits into from

Conversation

kylebjordahl
Copy link

First off, let me say thank you for this elegant and clever solution! Having used both pkg and nexe for years, I always thought it could be as simple as this, so kudos for just getting it done. By the look of the README, you've clearly done your homework on this, and I'm excited to offer a bit of my own suggestion with this PR.

Why

I'm currently battling an issue with pkg involving some native dependencies, etc. (super common story, I'm sure). This code is part of a medium-sized Nx monorepo, which keeps things organized, but has the downside of having one large node_modules folder for multiple apps. Using their new build tool, it's possible to generate a package.json file as a build artifact which includes only the required dependencies for the particular app or library; while this is great, I don't love the idea of having to run npm install on a folder in our build artifacts tree before I bundle it with caxa.

Solution

Just before running npm prune check for the existence of a node_modules folder in the temp directory, and if one is not found, then check for either a package.json or package-lock.json, running npm ci for package-lock, and standard npm i for package. npm ci is the preferred option, as it creates a (more) deterministic environment.

Cheers, and thanks again for a great solution to this problem!

src/index.ts Outdated
// prefer using package-lock, if available
try {
if (await fs.pathExists(packageLockPath)) {
await execa("npm", ["ci"], { cwd: appDirectory }).catch();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await execa("npm", ["ci"], { cwd: appDirectory }).catch();
await execa("npm", ["ci"], { cwd: appDirectory, env: { NODE_ENV: "production" } }).catch();

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why did you add a .catch() here but not on the npm i below?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the catch (pun intended) - it was from a previous approach I had which didn't have the try blocks.

I'm actually intentionally NOT limiting it to production, because one of the example repos runs tsc as part of its prepare script, which fails without the dev dependency of typescript.

But we still run npm prune to clear out the production dependencies before bundling, so this feels acceptable to me. Perhaps we add an optional flag to force production-only installation if you know you don't need the option of dev dependencies for this step?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylebjordahl Ah, I see! Makes sense. I thought of this as an optimization, but you're right, there are situations in which installing it all is necessary. I'd say to just leave it the way you did.

}
} catch (err) {
try {
await execa("npm", ["i"], { cwd: appDirectory });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
await execa("npm", ["i"], { cwd: appDirectory });
await execa("npm", ["i"], { cwd: appDirectory, env: { NODE_ENV: "production" } });

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here as above (#8 (comment))

@kylebjordahl
Copy link
Author

Just realized I forgot to put this pr on a branch, so its getting my continued updates. If that's ok, I'm fine with leaving it, but I'll gladly re-pull from a branch to keep the PR atomic, if that's preferred @papb

@papb
Copy link

papb commented Mar 10, 2021

Just realized I forgot to put this pr on a branch, so its getting my continued updates. If that's ok, I'm fine with leaving it, but I'll gladly re-pull from a branch to keep the PR atomic, if that's preferred @papb

I would personally prefer if the PR was kept atomic, but note that I am just a random person that happen to watch this repository and suggested some improvements, @leafac is the one who really decides those things :)

That said, I would create a new branch to keep your add-filter-option commit, and fix master to stop on 6e05eca, as follows:

  • Make sure I am checked out in master with everything up to date (git pull -p)
  • git checkout -b add-filter-option
  • git push -u origin add-filter-option
  • git branch -f master 6e05eca (essentially undoing the last two commits, bringing your master back to 6e05eca, where I would have preferred it to stop at)
  • git checkout master
  • git push -f which updates this PR
  • git checkout add-filter-option
  • Continue your work

@leafac
Copy link
Owner

leafac commented Mar 10, 2021

Thanks for your work, y’all. I’ll review this carefully and merge soon…

Regarding atomic pull requests, I don’t care one way or the other 🤷‍♂️

@kylebjordahl
Copy link
Author

@papb - makes sense, I just saw you were marked as a reviewer, didn't know if maybe @leafac delegated to you or something 😄

@leafac - Thanks for taking a look! If it's alright with you, I'd actually like to get my filter option feature in as well, it's only added on the module API and not in code, so it has minimal effect on functionality; I'm using it now in my own build and it's great, as our app generates some local cache files when run in the hot-reload server, and those need to be omitted when building.

@papb
Copy link

papb commented Mar 11, 2021

@kylebjordahl Haha, anyone on GitHub can review anyone's open PRs and GitHub will automatically add that person to the list of reviewers, it's out of control from repository owners.

@leafac
Copy link
Owner

leafac commented Mar 29, 2021

I’ve noticed a trend and I think I found a more general solution for a bunch of problems.

The trend: You’re interested in running npm install on the build directory. Some people are interested in cleaning the devDependencies from the build directory (which caxa should be doing, but apparently there’s a bug there). Other people are interested in removing some directories from the build (most notably, the .git directory).

The solution I propose: Give you an option to run arbitrary commands on the build directory before it’s packaged (right here). These commands may be npm install, npm prune, rm -rf .git, and so forth. Less magic, less stuff for us to maintain in caxa, and more power to the user. Win-win, on my book.

In terms of implementation, I think we should receive a new argument over here called prepare. This argument is a function that’s called at the location I mentioned above. From the command-line, we could add a new option, also called prepare, which is a command to run, so caxa would be called with, for example, npx caxa <other arguments> --prepare "rm -rf git && npm install".

What do you think?

@fcastilloec
Copy link

fcastilloec commented Mar 30, 2021

@leafac you're using fs-extra to copy directory into appDirectory (right here). fs.copy supports a filter function as an option (see here). You can allow the user to specify this function and pass it directly to fs.copy to have a direct way to include/exclude files. This way, we don't have to copy everything and then have to delete stuff, the user can copy exactly what they want/need. You could also implement the function yourself and only takes a list of files to include/exclude from the user.

You can solve the problem specified here if you only copy what's necessary and always exclude node_modules. Instead of running npm prune you run npm ci --production or npm i --production if there's no package-lock.json. This wouldn't be conditional.
This could replace your idea to run arbitrary commands but I think it's better as a complement for it.
If both are implemented, we could do something similar to #12 (comment) directly with caxa by passing it arguments, without having to create an additional temporary directory.

I think it's a good idea to think about what's easier for the average end-user to use in addition to being easier to maintain.
Both approaches have pros and cons. Some people might prefer to run commands to delete files, while others would rather specify what's being copied. I don't know which way is more user-friendly but I would love to have both options 😀

@kylebjordahl
Copy link
Author

@leafac - I'm a huge fan of going in the direction you propose, but I would suggest that it be handled differently between the CLI and the API; while it would be convenient to supply other execa-able commands for use in a CLI (promoting non-Node usage of the package via npx), I think that leads to some unnecessary layers when using the pure JS API.

I also think that @fcastilloec has a great point about exposing copy filter, especially useful in a large monorepo where you might need to cherry-pick things.

I'm glad to rework this PR to add both a prepare and a filter prop to the argument object, and I propose that those are implemented as callback functions; then, in the CLI wrapper:

  • if the user provides --prepare, we treat it as an execa-able string
  • if the user provides --filter we treat it as a glob (perhaps we should do something like --include and --exclude?)

@pdcastro
Copy link

pdcastro commented May 13, 2021

Give you an option to run arbitrary commands on the build directory before it’s packaged [...] These commands may be npm install, npm prune, rm -rf .git, and so forth.

Yep. Just sharing the sentiment, I am not a fan of the npm dedupe line, as it produces an "unstable" npm-shrinkwrap.json file in my project, at least with npm v6. By "unstable" I mean that dedupe makes changes to npm-shrinkwrap.json that are then undone by a subsequent execution of npm install (or npm install --production, as the case may be). It may even be a npm bug, but in any case it's not something that I think caxa should "impose" on applications.

By the way, if any logic is added conditional on package-lock.json, remember npm-shrinkwrap.json as well.

Awesome project! 👏 ❤️

@pdcastro
Copy link

[...] we should receive a new argument over here called prepare. [...] From the command-line, we could add a new option, also called prepare, which is a command to run, so caxa would be called with, for example, npx caxa <other arguments> --prepare "rm -rf git && npm install".

fs.copy supports a filter function as an option (see here). You can allow the user to specify this function and pass it directly to fs.copy to have a direct way to include/exclude files.

to add both a prepare and a filter prop to the argument object, and I propose that those are implemented as callback functions; then, in the CLI wrapper:
if the user provides --prepare, we treat it as an execa-able string
if the user provides --filter we treat it as a glob (perhaps we should do something like --include and --exclude?)

Here's a different suggestion, I think simpler, instead of any of the above (no additional command line options, no filter/prepare callbacks, no include/exclude globs). The application bundled by caxa necessarily has a package.json file, right? So caxa would look for a npm script named, say, "caxa-prepare" in the application's package.json. If it exists, caxa executes npm run caxa-prepare on the the caxa --directory <dir> directory, while exporting some env vars like the CAXA_APP and CAXA_HOME (as proposed in PR #17) so that the user's caxa-prepare npm script can decide where to copy files and where to execute any further commands. The user's caxa-prepare npm script would be a replacement for the following 3 lines currently in caxa's index.ts:

  await fs.copy(directory, appDirectory);
  await execa("npm", ["prune", "--production"], { cwd: appDirectory });
  await execa("npm", ["dedupe"], { cwd: appDirectory });

If the user's package.json did not define a caxa-prepare npm script, then caxa would run a default action which I suggest to be just:

  await fs.copy(directory, appDirectory, { filter: filterFunc });

where filterFunc excludes the .git directory only. Nothing else, no npm prune nor npm dedupe. My reasoning is that this maximizes "success on first attempt". I think prune and dedupe can sometimes break things, usually because users misplaced dev/prod dependencies, but they'd think "caxa doesn't work" instead of thinking "my package.json isn't right". Once users see something that works, then they have the motivation to start digging to optimize size and speed with prune or dedupe, and incrementally fix problems.

caxa could print hints / info messages to the console, like "your package.json contains devDevependencies, consider running npm prune in your "caxa-prepare" script", and he Readme / documentation could include sample caxa-prepare scripts:

  "scripts": {
    "caxa-prepare": "node examples/prepare1.js",
    ...

@leafac
Copy link
Owner

leafac commented May 13, 2021

Hi all,

I’m just checking in to you know that I’m going to work on this soon. I maintain several open-source projects and go around spending some time on each; caxa is up next. I also started streaming my open-source work, which you may follow here: https://www.youtube.com/channel/UC_R-6HcHW5V9_FlZe30tnGA

@kylebjordahl
Copy link
Author

@leafac - I definitely understand the time constraint issue! As I said previously, I'm glad to rework this PR to cover a broader solution, just let me know which direction you'd prefer.

Re: @pdcastro's suggestion of using a convention-based solution with an NPM script, I personally think that would be a great feature, but it does not supersede the need for something provided at the API level, as it adds significant limitations/hurdles to using caxa as part of a build script; I would now need to maintain a separate file for the "hook" logic, even when I'm not invoking caxa from the CLI interface. It makes great sense as a feature to improve the convenience of the CLI (perhaps even replacing my idea of the CLI flags) but doesn't solve the deeper challenge.

@leafac
Copy link
Owner

leafac commented Jun 13, 2021

Hi all,

The changes we’ve all been wanting have landed in [email protected]. Thank you very much for this pull request and for all the investigative work that went into it. While I didn’t merge the pull request directly, we ended up with an approach very similar (and of course, inspired by) to the one y’all proposed here:

  1. The only magical command caxa runs on your build directory before packing your application is npm dedupe --production. In the versions of npm I tested (for example, 7.11.2) this command both removes development dependencies and removes duplicate dependencies. I believe this reduces the size of your average binary significantly while keeping the surprise factor to a minimum. Also, importantly, you couldn’t run this command before calling caxa because caxa itself should be a development dependency.
  2. There are new advanced command-line options for you to customize the build in all ways we discussed in this conversation.
  3. The JavaScript API for caxa gives you even more options, including filter.

Let me know how you like it 😃

@leafac leafac closed this Jun 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants