Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add import command #2580

Merged
merged 1 commit into from
Feb 14, 2017
Merged

Add import command #2580

merged 1 commit into from
Feb 14, 2017

Conversation

xdissent
Copy link
Contributor

Summary

Adds an import command to generate a lockfile from an existing node_modules folder, using the installed versions when it can, and warning when it can't (#682).

It does the following:

  • Checks for existing lockfile and bails if it finds one.
  • Runs verify tree check. Ensures only missing deps are optional, or dev if in production mode.
  • Resolves dependencies using the same mechanisms as install cmd, except:
    • Deps are resolved in groups, in level-order. So all root deps first, then all their deps, etc. Prevents shallow deps from being deduped against deps of deeper deps.
    • Manifests are read from node_modules, starting at parent/node_modules and working back up to cwd.
    • If no package is found on disk, it falls back to normal resolution for the package and its children. It's not an error because it's a "valid" missing dep, but we still need to resolve it for the lockfile (warns).
    • "Reverse" resolves manifest into lockfile format using special npm keys (_resolved, _shasum, gitHead).
    • Warns if a dep is deduped against a previously resolved (higher-level) dep.
  • Runs compatibility checks.
  • Resolves peers.
  • Saves lockfile.
  • Does not touch node_modules.

The goal is to make migration to yarn as easy as possible for folks who currently shrinkwrap, minimizing the differences between the lockfile and their existing tree, and informing them of any changes.

Test plan

I've added tests for some basic scenarios, but I wanted to get some feedback about the approach I took. It requires storing the installed node_modules in the fixtures, and I thought someone might have a better idea. I'm sure there are lots of edge cases in the wild, so an easy way to add import tests seems important.

@bestander
Copy link
Member

That looks cool from the first sight, I would probably use it for migrating old projects that used to check in node_modules.
Did this feature go through RFC?

My concern is that we want to keep Yarn lean enough to keep it maintainable and outsource maintenance of niche features to the community.
Did you think about building it as a separate package?

@ljharb
Copy link

ljharb commented Jan 29, 2017

This isn't a niche feature, this imo is a primary use case - starting to use yarn on an existing project.

@xdissent
Copy link
Contributor Author

Did this feature go through RFC?

It did not, and I'm sorry about that - I completely forgot that was a thing! Happy to get the ball rolling there if you'd like.

My concern is that we want to keep Yarn lean
Did you think about building it as a separate package?

Totally understandable, and yes - this represents my 4th or 5th attempt at a reliable migration tool, and all previous attempts were standalone. I'll spare you the gory details, but I determined a built-in was the only reasonable option. In general, existing support packages for this kind of thing seem to all have their own insane pitfalls, so you have to reinvent the wheel, a lot. Then you inevitably need some yarn apis in order to be accurate, but they're all private and very hard to get at from an external package (lib vs lib-legacy, roadrunner, etc). Also, verify tree check got added, which was the last big chunk of work yarn wasn't already doing.

Since a standalone is such a large/annoying undertaking, and the nature of the tool is such that most people only need to run it once and then never use it again, I honestly don't expect to see a standalone any time soon. In fact, a major motivation for this PR is that I've already used it for all my projects and wanted to get the code out there before I forgot about it forever. =)

@bestander
Copy link
Member

@xdissent, I really appreciate your effort, thanks for getting this out for the community.
Let's gather more feedback.
I think full RFC does not make sense considering that you have a solution here.
But we need to give people some high level idea about the impact of the change.
Could you post RFC template filled in as a comment here please?

@xdissent
Copy link
Contributor Author

Summary

Import command to generate a yarn lock file from an existing npm-installed node_modules folder.

Motivation

Many projects currently use npm shrinkwrap or check their node_modules into source control because they have fragile dependency trees. These projects can't easily migrate to yarn, because yarn install could produce a wildly different logical dependency tree. Not all trees can be represented by yarn's lock file format, and some valid trees will be automatically deduped upon install. These nuances and others present a significant barrier to manual migration and migration-tool development, without a deep understanding of yarn internals.

A yarn import command should generate a "best effort" yarn lock file using the versions found in node_modules according to normal require.resolve() resolution rules. In cases where the yarn resolution mechanism can't satisfy the existing dependency tree identically, the user should be warned so they may manually review the changes. The existing node_modules tree should be checked for validity beforehand and the resultant lock file should be yarn installable without any surprises (failed compatibility, unresolvable deps, auto-dedupes, etc).

Detailed design

The import command is implemented as a subclass of the install command. The significant differences are that it uses a ImportPackageResolver rather than a PackageResolver, it runs verifyTreeCheck(), and it doesn't run the linker copy-modules phase.

The ImportPackageResolver finds dependencies in level-order, rather than the tree order of PackageResolver. It does this by overriding the find() method so that child dep requests are added to a list, and are resolved when the parent requests have all finished. This was found to be important to minimize changes to the tree, since shallow deps are more likely to have already been deduped by npm.

The resolver creates ImportPackageRequests - a subclass of PackageRequest that adds warnings when it is being deduped out of the tree - and attempts to find it's manifest using a ImportResolver subclass of BaseResolver. If the manifest can't be found, it falls back to the normal PackageRequest implementation and sets a flag so any child requests skip the ImportPackageResolver behavior altogether. Since the existing tree has been verified as valid, this only happens in the case of a missing optional dependency, or a dev dependency when in production mode. A warning is issued and the request is resolved normally.

The ImportResolver reads manifests from disk by looking in the places require() would. It starts from the parent request's node_modules folder (or <config.cwd>/node_modules for root deps) and moves up the tree until it reaches config.cwd. It then parses the manifest's special npm keys into a suitable yarn _uid and _remote by looking at the type of resolver yarn would normally use to find the manifest.

The only functional code change outside of the import command is that PackageRequest.find() now calls a reportResolvedRangeMatch() method, which is a noop in its implementation. All other changes are adding exports and imports.

How We Teach This

The import name made sense to me, and I feel like it's not a very commonly used package.json script name, so hopefully collisions with people already using yarn import would be minimal. The documentation should probably (exist and) be featured prominently in new-user facing content, since it may be the first interaction some users have with yarn.

Drawbacks

There are many different npm versions in the wild, and all with potential edge cases, so maintenance may become a problem. Changes to yarn internals could also require updates or additions to the command's implementation.

Alternatives

I've tried a handful of approaches to the problem but was ultimately convinced a built-in command was the only reasonable path forward. I also think this is such a common scenario that if an external migration tool did exist, many first time users could have a negative impression of yarn if the tool didn't perform well.

Unresolved questions

  • Suggestions on testing.
  • Would it be useful to allow manual dupe resolution using the interactive reporter methods?

@bestander
Copy link
Member

bestander commented Feb 1, 2017

Great job all around, @xdissent.
Great RFC, great number of tests and really good that you got into Yarn internals.
I think the implementation is quite safe to land, not much code was touched.
Just let me review the import command once more in a couple of days and I think we are good to go (unless someone vetoes).

FYI, we are working on a JS API for plugins and wrappers, RFC should be coming soon.
Would you participate to make this command one of the first plugins?
We'll need to keep the core slim in the long run.

@xdissent
Copy link
Contributor Author

xdissent commented Feb 1, 2017

Thanks! And sure thing re: plugin. I'll watch for the RFC and chime in if I foresee any potential implementation problems given the proposed API.

@bestander bestander merged commit 4d59f5b into yarnpkg:master Feb 14, 2017
@bestander
Copy link
Member

Great work!
Keep more coming :)

@Nantris
Copy link

Nantris commented Feb 22, 2017

A million times thank you for this. I cannot explain how helpful this feature has been. I was in an npm dependency nightmare and yarn import was just the tool I needed to build a proper dependency tree from a working node_modules folder without having to resort to shrinkwrap... ::shudder::

Thank you again.

@rattrayalex-stripe
Copy link

rattrayalex-stripe commented Oct 27, 2017

I am migrating checked-in node_modules to yarn. This is an awesome tool!

Are there tips anywhere for how best to resolve errors of the nature of error "some-module#other-module" is wrong version: expected "~0.1.8", got "0.4.1"?

@vladholubiev
Copy link

@rattrayalex-stripe I've encountered this problem when my node_modules were installed from package.json with loose version like ^0.1.8 but on the machine were I ran yarn import package.json had version specified as 0.1.8 (without a ^)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants