Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idea: Discard candidates that do not match a requirement's extras #11913

Closed
1 task done
notatallshaw opened this issue Mar 31, 2023 · 7 comments
Closed
1 task done

Idea: Discard candidates that do not match a requirement's extras #11913

notatallshaw opened this issue Mar 31, 2023 · 7 comments
Labels
C: dependency resolution About choosing which dependencies to install type: feature request Request for a new feature

Comments

@notatallshaw
Copy link
Member

What's the problem this feature will solve?

Currently when a requirement has an extra but a candidate contains no such extra a warning is printed: https://github.com/pypa/pip/blob/23.0.1/src/pip/_internal/resolution/resolvelib/candidates.py#L498

This scenario could happen either because:

  1. The package extra is incorrectly specified
  2. Or, when backtracking the package extra is provided for some versions of the package and not others

Both of these situations are likely not desirable, for example if an extra is chosen but backtracking goes back to the point where that extra doesn't exist likely this is not a desired version of the package.

In general by having these extras be optional there is an ambiguity in the requirements meaning a user can't be sure whether they are getting foo[bar] or just foo.

Describe the solution you'd like

Discard candidates that do not include the extra, reducing the size of the space needed to search to find a solution when backtracking.

Alternative Solutions

  • Do nothing
  • Throw exception on finding an invalid extra, forcing the user to give more specific requirements
  • Add a flag on to requirements that could be used to "unprefer" requirements that don't match the extra, so at an extra is more likely to be found

Additional context

This warning came up in many cases where I purposefully tried to create obtuse requirements that would take a long time to resolve as part of testing #11908.

Also this would be slightly backwards incompatible in terms of resolution, I suspect if accepted it would need to be introduced behind a feature flag.

Code of Conduct

@notatallshaw notatallshaw added S: needs triage Issues/PRs that need to be triaged type: feature request Request for a new feature labels Mar 31, 2023
@notatallshaw
Copy link
Member Author

notatallshaw commented Mar 31, 2023

Thinking on it a bit more is the current behavior introduces some "if logic" in to the requirements that you would lose from my proposal.

For example let's say you specified foo[bar] that foo has 2 versions, version 1 which has not extras and version 2 which has extra bar which points to the package bar. What users are currently getting when specifying foo[bar] is an implicit "if foo==2 then install foo and bar else install foo".

I guess I don't understand if this is intended and/or desirable.

@pradyunsg
Copy link
Member

pradyunsg commented Mar 31, 2023

Ignoring missing extras during the resolution process is intentional, and intended for backwards compatibility with the behaviour we had in the legacy resolver. Do you have an example set of requirements that demonstrate the problematic behaviour here?

I think having clearer management of extras and their direct dependency could be useful (eg: enforcing a visit of the foo==2.0 candidate after visiting foo[bar]==2.0 candidate) and might end up being a solution, and especially if it is eagerly triggering a rejection of the variant with extras that isn't compatible (I'm not sure if that's already a behaviour of the resolver).

Add a flag on to requirements that could be used to "unprefer" requirements that don't match the extra, so at an extra is more likely to be found

Since we order a requirement/candidate for visiting prior to getting the dependency metadata, I don't think this is possible.

Throw exception on finding an invalid extra, forcing the user to give more specific requirements

This would be a significant breaking change and it's not clear what the value of this would be.

@pradyunsg pradyunsg added C: dependency resolution About choosing which dependencies to install and removed S: needs triage Issues/PRs that need to be triaged labels Mar 31, 2023
@uranusjr
Copy link
Member

If we treat a package with extra as a logical dependant of the real package (i.e. how the resolver constructs this internally), specifying a non-existent extra would correspond to specifying a non-existent project as dependency. Discarding a candidate without the matching extras is the natural thing to do following the same logic. The current behaviour is so because the legacy resolver does not actually treats a package with extras as a dependant, but performs some convoluted merging logic with unclear original intention that’s probably forever lost in the memory lane.

I guess what I’m trying to say is I think the proposal is a logically sound thing to do. The problem is however compatibility. We do have means to do such things (use-features and use-deprecated), but I think the missing piece is figuring out what may break with this change, and how we should advise affected users to fix the problem in their environments.

@pradyunsg
Copy link
Member

In agreement with everything that @uranusjr says, and that the trickiest piece for end users is when a transitive dependency declares a dependency on a non-existent extra. Since they won’t have direct control over stuff, they can’t dictate how things should behave.

@notatallshaw
Copy link
Member Author

notatallshaw commented Apr 5, 2023

Do you have an example set of requirements that demonstrate the problematic behaviour here?

So while I do have some examples that display this behavior the more motivating idea came from the recent addition of extras to Pandas requirements: https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#installing-optional-dependencies-with-pip-extras

If someone two years from now adds pandas[excel] to their requirement but due to backtracking with other requirements ends up install pandas 1.5.3 and they miss the warning in their giant install log:

WARNING: pandas 1.5.3 does not provide the extra 'excel'

They're going to be surprised when none of the excel functions work even though "pip installed the requirements fine", or worse pip is going to go wild backtracking unnecessarily.

As has been stated this particular solution is probably not the way, but I do think it is a real concern as extras are increasingly used.

@notatallshaw
Copy link
Member Author

notatallshaw commented Apr 14, 2023

#11961 is a real life example of this issue happening. As you can see Pip finds a solution to these requirements e.g. this command works on Pip 23.0.1 but with multiple warnings:

pip download "boto3" "pyiceberg[s3fs,glue,pyarrow,duckdb]"

But this command fails because Pip has found a solution by discarding the extra component which the user never intended as optional:

python install "boto3" "pyiceberg[s3fs,glue,pyarrow,duckdb]"

Perhaps if changing the current behavior is not acceptable then a new syntax could be introduced where a user can explicitly state an extra is non-optional?

p.s. this specific example is fixed by the backtracking improvements in Pip 23.1+

@notatallshaw
Copy link
Member Author

Closing this on Pip side for now. I still strongly think that Pip should at least implement an option, but I suspect another configuration option that affects resolution will add a support burden greater than Pip is capable of.

If users have a great need for this option then rip provides a configuration option for this. I am often testing rip with lots of tricky resolution situations, and hope this year it will be mature enough to reccomend to users who need a feature it offers that Pip doesn't.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
C: dependency resolution About choosing which dependencies to install type: feature request Request for a new feature
Projects
None yet
Development

No branches or pull requests

3 participants