Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore unresolved requirements #96

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

bwhmather
Copy link
Owner

@bwhmather bwhmather commented May 11, 2023

Early versions of ssort were very aggressive about regrouping dependencies. This meant that a typo that resulted in dependency resolution failing could cause quite dramatic reordering. To protect against this we added a check to make ssort exit with an error if any requirements could not be resolved.

Since then, we have reduced the scope of ssort to only do a simple topological sort. Typos are no longer likely to result in significant shuffling. In spite of this, we have kept the logic for bailing out when requirements can't be resolved.

The errors this logic results in can be very irritating during development, and tend to block subsequent auto-formatters from running.

I propose that we simply remove it.

@bwhmather bwhmather force-pushed the ignore-unresolved branch from 0bf0d78 to 3955431 Compare May 11, 2023 20:18
@github-actions
Copy link

github-actions bot commented May 11, 2023

Pull Request Test Coverage Report for Build 7576666998

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.004%) to 98.25%

Totals Coverage Status
Change from base Build 7576240521: 0.004%
Covered Lines: 1235
Relevant Lines: 1257

💛 - Coveralls

@bwhmather bwhmather marked this pull request as ready for review May 11, 2023 20:27
@bwhmather bwhmather requested a review from jgberry May 11, 2023 20:27
@cgahr
Copy link
Collaborator

cgahr commented May 11, 2023

Tbh I don't completely understand how ssort is supposed to work after removing this code. How are unresolved dependencies handled? Just ignored?

@bwhmather
Copy link
Owner Author

How are unresolved dependencies handled? Just ignored?

Just ignored. That's it.

The problem we had before is that we tried to do grouping. If someone made a typo, everything would be shuffled.

For example:

def dep1():
    pass

def dep2():
    pass
    
def func():
   dep1() 
   deb2()  # typo

Would be rejiggled to

def dep2():
    pass

def dep1():
    pass
    
def func():
   dep1()
   deb2()  # typo

Fixing the type would shuffle everything back.

Now we only rejiggle if someone makes a typo that breaks a cycle so the check is much less valuable.

@bwhmather bwhmather requested a review from cgahr May 11, 2023 21:08
@jgberry
Copy link
Collaborator

jgberry commented May 11, 2023

Hmm, I feel like we need to be careful with this because there are tradeoffs. As you mention, the reshuffling of cyclical dependency chains when a typo is introduced is the big downside here. To me, the biggest problem is that even after the typo is fixed, the statements will remain in their reshuffled state, which is ultimately some arbitrary order determined by the typo you made. Is it worth introducing this? Quite frankly, I'm not exactly sure what positives this change buys us... we already do this when a wildcard import is detected in a file, so it doesn't really resolve any issues there. Is this just for the sake of simplifying the codebase and avoiding the maintenance of adding new builtins for each new python version? Because, if so, I'm not sure this change is worthwhile.

@jgberry
Copy link
Collaborator

jgberry commented May 11, 2023

The errors this logic results in can be very irritating during development, and tend to block subsequent auto-formatters from running.

Can we have some examples of these errors? I feel like we should dissect these errors first, see if there are any alternative solutions for them, and then make the call on whether or not this change is worth it.

@bwhmather
Copy link
Owner Author

Can we have some examples of these errors?

In my usual workflow I will have ssort and a few other auto-formatters running every save, typically stopping after the first error or when my hand heats the spacebar sufficiently.

Something like this, but usually with the commands wrapped up in a project specific script or makefile:

$ rerun sh -c 'ssort && isort && black'

Ignoring the debate about factoring out steps into functions, if I'm working on a new module I might do something like this:

def top_level_function():
    _step1(
    )

    _step2(
    )


    _step3()

At the moment, my formatting loop will exit immediately with:

ERROR: unresolved dependency '_step1' in example.py: line 6, column 4
ERROR: unresolved dependency '_step2' in example.py: line 9, column 4
ERROR: unresolved dependency '_step3' in example.py: line 13, column 4
1 file was not sortable

Nothing will be changed.

With this change I instead get:

1 file was left unchanged
reformatted example.py

All done! ✨ 🍰 ✨
1 file reformatted.

And the file gets reformatted to:

def top_level_function():
    _step1()

    _step2()

    _step3()

Adding pyflakes to the chain gives:

example.py:2:5: undefined name '_step1'
example.py:4:5: undefined name '_step2'
example.py:6:5: undefined name '_step3'

As such, nothing is lost from the point of view of helping users catch errors.

Obviously, not exiting on first error would also solve my problem.

@bwhmather
Copy link
Owner Author

Is this just for the sake of simplifying the codebase and avoiding the maintenance of adding new builtins for each new python version?

Our ballooning builtins list is a factor, but mostly because I think it indicates that we are doing something fishy. If we can safely ignore some requirements that can't be resolved in practice then why not all?

@bwhmather
Copy link
Owner Author

Regarding cycles: If the developer adds links in the cycle one at a time then we can already get the unwanted reshuffling. I think the only time when this change would result in an unwanted reshuffle that wouldn't be made before is when a developer attempts to complete all links in a cycle but makes a typo in one of them.

I've run a hacked up version of ssort that flags cycles on all of the projects that I currently have checked out, and the only one it detects is between statement_text_sorted and _statement_text_sorted_class in _ssort.py in this project. Interestingly, the current order is consistent with _statement_text_sorted_class being implemented as a stub that was called from statement_text_sorted before being fleshed out and completing the cycle.

Most of my recursive tree traversals use single dispatch, which tends to magically break these syntactic cycles. I remembered and found one other cycle that was broken up using separate modules and a lot of comments.

@cgahr
Copy link
Collaborator

cgahr commented May 12, 2023

I actually think we need this kind of check for the builtin functions. Consider (terrible) code like this:

class A:
    print = print

def print(s):
    ...

Depending on whether ssort knows that print is a builtin function or not, it should sort the code differently.

In any case, I think we should leave this part of the code as it is (for now) and try improve the sorting such that there is a unique way of sorting functions. having this, I think we can determine much more easily if we need a list of builtins or not.

@bwhmather
Copy link
Owner Author

I actually think we need this kind of check for the builtin functions. Consider (terrible) code like this:

class A:
    print = print

def print(s):
    ...

Depending on whether ssort knows that print is a builtin function or not, it should sort the code differently.

Interesting. That is a real problem. I don't think it's necessarily a reason to fail on unresolved requirements, but it would definitely seem to be a reason to keep the list of builtins.

Unfortunately, polyfills make things more complicated. A contrived example:

def function():
    raise ExceptionGroup()

if sys.version_info < (3, 11):
    class ExceptionGroup(Exception):
        ...

@bwhmather
Copy link
Owner Author

improve the sorting such that there is a unique way of sorting functions.

Interestingly, I don't think this is actually a solution. Increasing the strictness makes the final outcome more predictable, but leads to a lot of churn during iteration. The unresolved requirements check was not introduced because the sorting was too lenient but because it was too strict.

@bwhmather
Copy link
Owner Author

Depending on whether ssort knows that print is a builtin function or not, it should sort the code differently.

Interesting. That is a real problem. I don't think it's necessarily a reason to fail on unresolved requirements, but it would definitely seem to be a reason to keep the list of builtins.

Ok, I'm going to go a little bit further than "would seem to be a reason". You are right and we can't merge as is.

@bwhmather
Copy link
Owner Author

Have restored the list of builtins in order to avoid the shadowing problem

@bwhmather bwhmather force-pushed the ignore-unresolved branch from 79e2d40 to 8a0f021 Compare May 12, 2023 21:00
@bwhmather bwhmather force-pushed the ignore-unresolved branch 2 times, most recently from 1055529 to aa03bd5 Compare December 29, 2023 22:59
@bwhmather bwhmather force-pushed the ignore-unresolved branch 2 times, most recently from 18a1d77 to c8fc7f4 Compare January 29, 2024 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants