[`pycodestyle`] Add E12 rules (continuation lines indentation rules) #8557

hoel-bagard · 2023-11-08T13:09:44Z

Summary

This PR is part of #2402, it adds the E12 rules (continuation lines indentation rules).

Test Plan

The test fixture uses the one from pycodestyle, except for E133 for which there is no fixture since it is an opt-in config option (same as pycodestyle) and I do not know how to handle it.

Discussion

This PR replaces the add E122 PR.

Performance issue

The changes from the PR seem to slow down ruff quite a bit, I'm guessing that the get_token_infos function is the slow part, but it was needed to follow the pycodestyle implementation. Is there a way to implement it that would be faster ?

Extra rules

The PR is essentially a port of the pycodestyle implementation from python to rust. I implemented all the errors present in that function before realizing that E121, E123, E126 and E133 are not in the #2402 list. Should I remove them from the PR ?

Test error

When running the tests, the following deserialization test fails:

        assert!(toml::from_str::<Pyproject>(
            r#"
[tool.black]
[tool.ruff.lint]
select = ["E123"]
"#,
        )
        .is_err());

I'm not sure what to do about it, although I suppose the issue would disappear if E123 is removed from the PR.

Add E122 rule and base for adding the other E12 rules, based on the pycodestyle implementation. Autofix and configuration are not included.

…le implementation is broken.

This reverts commit 239c164.

Change indent_chances into a HashMap to keep track of what kind of indent was on which line.

codspeed-hq · 2023-11-08T13:19:35Z

CodSpeed Performance Report

Merging #8557 will degrade performances by 7.6%

_{Comparing hoel-bagard:hoel/add_E12_rules (382429c) with main (7391f74)}

Summary

❌ 5 regressions
✅ 20 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`main`	`hoel-bagard:hoel/add_E12_rules`	Change
❌	`linter/all-rules[pydantic/types.py]`	74.5 ms	80.6 ms	-7.6%
❌	`linter/all-rules[large/dataset.py]`	174 ms	187.2 ms	-7.03%
❌	`linter/all-rules[unicode/pypinyin.py]`	16.3 ms	17.2 ms	-5.15%
❌	`linter/all-rules[numpy/ctypeslib.py]`	34.8 ms	37.3 ms	-6.83%
❌	`linter/all-rules[numpy/globals.py]`	3.9 ms	4.2 ms	-5.59%

Also add the rule examples to KNOWN_FORMATTING_VIOLATIONS.

Avasam · 2023-12-14T07:50:35Z

As a note, fixes for these would bring Ruff closer to parity with autopep8 #9057 (comment)

spaceone · 2024-02-18T19:21:56Z

This is nearly the last missing pycodestyle check. Having it would allow use to throw away the slow pycodestyle, flake8 and autopep8 pre-commit hooks.

hoel-bagard · 2024-02-25T11:00:16Z

@MichaReiser sorry to ping you here, but I would appreciate some feedback/advice on this PR.

I had implemented the E12 rules for completeness (to get closer to having all the boxes checked in #2402), but I went at it lazily and used the same logic as pycodesyle, which resulted in the PR not even passing the CI.

Do you think it would be worth it to spend time trying to fix/rework it, or should this PR be closed and forgotten ?

MichaReiser · 2024-02-25T12:11:23Z

@hoel-bagard I try to take a look on Monday. I've a lot of catch up to do on PRs and I'm, unfortunately, not as fast as @charliermarsh

MichaReiser

Hey @hoel-bagard

I haven't reviewed the code yet but I read through the rules.

My main concern is that not all rules are formatter-compatible. For example, ClosingBracketNotMatchingOpeningBracketVisualIndentation is guaranteed to conflict with all formatted code. This is problematic because many users use --select ALL, which enables these new rules, and they then get a lot of errors if they use our formatter (and see an incompatibility warning when running format). That's why we want to hold off from adding new formatting-related lint rules that are formatter incompatible until we're done with the rules re-categorization where we are likely to have a format (or similar) related category that we only recommend for users that do not use our formatter.

I'm having a hard time judging if other rules conflict with the formatter, too, from just reading their documentation. E.g. I'm unsure about ContinuationLineOverIndentedForHangingIndent, ContinuationLineOverIndentedForVisualIndent,

I believe VisuallyIndentedLineWithSameIndentAsNextLogicalLine and ContinuationLineUnalignedForHangingIndent are incompatible, too (at least today, we are considering changing that).

I'm sorry that this means we can't move forward with the rules as they are now. We could explore limiting the rules to the formatter-compatible ones or finding a way to make them formatter-compatible.

MichaReiser · 2024-02-26T08:39:31Z

crates/ruff_linter/src/rules/pycodestyle/rules/logical_lines/continuation_lines.rs

+}
+
+/// ## What it does
+/// Checks for continuation line over-indented for visual indent.


The difference between this and ContinuationLineOverIndentedForHangingIndent and ContinuationLineUnderIndentedForVisualIndent isn't clear to me from reading the documentation. Are these different styles and users should only enable one of those rules?

It seems to me like they could both apply in one codebase, although that would not be a very consistent codebase.

Looking at the fixtures from pycodestyle, E126 (ContinuationLineOverIndentedForHangingIndent) seems to apply when there is a newline after the symbol causing the continuation line ((, \, etc...) whereas E128 (ContinuationLineUnderIndentedForVisualIndent) applies when there is at least one value after the start of the continuation line.

For example this is a E126 error

print("E126", ( "1", "2", ))

Whereas this is a E128 error:

print("E128", ("1", "2", ))

See also the unofficial rules explanation for E126 and E128

MichaReiser · 2024-02-26T08:41:47Z

crates/ruff_linter/src/rules/pycodestyle/rules/logical_lines/continuation_lines.rs

+
+    for ch in line.chars() {
+        if ch == '\t' {
+            indent = indent / 8 * 8 + 8;


This would need to support the new indent_width setting

Corentin-pro · 2024-02-26T15:41:46Z

Hello, I want to emphasize that this addition would be very welcome. I had many instances where I would have liked ruff to tell me about over-indentation (E127) when I refactored my code (removing arguments from a function that changed the indentation).

I am interested in this PR. If I have the time and the code is not yet OK to be merge I'll look into this in the next month.

hoel-bagard · 2024-02-27T00:40:03Z

Having rules that conflict with the formatter would be quite annoying indeed (I already noticed the formatter disagreeing with flake8, and that meant having to add an exception to the flake8 config).

I tried running ruff with the E12 rules on one (small) codebase that has been formatted with the ruff formatter, and got no error.

I tried to make an example that would cause the formatter and the E129 (VisuallyIndentedLineWithSameIndentAsNextLogicalLine) to disagree, but without success.

The following triggers E129:

if (var in "one_looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong_string"
    and var not in "another_looooooooooooooooooooooooooooooooooooooooooooooooooooong_string"):
    ...

But it get formatted into the E12 compliant code:

if (
    var
    in "one_looooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong_string"
    and var
    not in "another_looooooooooooooooooooooooooooooooooooooooooooooooooooong_string"
):
    ...

The docstring examples and fixtures would definitely need to be simplified, I'll try to do that today/tomorrow. I wanted to run the formatter on the E12 fixtures, but it uses python2 style code, which prevents the formatter from formatting the code.

hoel-bagard · 2024-02-27T02:53:27Z

@MichaReiser I ran ruff format on the E12.py fixtures file, and it fixed all but two of the errors.

The errors that did not get fixed automatically are both E125 errors:

if """
    """:
    pass

for foo in """
    abc
    def
    """.strip().split():
    print(foo)

They can be fixed by manually formatting them to:

if """""":
    pass

for foo in """
    abc
    def
""".strip().split():
    print(foo)

(this changes the content of the string, but that's not really the point here)

At which point the formatter does not attempt to change the file anymore, and both pycodestyle and the rules added in this PR do not detect any E12 violation.

So it seems like the E12 rules are not incompatible with the formatter.

MichaReiser · 2024-03-04T13:01:57Z

I haven't forgotton about this PR but we first need to decide internally if we want to add more formatting lint rules and my priority for now is to fix incompatibilities with existing rules.

MichaReiser · 2024-03-28T17:10:22Z

Thank you @hoel-bagard for your patience and sorry that it took me so long.

I started reviewing the rules but stopped after E125 because discovering all possible formatter incompatibilities took a very long time, and I generally struggled to understand the rules or how they differ (this isn't a critique, it's just me struggling and not being familiar with the rules).

For E121: I see how the rule is useful, but the rule is in conflict with some possible expression formatting that I want to try out in the formatter. That's why I want to hold off adding this specific rule, at least until #1774 is complete. I heard @AlexWaygood has ideas on how to support formatter incompatible rules ;)

a = (
    a 
    + b
    + len([
        1, 
        2, 
      ]) # NOK
    + more([
        a, 
      ]) # NOK
)

I'm generally open to accepting the rules E122 and E125 (I still have to review the other rules), but I would prefer if we could tackle one rule at a time in separate PRs. I'm aware that this asks for extra work on your side and I'm sorry for this, but I don't think I can prioritize reviewing all the additions in a single PR right now. Adding the rules one by one also has the benefit that it reducing the risk of regressions because we can ship them independently (we would still have the regression, but we would receive the feedback across multiple releases)

The other thing we need to consider is that the rule seems to regress performance somewhat significantly. I think we need to figure out a design that reduces the performance less. This could involve refactoring existing pycodestye rules to see if performance can be improved.

What do you think? Do you want to pursue the rules in individual PRs (please don't open all PRs at once, let's do one at a time)?

hoel-bagard · 2024-03-31T02:01:34Z

@MichaReiser

Thanks for taking time to look into it.
While I use the formatter, I never looked too much into its rules, so I don't know about the incompatibilities with the E12 rules. I'll try to check for incompatibilities while working on it.

Implementing the rules one by one sounds good to me, that would also likely make understanding what each rule does easier.

The other thing we need to consider is that the rule seems to regress performance somewhat significantly. I think we need to figure out a design that reduces the performance less. This could involve refactoring existing pycodestye rules to see if performance can be improved.

I suspect that working directly on tokens would help with the performance issue. I used logical lines because that's what pycodestyle does, but I don't think that was a good idea.

What do you think? Do you want to pursue the rules in individual PRs (please don't open all PRs at once, let's do one at a time)?

I can try to start working on E122 and see how it goes. I won't spend time on the other rules before that one gets merged (there's no point working on the other rules before solving the formatter incompatibilities / performance issues).

MichaReiser · 2024-04-02T14:03:29Z

I suspect that working directly on tokens would help with the performance issue. I used logical lines because that's what pycodestyle does, but I don't think that was a good idea.

Logical lines might be fine, I'm not sure. One problem I noticed while skimming over the code is that we allocate multiple vectors and hash sets for each line. I think we should look into how we can avoid that. It can mean that we should use something other than logical line, reuse the allocations across check calls etc.

I can try to start working on E122 and see how it goes. I won't spend time on the other rules before that one gets merged (there's no point working on the other rules before solving the formatter incompatibilities / performance issues).

Thanks a lot! I'll close this PR. We can still use it as a reference. I hope that's okay with you.

hoel-bagard · 2024-04-03T13:04:03Z

I think the main issue with logical lines is that they have to be sliced to look for new lines (like here for example). This currently happens for every token, which I assume is the slow part.

One problem I noticed while skimming over the code is that we allocate multiple vectors and hash sets for each line. I think we should look into how we can avoid that.

If you're referring to this, then it's being done for every token, which is most likely more than needed indeed.

I'll close this PR. We can still use it as a reference. I hope that's okay with you.

No issue at all. I'd like to keep using this PR as a reference / discussion point to avoid going into a wrong direction.

hoel-bagard and others added 21 commits September 25, 2023 22:29

Add empty rule.

82e85a7

Add E122 rule.

8a9c8dd

Add E122 rule and base for adding the other E12 rules, based on the pycodestyle implementation. Autofix and configuration are not included.

Merge branch 'astral-sh:main' into add_E122

5113131

Fix some clippy issues.

8f5d6fc

Use i64 for most variables to avoid conversions.

a72b89a

Began adding the other E12 rules, however it seems like the pycodesty…

3e74e8d

…le implementation is broken.

Remove other rules' mentions.

239c164

Merge.

45688ec

Revert "Remove other rules' mentions."

c925bcf

This reverts commit 239c164.

Add rule to KNOWN_FORMATTING_VIOLATIONS

206b4dd

Rename file to account for future E12 rules.

de4aec5

Merge branch 'astral-sh:main' into add_E122

e0b2e62

Remove non-E122 tests.

0bc5d29

Merge.

d053fde

Add E128, E129, E131 and E133 rules.

c8ce8d5

Register E12/E13 rules.

9cb3af1

Fix E125.

e323ec0

Fix E126 edge case.

e669cc9

Change indent_chances into a HashMap to keep track of what kind of indent was on which line.

Add hang-closing as a configuration option.

1da9cf0

Update snapshots.

fab2f50

Merge branch 'astral-sh:main' into hoel/add_E12_rules

66ad8fd

hoel-bagard mentioned this pull request Nov 8, 2023

[pycodestyle] Add E122 rule ("continuation line missing indentation or outdented") #7655

Closed

Fix typo, clippy lints

f398c96

Also add the rule examples to KNOWN_FORMATTING_VIOLATIONS.

hoel-bagard force-pushed the hoel/add_E12_rules branch from c53a28f to f398c96 Compare November 8, 2023 13:44

hoel-bagard added 2 commits November 8, 2023 23:31

Remove E133 test case.

114b505

Remove E133 test case.

382429c

hoel-bagard mentioned this pull request Nov 12, 2023

Implement remaining pycodestyle rules #2402

Open

70 tasks

MichaReiser added rule Implementing or modifying a lint rule preview Related to preview mode features labels Feb 25, 2024

MichaReiser reviewed Feb 26, 2024

View reviewed changes

hoel-bagard added 2 commits February 27, 2024 11:19

fix: update fixtures's prints to python3

11deb1f

update snapshots

dba8819

zanieb added the incompatibility Incompatibility between different Ruff tools or rules (formatter, isort, linter) label Mar 12, 2024

MichaReiser closed this Apr 2, 2024

augustelalande mentioned this pull request May 9, 2024

[pycodestyle] Implement missing-or-outdented-indentation (E122) #11349

Open

LordAro mentioned this pull request Oct 7, 2024

[pycodestyle] Implement continued indentation related rules (E12x) #13585

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[`pycodestyle`] Add E12 rules (continuation lines indentation rules) #8557

[`pycodestyle`] Add E12 rules (continuation lines indentation rules) #8557

hoel-bagard commented Nov 8, 2023 •

edited

Loading

codspeed-hq bot commented Nov 8, 2023 •

edited

Loading

Avasam commented Dec 14, 2023 •

edited

Loading

spaceone commented Feb 18, 2024

hoel-bagard commented Feb 25, 2024

MichaReiser commented Feb 25, 2024

MichaReiser left a comment •

edited

Loading

MichaReiser Feb 26, 2024

hoel-bagard Feb 27, 2024

MichaReiser Feb 26, 2024

Corentin-pro commented Feb 26, 2024

hoel-bagard commented Feb 27, 2024

hoel-bagard commented Feb 27, 2024

MichaReiser commented Mar 4, 2024

MichaReiser commented Mar 28, 2024

hoel-bagard commented Mar 31, 2024

MichaReiser commented Apr 2, 2024

hoel-bagard commented Apr 3, 2024

[pycodestyle] Add E12 rules (continuation lines indentation rules) #8557

[pycodestyle] Add E12 rules (continuation lines indentation rules) #8557

Conversation

hoel-bagard commented Nov 8, 2023 • edited Loading

Summary

Test Plan

Discussion

Performance issue

Extra rules

Test error

codspeed-hq bot commented Nov 8, 2023 • edited Loading

CodSpeed Performance Report

Merging #8557 will degrade performances by 7.6%

Summary

Benchmarks breakdown

Avasam commented Dec 14, 2023 • edited Loading

spaceone commented Feb 18, 2024

hoel-bagard commented Feb 25, 2024

MichaReiser commented Feb 25, 2024

MichaReiser left a comment • edited Loading

Choose a reason for hiding this comment

MichaReiser Feb 26, 2024

Choose a reason for hiding this comment

hoel-bagard Feb 27, 2024

Choose a reason for hiding this comment

MichaReiser Feb 26, 2024

Choose a reason for hiding this comment

Corentin-pro commented Feb 26, 2024

hoel-bagard commented Feb 27, 2024

hoel-bagard commented Feb 27, 2024

MichaReiser commented Mar 4, 2024

MichaReiser commented Mar 28, 2024

hoel-bagard commented Mar 31, 2024

MichaReiser commented Apr 2, 2024

hoel-bagard commented Apr 3, 2024

[`pycodestyle`] Add E12 rules (continuation lines indentation rules) #8557

[`pycodestyle`] Add E12 rules (continuation lines indentation rules) #8557

hoel-bagard commented Nov 8, 2023 •

edited

Loading

codspeed-hq bot commented Nov 8, 2023 •

edited

Loading

Avasam commented Dec 14, 2023 •

edited

Loading

MichaReiser left a comment •

edited

Loading