-
-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: New Python-based terraform_fmt
hook to better support Windows
#652
base: master
Are you sure you want to change the base?
Conversation
53e3fb0
to
e12fae3
Compare
What repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
rev: e12fae3faeaf2ffcf65b1d08af91d7b64d349b31
hooks:
- id: terraform_fmt_v2 This one? |
yes, that one works for me on Windows. |
Up for discussion is the naming of this hook. Should we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks neat 👍🏻
For the sake of visual comparison, could you please attach:
- screenshots of the output of both bash and python hooks running on the same code and formatting it
- timings for the above (it makes sense to run this over several dirs with a dozen files in each that need formatting).
Thanks
I'd keep the new hook with It would also be great if you have doc-strings added to each function. Thanks in advance. |
terraform_fmt_v2
to better support Windows
terraform_fmt_v2
to better support Windowsterraform_fmt
hook to better support Windows
➜ time pre-commit run -a
[INFO] Installing environment for https://github.com/antonbabenko/pre-commit-terraform.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
An unexpected error has occurred: CalledProcessError: command: ('/usr/bin/python3', '/home/vm/.cache/pre-commit-zipapp/foYifN81FLsOKwa_AbWJjbHnCx9JWWVug44854ojl8o/python', '-mvirtualenv', '/home/vm/.cache/pre-commit/repow3wud7r1/py_env-python3.10')
return code: 1
stdout: (none)
stderr:
Traceback (most recent call last):
File "/home/vm/.cache/pre-commit-zipapp/foYifN81FLsOKwa_AbWJjbHnCx9JWWVug44854ojl8o/python", line 47, in <module>
raise SystemExit(main())
File "/home/vm/.cache/pre-commit-zipapp/foYifN81FLsOKwa_AbWJjbHnCx9JWWVug44854ojl8o/python", line 34, in main
runpy.run_module(args.m, run_name='__main__', alter_sys=True)
File "/usr/lib/python3.10/runpy.py", line 220, in run_module
mod_name, mod_spec, code = _get_module_details(mod_name)
File "/usr/lib/python3.10/runpy.py", line 140, in _get_module_details
raise error("No module named %s" % mod_name)
ImportError: No module named virtualenv
Check the log at /home/vm/.cache/pre-commit/pre-commit.log
pre-commit run -a 0,41s user 0,09s system 105% cpu 0,473 total
Or does it not work with Python 3.10, I dunno ✘3 ➜ virtualenv
usage: virtualenv [--version] [--with-traceback] [-v | -q] [--read-only-app-data] [--app-data APP_DATA] [--reset-app-data] [--upgrade-embed-wheels] [--discovery {builtin}] [-p py] [--try-first-with py_exe]
[--creator {builtin,cpython3-posix,venv}] [--seeder {app-data,pip}] [--no-seed] [--activators comma_sep_list] [--clear] [--no-vcs-ignore] [--system-site-packages] [--symlinks | --copies] [--no-download | --download]
[--extra-search-dir d [d ...]] [--pip version] [--setuptools version] [--wheel version] [--no-pip] [--no-setuptools] [--no-wheel] [--no-periodic-update] [--symlink-app-data] [--prompt prompt] [-h]
dest
virtualenv: error: the following arguments are required: dest
SystemExit: 2 I suppose that we should support everything starting at least from Python 3.8.2, to support Ubuntu 20.04 (Standard EOL - April 2025, Extended EOL - April 2030) https://wiki.ubuntu.com/Releases Or we should pack somehow Python update/venv based on pre-installed tools |
7d10faf
to
c78797f
Compare
I'd recommend supporting whatever the current version of I don't know why you've got an error. I run with this which is the current commit on my PR branch: repos:
- repo: https://github.com/ericfrederich/pre-commit-terraform.git
rev: c78797feb5463a975a19b69dbd3624b9219fe0ab
hooks:
- id: terraform_fmt_py
args:
- --env-vars=FOO="bar"
- --env-vars=SPAM=eggs |
c78797f
to
03ebccc
Compare
03ebccc
to
18291bd
Compare
I believe all review items are now addressed. I think the error @MaxymVlasov had is unrelated to this PR as it was complaining about virtualenv which is what pre-commit uses to create isolated environments for various Python based hooks. I have a feeling there would also be an issue running hooks from the pre-commit-hooks repo on that same system as well. |
|
I don't get how it is possible, but this hook flies like a rocket compared to bash-one. 🔥 It makes sense to reimplement all these as Python. But please, @ericfrederich add tests. Pytest whatever, currently it's a pain in the ass to try to implement anything without breaking stuff... #303 And these checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's start a PR scope thread here:
How much do we want to be implemented in this PR?
1.1. Current common
functional "as is"
1.2. Full common
functional parity
2.1. Just terraform_fmt
2.2. All hooks fully which fully based on common
functional
Choose one from 1. and 2.
The current state is 1.1. + 2.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.1
+ 2.1
(+ each hook to be re-implemented with Python in scope of a separate PR)
ps: I don't get the difference between 1.1
and 1.2
to be honest... 🤷🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1.2 Means that
should be implemented too in this PR
What's the approximate difference? |
LogicPython version
Bash versionThe bash version from what I can tell is using From what I understand, this is taking the same list of files from pre-commit, but then doing computation to figure out unique directories, and running So I think if given a large directory of .tf files, say (infra/foo_000.tf, infra/foo_001.tf, ... infra/foo_999.tf)... if pre-commit splits this up into 50 chunks of 20 files the
ArchitectureDisclosure: I have not verified this. Pre-commit is written in Python, so when it uses a Python based hook it may be done without invoking another subprocess (but I could be wrong) Python versionpre-commit process is a Python process which runs the hook within the same process which calls Bash versionpre-commit is a Python process which runs a bash script as a subprocess which runs |
Does this mean, that if we re-work existing shell-hook to run only on changed files, then we will rectify the "slowness" of the shell-hook but introduce behavioral change that Py-based hook implements by default? The fmt all files in each changed dir vs check only changed files. If yes, then @MaxymVlasov why do we need the |
Hmm, I am not sure that it changes only modified files, need to test it. Otherwise, I don't get how it was able to run on All this "per-dir" stuff was introduced to avoid 3-8 times the same runs in one dir by removing filenames and then deduplication of patches. We need to test it for sure. Ideally, by pytest (or similar), not manually |
There should be no code that runs against an entire directory. It should only run against the files given to it by Here's a repo which demonstrates the example: https://github.com/ericfrederich/pct-bug-example Just follow the README. I think if you fix the bug (i.e. remove all the unique directory stuff), you'll get the performance for free. |
I do agree. Though... though there's a trade off between running
That example demonstrates not a bug, but an option, which you think isn't worth to exist and which (if it indeed exists) you simply cut off without backward compatibility in the mind. ps: I might be completely messing up things, though I don't believe that the spoken drastic speedup with Py-based implementation is related to the language switch. It most probably is related to the change in logic. Again: I might be wrong, though I have no playground to make different tests to prove or to disprove 🤷🏻 |
This is indeed a bug that we need to rectify: if the |
Maybe I don't understand this statement or perhaps you don't understand the way the Python code is working. The python version does not call This all works because It could be perhaps that terraform didn't always support multiple arguments. Perhaps it would either accept a single file or a single directory. Perhaps that was the reason these were originally written the way they are.
I apologize, but if a user wants to format an entire directory they can simply call In all cases a pre-commit hook should only run on the files which pre-commit determines it should run against... nothing more, nothing less. Anything else is a bug. If a user wants to run it against every non-excluded file in the entire project, they can run
Let's not conflate all these different topics. The only reason to rewrite into Python is to support environments which don't have Bash (Windows cmd.exe, Power Shell, PyCharm on Windows, VSCode, etc). The fact that the Python version is also more performant and doesn't re-create a bug is beside the point. |
Did I miss a detailed description of the changes this PR brings or you just didn't bother to provide it and let others dive into details on their own? 🤔
So it never passes a
Correct from which point of view: your or those who already rely upon existing logic?
I see your point, though your reasoning is, well, biased and opinionated.
Okay, let's not conflate and refrain from naming a hook with different implementation and different behavior a "version" of an existing one. |
No detailed description needed as it's not over-complicated. It simply calls
I named it as was suggested, I asked how it should be handled and followed instructions. It's currently suffixed with I just need a way to run I'll use this which:
- repo: local
hooks:
- id: terraform_fmt
name: Terraform fmt
description: Rewrites all Terraform configuration files to a canonical format.
entry: terraform fmt
language: system
files: (\.tf|\.tfvars)$
exclude: \.terraform/.*$ In the future if I need some of the advanced features provided by this repo like I tried to help add Windows support, but I'm ready to move on. |
Appreciate your intent, time and effort, though this way of collaboration and implementation "user-wise" isn't what I personally am keen to be part of in this project. I guess If we onboard more Pythonists, who are open for collab, are user-oriented and are long-term project support oriented, we may look into re-implementing existing hooks in Python to provide more convenience for Windows users (as opposed to easier maintenance by vast majority of Windows users as well). |
For what it's worth, on the project I need Windows support on with just 113 It used to take 2.41 seconds, now it takes 0.241. It was never about Python or performance though. It was about adding Windows support. |
I absolutely understand your point. My point is of another kind.
I also would like to emphasize once more: this is my very own opinion and I'm not the person in charge. |
I finally read everything that you discussed here :) Okay, my opinion: If we can make something in Python and it will work faster than in bash - let's do it in Python. I see that this PR little-bit stale, so I'll take care of it. |
There's no evidence (there's no even info on the comparison of execution speed) and hence I believe the Py-based hook is just doing something differently and hence with a different result (or there's a bug in Bash-based hook which we may need to rectify because it may also impact other hooks).
While I'm not opposing to switch to Python (why not Golang then for example? 🤪), I'd like to suggest to consider thoroughly all the benefits and drawbacks before making such decision.
I'm all for this and will try and help reviewing 🤝 |
Yes Bash version does a bug which also impacts performance. There is evidence. Yes Py-based hook is doing something differently... It's not re-implementing the bug present in the Bash version. It's really that simple. I don't know how to explain it differently. |
This had been already discussed and you keep repeating yourself without listening to other person. |
If @MaxymVlasov is up to pick up on getting the job done, we can add this hook as a different one (since it implements different behavior) and let users switch step by step if they decide for speed vs regular, sort of conventional behavior. |
different behavior
We're in agreement that they implement different behavior. The behavior of the Python version is that it's bug free and performs well. The behavior of the Bash one is the opposite. This is though no fault of Bash itself, just the over-engineered buggy implementation. This isn't a language war, I could easily write one just as bad in Python. Other than the Python one being bug-free and faster, can you please articulate the difference? conventional behavior
Is "conventional behavior" that if a user explicitly tells pre-commit in their config to exclude a file... that this Bash hook still formats the file anyway ignoring the user's wishes? Is "conventional behavior" that you end up formatting the same file 20 times causing performance issues? This is where there needs to be agreement:
Do you disagree with any of those 3 points? |
So you keep repeating yourself over and over again without listening to other person. Let me cite myself (please read thoroughly the statement inside parenthesis at the end of the paragraph):
Re the "conventional" term since we're taking this differently in this context: I mean traditional, established behavior that is relied upon by lots of users already. Maybe not because they wanted it, but rather because they used to it. Who knows... You've spoken out. Others — haven't. We can't judge for all of them based on your mileage only. PS: Apart from this fascinating conversation, do you have any plans on completing the work on this PR to make it fit the ecosystem? |
I read your citation the first time and even quoted it and responded to it.
"supposing" or "guestimating" why something was implemented doesn't make it true. It also doesn't matter because even if it were true it just means the user requesting it was wrong and it was a mistake to implement the behavior. There already exist ways for users to run against everything in a directory that doesn't involve pre-commit. I had "supposed" and "guestimated" that the In the end it doesn't matter whose supposition is correct (maybe neither). The fact remains it's still the wrong thing to do in 2024.
What is left to complete? I see 7 resolved threads. |
We're looking at this from different points of view. I already explained that: Note Your take is "switch behavior" (while I do agree that this actually along the way rips off a bug from current hook's implementation, but still switches the behavior). My take is "add this new hook as an option (and let users choose whichever is best for them) and fix the bug in existing hook".
Maybe. TBH I don't understand what you're arguing for. What's your goal?
I see two |
I fixed the mypy and pylint issues. But now on GitHub I don't see the results any more... maybe only once it's approved it'll run the checks again? In any case, the pylint stuff was over-aggressive and complaining about existing code which is not part of this PR. |
Put an
x
into the box if that apply:Description of your changes
Fixes #648
How can we test changes
From a windows PowerShell or cmd.exe (i.e. outside of Git Bash, or any other Bash) in a repo with a
terraform_fmt
hook, runpre-commit run -a
and it will fail with something like/bin/bash: .... : No such file or directory
.Change the hook from
terraform_fmt
toterraform_fmt_v2
and it'll work.I've tested on Linux and Windows. I've made formatting changes to terraform code and verified that it updates the files in place properly. I have verified that with
--args=-check
the files are not updated and the hook fails as expected while showing on stdout the offending files.