Enable back iterative development of latest providers with old airflows #43617
Merged: potiuk merged 3 commits into apache:main from potiuk:fix-iterative-testing-of-compatibility-with-providers on Nov 4, 2024
The compatibility tests in CI use providers built as packages from sources, so the tests run there from "providers/tests" work fine: all providers are installed in the `airflow.providers` site library. When iterating on and debugging backwards-compatibility provider tests, however, we should be able to use local provider sources rather than installed packages, and we can mount both provider sources and tests into the image. See `contributing-docs/testing/unit_tests.rst` for how to do this by combining the `--mount-sources providers-and-tests` flag with `--use-airflow-version`.

As of apache#42505 this has been broken, because main now relies on airflow having a "pkgutil" namespace package for both the `airflow` and `airflow.providers` packages (previous airflow versions had an implicit namespace package for `airflow.providers`), so providers installed locally cannot be used as "another" source of providers. Previously it worked because both the "installed" and the "sources" `airflow.providers` packages were implicit namespace packages.

As explained in https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages

> Every distribution that uses the namespace package must include such
> an `__init__.py`. If any distribution does not, it will cause the
> namespace logic to fail and the other sub-packages will not be
> importable. Any additional code in `__init__.py` will be inaccessible.

So, because old airflow uses an implicit providers package and main airflow from sources uses an "explicit" providers package, the only way to make the "source" providers usable is to mount them or symbolically link them inside the installed distribution of the airflow package (in the site directory), or to dynamically remove the `__init__.py` from the providers source directory.
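The import behaviour behind this can be reproduced in isolation. The following is a simplified sketch, not Airflow code: the package names `demo_implicit` and `demo_mixed` and the helper functions are made up for illustration, and the "explicit" copy uses an empty `__init__.py` rather than the pkgutil-style one that main uses. It shows one way in which mixing an implicit and an explicit copy of the same package can hide sub-packages that live next to the implicit copy.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical top-level package names, used only for this demonstration.
DEMO_PACKAGES = ("demo_implicit", "demo_mixed")


def make_portion(root: Path, top: str, leaf: str, explicit_init: bool) -> None:
    """Create <root>/<top>/providers/<leaf>/__init__.py.

    With explicit_init=True, <root>/<top>/providers/__init__.py is created too,
    which turns that copy of `providers` into a regular package.
    """
    providers = root / top / "providers"
    (providers / leaf).mkdir(parents=True)
    (providers / leaf / "__init__.py").write_text(f"NAME = {leaf!r}\n")
    if explicit_init:
        (providers / "__init__.py").write_text("")


def importable(module: str) -> bool:
    """Return True if the module can be imported, False if it is not found."""
    try:
        importlib.import_module(module)
        return True
    except ModuleNotFoundError:
        return False


def run_demo() -> dict[str, bool]:
    """Build two fake locations ("site" and "src") and test what is importable."""
    modules = [
        "demo_implicit.providers.installed",
        "demo_implicit.providers.local",
        "demo_mixed.providers.installed",
        "demo_mixed.providers.local",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        site, src = Path(tmp, "site"), Path(tmp, "src")
        # Scenario 1: both copies of `providers` are implicit namespace packages.
        make_portion(site, "demo_implicit", "installed", explicit_init=False)
        make_portion(src, "demo_implicit", "local", explicit_init=False)
        # Scenario 2: the "src" copy has an __init__.py, the "site" copy does not.
        make_portion(site, "demo_mixed", "installed", explicit_init=False)
        make_portion(src, "demo_mixed", "local", explicit_init=True)

        sys.path[:0] = [str(site), str(src)]
        importlib.invalidate_caches()
        try:
            return {name: importable(name) for name in modules}
        finally:
            # Undo the sys.path change and drop the demo modules again.
            sys.path.remove(str(site))
            sys.path.remove(str(src))
            for name in list(sys.modules):
                if name.split(".")[0] in DEMO_PACKAGES:
                    del sys.modules[name]


if __name__ == "__main__":
    for name, ok in run_demo().items():
        print(f"{name}: {'importable' if ok else 'NOT importable'}")
```

In the all-implicit scenario both sub-packages import. In the mixed scenario the copy with `__init__.py` is picked as the package and the sub-package that sits next to the implicit copy is no longer found.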
We cannot mount the provider package sources inside the installed airflow, because when `--use-airflow-version` is used, airflow is installed dynamically inside the container, after the container has started.

This PR solves the problem by adding an env variable that makes the initialization script remove the installed `airflow.providers` folder after installing airflow and link the "providers/src/airflow/providers" folder in its place. This has the added benefit that all providers (including the preinstalled ones) are used from "main" sources rather than from installed packages. That was a problem with the previous way of using providers from sources, which relied on both the `airflow.providers` in the site library and the one in sources being implicit namespace packages.
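The remove-and-link step can be sketched as follows. This is an illustration only: the actual change in this PR lives in the CI entrypoint shell script, and the function name `link_providers_from_sources` and the temporary directory layout below are assumptions made for the example, not names taken from the PR.

```python
import shutil
import tempfile
from pathlib import Path


def link_providers_from_sources(installed_airflow: Path, provider_sources: Path) -> Path:
    """Replace <installed_airflow>/providers with a symlink to provider_sources.

    Hypothetical helper: mirrors the idea of removing the installed
    `airflow/providers` folder and linking the source folder in its place.
    """
    target = installed_airflow / "providers"
    if target.is_symlink():
        # A link from an earlier run: remove the link, not what it points to.
        target.unlink()
    elif target.exists():
        # The providers folder that came with the installed distribution.
        shutil.rmtree(target)
    target.symlink_to(provider_sources, target_is_directory=True)
    return target


def demo() -> bool:
    """Run the helper against a throwaway layout and check the result."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-ins for the site directory and the mounted provider sources.
        installed = Path(tmp, "site-packages", "airflow")
        sources = Path(tmp, "providers", "src", "airflow", "providers")
        (installed / "providers" / "old_provider").mkdir(parents=True)
        (sources / "new_provider").mkdir(parents=True)

        link = link_providers_from_sources(installed, sources)
        return (
            link.is_symlink()
            and (link / "new_provider").is_dir()
            and not (link / "old_provider").exists()
        )


if __name__ == "__main__":
    print("providers now come from sources:", demo())
```

Because the link replaces the whole folder, only one copy of the providers package is visible under the installed airflow, so the question of mixing implicit and explicit copies does not arise.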
Very interesting :). I've learned a bit more about namespace packages.
Found it while working on #43556
Co-authored-by: GPK <[email protected]>
This is definitely a great catch, @potiuk :) Thank you.
gopidesupavan approved these changes Nov 4, 2024
eladkal approved these changes Nov 4, 2024
potiuk deleted the fix-iterative-testing-of-compatibility-with-providers branch November 4, 2024 12:01
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
…ws (apache#43617)
* Enable back iterative development of latest providers with old airflows
* Update Dockerfile.ci
* Update scripts/docker/entrypoint_ci.sh
Co-authored-by: GPK <[email protected]>