Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-378] integrate setuptools in Maven build #537

Closed
wants to merge 81 commits into from
Closed

[BEAM-378] integrate setuptools in Maven build #537

wants to merge 81 commits into from

Conversation

wikier
Copy link
Member

@wikier wikier commented Jun 27, 2016

This PR provide an initial integration of the Python SDK in the Maven build, relying on the exec-maven-plugin to call setuptools (BEAM-378).

Notice I keep out install on purpose, because I guess it may need discussion.


Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@wikier
Copy link
Member Author

wikier commented Jun 27, 2016

Regarding CI, ASF Jenkins would need some extra configuration; but I guess Travis build wouldn't work until they implement travis-ci/travis-ci#4090.

@davorbonaci
Copy link
Member

Thanks Sergio!

@silviulica, @aaltay, what do you think?

@silviulica
Copy link
Contributor

Thanks Sergio! Any help on doing Maven stuff is more than welcome. We need to keep the tests passing though:

  1. setup.py contains a reference to apache_beam.get_version() this will not work since the file itself is responsible to install the apache_beam package. This is the reason the code in there is loading manually the version.py file and executing it. You will need to do something similar.
  2. Most tests should start with doing a 'pip install apache_beam...' or more likely 'python setup.py test' on the source code. So we should clarify that in the Maven install section. This is probably what you meant by leaving out the install.
  3. keeping the version of a Python package in pom.xml files seems very non-Pythonic to me. If there is precedence for it then so be it but otherwise I would prefer to have it in the version.py file and perhaps reference it (if possible) in the pom.xml. Or keep it in two places (not-ideal but may fly for a while).
  4. Also need to decide on a versioning naming/numbering scheme.
  5. setup.py will have to contain the pom.xml as a package data (and get it installed) because the get_version() function is called at runtime.

I suggest as a way to make progress to evolve this CL into setting up the Maven infrastructure w/o doing changes to the version.py machinery: add the pom.xml file, add setup.py code to install it with the package so get_version() works at runtime, corrections on names.

@aaltay
Copy link
Member

aaltay commented Jun 27, 2016

Hi Sergio,

Regarding CI, ASF Jenkins would need some extra configuration; but I guess Travis build wouldn't work until they implement travis-ci/travis-ci#4090.

Travis should work currently and it is already running Python tests. We do not need the above bug to be fixed. Because python is already installed in every Travis worker image already (https://docs.travis-ci.com/user/ci-environment/#Runtimes) And we also run an apt-get install to install 2.7 version explicitly.

@@ -32,6 +32,8 @@ addons:
apt:
packages:
- python2.7
- python-setuptools
- python-dill
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should not require of adding additional packages here. setuptools is already part of the Python distribution for Python 2 >=2.7.9 and setup.py requires (will install) dill.

@wikier
Copy link
Member Author

wikier commented Jun 28, 2016

@silviulica, I've tried to address the version issue in the last commits. The new approach consists on reading the version from pom.xml during setup, and update apache_beam.__version__ for later static usage with no other complex operations at runtime.

Thanks @aaltay, I think CI should be fine now with respect to this PR, but still fails because setup issue of the overall python tool suite. We may need to take a look to it with more time.

return global_names['__version__']
# Reads the actual version from pom.xml file, and synchronizes
# apache_beam.__version__ field for later usage.
def sync_version():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think you need this if the get_version_from_distribution() really works. you can use get_version_from_pom in setup.py and use get_version_from_distribution() in all the other places. This way you avoid the somewhat fragile rewriting the _init.py file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is I didn't get get_version_from_distribution() really working out of pip-controlled environments. So, since it's not a pure setuptools feature, I'm more think about removing such function.

@silviulica
Copy link
Contributor

Hi Sergio! We are getting close here. Thanks very much for your effort. Besides the comments in line can you please explain what is the purpose of the pom.xml? Perhaps as a comment in the pom.xml itself. It is needed to integrate with Apache build/test processes but this is where my knowledge ends. It would be nice to have this comment for people to understand why it is needed especially since it is unusual for a Python package.

@wikier wikier changed the title BEAM-378: integrate setuptools in Maven build [BEAM-378] integrate setuptools in Maven build Jun 29, 2016
@wikier
Copy link
Member Author

wikier commented Aug 9, 2016

@aaltay, after merging python-sdk from upstream, we still have the same issue in this branch:

[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (beam-exec-java) on project beam-runners-flink_2.10-examples: The parameters 'mainClass' for goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java are missing or invalid -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java (beam-exec-java) on project beam-runners-flink_2.10-examples: The parameters 'mainClass' for goal org.codehaus.mojo:exec-maven-plugin:1.4.0:java are missing or invalid

I've tried to get rid of the java goal in beam-runners-flink_2.10-examples with the same trick I'm using in beam-sdks-python, but it does not work... Maybe we could involve the guys from #724 (@mxm) to figure out what's the lateral issue here.

@aaltay
Copy link
Member

aaltay commented Aug 9, 2016

@wikier

I got both python-sdk and flink examples running locally. See this commit: aaltay@a1df1f5

It is based on yours. Except that I used to disable execution. I noticed the flink was using this too. And adding an id there broke their disable mechanism. With this both places can use the same way of disabling.

(In this commit I changed 1 place related to maven version to run locally with my old maven installation. That part should not be submitted.)

Could you try this please?

@aaltay
Copy link
Member

aaltay commented Aug 10, 2016

I commented on the RAT issue as well. Hopefully applying both will pass all the tests. Fingers crossed. @wikier Thank you again.

@wikier
Copy link
Member Author

wikier commented Aug 10, 2016

Cool, @aaltay! That trick works better, actually avoiding inheritance in exec-maven-plugin executions previously configure.

Also reverted nose to setup_requires, and properly configured RAT exclusion.

So now the CI system should be, finally, happy 😄

Thanks so much for your time with these last details. I hope the overall idea of this PR would be useful for the podling.

@wikier
Copy link
Member Author

wikier commented Aug 10, 2016

@aaltay I've patch the RAT configuration, but Jenkins reports unknown licenses I'm not able to reproduce locally...

@aaltay
Copy link
Member

aaltay commented Aug 11, 2016

@wikier

The error is because nose-1.3.7-py2.7.egg folder is created directly under python and not under .egg that is why RAT is still complaining about nose related files.

I cloned your branch and make this single line change: 232590e

It fixes the problem. I also made a temporary PR (#814) to test this with the actual thing Jenkins is running, and the tests passed.

I believe if you apply this change, all tests will pass.

@wikier
Copy link
Member Author

wikier commented Aug 11, 2016

Looks like the version of setuptools I have locally installed and the one available in Jenkins differ in that behavior. Thanks for the pointed, @aaltay!

So, finally, this PR is ready for discussion 😄. Should I bring it to dev@beam or better someone from the core team?

@aaltay
Copy link
Member

aaltay commented Aug 11, 2016

LGTM.

R: @dhalperi for a final review.

Thank you @wikier. It is great that your change gives us integration with maven.

That said there are a few issues with the version management part. These should NOT block the merging of this PR. I will work on a follow up PR right after this is merged to address those.

As a note the issues are get_version_from_pom and sync_version are not called from anywhere. And I will fix the code to call them in my PR.

@wikier
Copy link
Member Author

wikier commented Aug 12, 2016

Perfect. Let me know if you need anything else, @dhalperi.

Back to the version issue @aaltay, if you read the thread above, I initially proposed to align versions. Then the feedback I got from @silviulica was that you didn't want to do that for now. So I kept the helper functions I implemented for whenever such issue would be revisited. Anyway I'd handle that in a separate PR, I've created BEAM-547 for that.

@aaltay
Copy link
Member

aaltay commented Aug 12, 2016

@wikier Sounds good. I remember it was discussed earlier in the thread. Let's get this PR merged and let's move the version discussion to the https://issues.apache.org/jira/browse/BEAM-547. Thank you for opening that issue.

asfgit pushed a commit that referenced this pull request Aug 15, 2016
@dhalperi
Copy link
Contributor

Merged manually, please close PR at your leisure @wikier

@aaltay is going to fixup some things I broke during the merge. But it should be okay.

Thanks again Sergio!

@wikier
Copy link
Member Author

wikier commented Aug 16, 2016

Ah ok, merged in 76de610. You are welcome ;-)

@wikier wikier closed this Aug 16, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants