Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Support loading of TF models with saved weights #25496

Merged
merged 13 commits into from
Feb 22, 2023

Conversation

riteshghorse
Copy link
Contributor

@riteshghorse riteshghorse commented Feb 15, 2023

Adds support for loading tensorflow model from saved weights given a function to create the model.
Follow up to #25368, Closes #25366


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

@codecov
Copy link

codecov bot commented Feb 15, 2023

Codecov Report

Merging #25496 (deaa4df) into master (921bc7b) will increase coverage by 0.60%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #25496      +/-   ##
==========================================
+ Coverage   72.20%   72.81%   +0.60%     
==========================================
  Files         772      751      -21     
  Lines      102264    99590    -2674     
==========================================
- Hits        73838    72514    -1324     
+ Misses      26992    25715    -1277     
+ Partials     1434     1361      -73     
Flag Coverage Δ
python 81.98% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...xamples/inference/tensorflow_mnist_with_weights.py 0.00% <0.00%> (ø)
...n/apache_beam/ml/inference/tensorflow_inference.py 0.00% <0.00%> (ø)
...ks/go/pkg/beam/runners/dataflow/dataflowlib/job.go 24.59% <0.00%> (-4.71%) ⬇️
sdks/go/pkg/beam/io/filesystem/gcs/gcs.go 7.29% <0.00%> (-4.17%) ⬇️
sdks/go/pkg/beam/core/runtime/exec/sdf_invokers.go 72.14% <0.00%> (-4.17%) ⬇️
sdks/python/apache_beam/utils/interactive_utils.py 95.12% <0.00%> (-2.44%) ⬇️
sdks/go/pkg/beam/core/funcx/output.go 85.71% <0.00%> (-1.25%) ⬇️
sdks/go/pkg/beam/io/parquetio/parquetio.go 59.55% <0.00%> (-0.89%) ⬇️
.../apache_beam/runners/direct/transform_evaluator.py 89.57% <0.00%> (-0.76%) ⬇️
sdks/go/pkg/beam/core/graph/fn.go 84.40% <0.00%> (-0.71%) ⬇️
... and 35 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@riteshghorse riteshghorse marked this pull request as ready for review February 16, 2023 03:14
@riteshghorse riteshghorse marked this pull request as draft February 16, 2023 03:15
@riteshghorse riteshghorse marked this pull request as ready for review February 16, 2023 15:27
@riteshghorse riteshghorse changed the title [Python] Support loading models with saved weights [Python] Support loading of TF models with saved weights Feb 16, 2023
@riteshghorse
Copy link
Contributor Author

Python_PVR_Flink failed because of licensing issue. Bruno has a PR at #25515

@riteshghorse
Copy link
Contributor Author

R: @damccorm @jrmccluskey

@riteshghorse
Copy link
Contributor Author

Run Python_PVR_Flink PreCommit

@damccorm
Copy link
Contributor

Run Python 3.8 PostCommit

@riteshghorse
Copy link
Contributor Author

Run Python 3.8 PostCommit

@damccorm
Copy link
Contributor

I think you need to pull in master to avoid postcommit failures (fixed by #25446)

@riteshghorse
Copy link
Contributor Author

Run Python 3.8 PostCommit

@riteshghorse
Copy link
Contributor Author

Run Python_Dataframes PreCommit

@riteshghorse
Copy link
Contributor Author

Run Whitespace PreCommit

damccorm added a commit that referenced this pull request Feb 17, 2023
@riteshghorse
Copy link
Contributor Author

Run Python 3.8 PostCommit

@github-actions
Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @tvalentyn for label python.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@@ -286,6 +286,7 @@ def get_portability_package_data():
'pytest-xdist>=2.5.0,<3',
'pytest-timeout>=2.1.0,<3',
'scikit-learn>=0.20.0',
'tensorflow>=1.0.0',
Copy link
Contributor

@tvalentyn tvalentyn Feb 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(drive-by comment): let's not depend on a large dependency like this by default. you can see how we deal with sklearn/pytorch in similar circumstances.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @tvalentyn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented at #25547 (comment).

We can follow the approach of sklearn/torch/tensorRT gradle tests and not add very large ML dependencies to the setup.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see - so this was running on 3.7 and 3.9, just not 3.8 because of the difference in dependencies here -

dependsOn(":sdks:python:test-suites:direct:py37:inferencePostCommitIT")

Thanks for clarifying. I'm now +1 on removing this and manually triggering Python 3.7 postcommits, sorry for my confusion @riteshghorse


A couple of follow up questions for @AnandInguva and @tvalentyn :

  1. Is there any reason we only run the dataflow versions on 3.8? Since TensorRT is on Dataflow, we now need to run both the 3.7 and 3.8 postcommits to fully exercise our inference code base
  2. Is there any reason we're not running most of our inference postcommits on dataflow (as well as the direct runner)? That's probably where most of our usage is today

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. TensoRT tests run on Python 3.8 because of the container image provided by TensorRT folks contains 3.8. We need to have an automated process of building image from Dockerfile for Python 3.7, 3.9, 3.10 and then use that image for the tests in Dataflow.
  2. Initially, these tests are very light weight IT tests, so we just run on the DirectRunner. apart from that, I don't see any reason why we didn't run on Dataflow. Maybe adding these postcommit tests to Dataflow suite would increase the total time of Dataflow suite tests in the PostCommit suite.

@riteshghorse
Copy link
Contributor Author

Run Python 3.7 PostCommit

@riteshghorse
Copy link
Contributor Author

Run Python Unit Tests

@riteshghorse
Copy link
Contributor Author

Run Python 3.9 PostCommit

@damccorm
Copy link
Contributor

3.7 has passing tensorflow tests. I'll wait until all active suites complete and merge

image

@damccorm
Copy link
Contributor

(or you can merge)

@damccorm
Copy link
Contributor

Run Python_Coverage PreCommit

@damccorm
Copy link
Contributor

(all other checks have passed, they're just not statusing)

@riteshghorse
Copy link
Contributor Author

All checks passed. Merging!

@riteshghorse riteshghorse merged commit 33750c1 into apache:master Feb 22, 2023
ruslan-ikhsan pushed a commit to akvelon/beam that referenced this pull request Mar 10, 2023
* load model with weight

* example

* update test

* update test

* make create model fn optional

* change tf to tensorflow

* add readme and change urls

* fix whitespace

* add doc and changes.md

* add tensorflow dependency

* remove tf dependency
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Implement a native Tensorflow model hander
4 participants