Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix failing Py37 BQ file loads test #23334

Merged
merged 5 commits into from
Oct 6, 2022

Conversation

ahmedabu98
Copy link
Contributor

@ahmedabu98 ahmedabu98 commented Sep 22, 2022

Fixes #23264

In #23012, testing with on_success_matcher is switched out for hamcrest_assert(p, bq_matcher) because it is less noisy and more accurate.

However, PostCommits test with TestDirectRunner. The problem is that TestDirectRunner does not run the pipeline without the on_success_matcher argument.

Error results because hamcrest_assert queries a table that the pipeline is supposed to create. Pipeline doesn't run so results in 404 error.

@codecov
Copy link

codecov bot commented Sep 22, 2022

Codecov Report

Merging #23334 (a123490) into master (9287b92) will increase coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23334      +/-   ##
==========================================
+ Coverage   73.45%   73.49%   +0.04%     
==========================================
  Files         718      718              
  Lines       95526    95782     +256     
==========================================
+ Hits        70170    70397     +227     
- Misses      24060    24089      +29     
  Partials     1296     1296              
Flag Coverage Δ
python 83.20% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...s/interactive/dataproc/dataproc_cluster_manager.py 71.72% <0.00%> (-5.70%) ⬇️
.../python/apache_beam/testing/test_stream_service.py 88.09% <0.00%> (-4.77%) ⬇️
sdks/python/apache_beam/typehints/schemas.py 91.89% <0.00%> (-1.96%) ⬇️
...che_beam/runners/interactive/interactive_runner.py 90.50% <0.00%> (-0.89%) ⬇️
sdks/python/apache_beam/coders/coders.py 87.36% <0.00%> (-0.65%) ⬇️
...pache_beam/runners/interactive/interactive_beam.py 81.70% <0.00%> (-0.64%) ⬇️
sdks/python/apache_beam/dataframe/io.py 89.06% <0.00%> (-0.47%) ⬇️
...on/apache_beam/runners/dataflow/dataflow_runner.py 82.91% <0.00%> (-0.09%) ⬇️
setup.py 0.00% <0.00%> (ø)
sdks/python/apache_beam/metrics/cells.py 82.11% <0.00%> (ø)
... and 24 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ahmedabu98 ahmedabu98 force-pushed the confirm_bq_fl_streaming_fix branch from bbbada2 to 07e572a Compare September 26, 2022 15:44
@ahmedabu98 ahmedabu98 marked this pull request as ready for review September 27, 2022 20:31
@ahmedabu98
Copy link
Contributor Author

R: @Abacn
postcommits failing for other reasons

@github-actions
Copy link
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

@Abacn
Copy link
Contributor

Abacn commented Sep 27, 2022

R: @Abacn postcommits failing for other reasons

Python PostCommit timeout possibly related to #23014 (comment)

@Abacn
Copy link
Contributor

Abacn commented Sep 29, 2022

Changes only involve file_load_it_test and the other failed bigquery it tests (json_it) should be irrelevant.

Could you please explain a little of the fix---does it mean #23012 introduced some breaking change, or just the failed test did not configure correctly at the beginning?

@@ -162,6 +162,8 @@ tasks.register("directRunnerIT") {
"apache_beam/io/gcp/bigquery_test.py::BigQueryStreamingInsertTransformIntegrationTests::test_multiple_destinations_transform",
"apache_beam/io/gcp/bigquery_test.py::PubSubBigQueryIT",
"apache_beam/io/gcp/bigquery_file_loads_test.py::BigQueryFileLoadsIT::test_bqfl_streaming",
"apache_beam/io/gcp/bigquery_file_loads_test.py::BigQueryFileLoadsIT::test_bqfl_streaming_with_dynamic_destinations",
"apache_beam/io/gcp/bigquery_file_loads_test.py::BigQueryFileLoadsIT::test_one_job_fails_all_jobs_fail",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood correctly this line should refer to test_bqfl_streaming_with_copy_jobs (see skipped test in https://ci-beam.apache.org/job/beam_PostCommit_Python37_PR/444/testReport/apache_beam.io.gcp.bigquery_file_loads_test/BigQueryFileLoadsIT/)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Good catch

@ahmedabu98
Copy link
Contributor Author

Could you please explain a little of the fix---does it mean #23012 introduced some breaking change, or just the failed test did not configure correctly at the beginning?

No breaking change from #23012 AFAIK. The test here was failing because in #23012 the test was changed to use hamcrest_assert() instead of on_success_matcher. For reasons unknown to me, on_success_matcher is needed when running the test using ./run_integration_test.sh (which is what Postcommits are doing), otherwise the pipeline in the test never runs.

Copy link
Contributor

@Abacn Abacn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ahmedabu98
Copy link
Contributor Author

Updated the PR description, root problem lies in TestDirectRunner not running the pipeline without on_success_matcher. Looking into why.

@ahmedabu98
Copy link
Contributor Author

So looks like TestDirectRunner uses either .wait_until_finish() for batch pipelines or on_success_matcher() for streaming pipelines when it runs. Otherwise, the try block is a noop and the pipeline is canceled in the following finally block.

In other words, it is by design that TestDirectRunner needs on_streaming_matcher() in pipeline options for streaming pipelines.

@johnjcasey
Copy link
Contributor

Got it, that tracks

@Abacn
Copy link
Contributor

Abacn commented Oct 5, 2022

FYI we can temporarily comment out test suites in PostCommit to get results more quickly and avoid burning out quota, similar to this: a3158ea

@github-actions github-actions bot added the build label Oct 5, 2022
@ahmedabu98
Copy link
Contributor Author

Thanks @Abacn

@ahmedabu98
Copy link
Contributor Author

Run Python 3.8 PostCommit

@ahmedabu98
Copy link
Contributor Author

Run Python 3.7 PostCommit

@ahmedabu98
Copy link
Contributor Author

Tests are passing now: Python 3.7 and Python 3.8

@github-actions github-actions bot removed the build label Oct 5, 2022
@ahmedabu98
Copy link
Contributor Author

Run Python PreCommit

@johnjcasey johnjcasey merged commit c8075de into apache:master Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants