[Bug]: Number of Python examples are failing for Flink and Spark on 2.43.0 release branch #23907
Comments
P0 since this is blocking the ongoing Beam release.
Did these pass on the previous release? If so, the errors should be bisectable.
Thanks, @Abacn. The first change affects the Go SDK only. #23012 looks suspicious. @ahmedabu98 could you please take a look?
Confirmed that this passes on the 2.42.0 branch. Will also try a revert.
Fails for commit ac37784.
Passes for commit 2d4f61c (the one before it).
So the culprit seems to be ac37784. @ahmedabu98 can you please take a look? We can either do a forward fix in the release branch or revert this, if it reverts cleanly.
To reproduce locally, you can run the command below.
./gradlew :sdks:python:test-suites:portable:py37:flinkExamples
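For anyone repeating the bisection by hand, here is a minimal sketch of the search in Python. It assumes the commits are listed oldest to newest, that the first one is known good and the last one known bad, and that every commit after the first bad one also fails. The function name, the placeholder commit ids (`c0`, `c1`, `c4`, `c5`) and the `is_bad` predicate are hypothetical; only ac37784 and 2d4f61c come from this thread.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest failing commit.

    Assumes commits[0] is good, commits[-1] is bad, and that once a
    commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


# Hypothetical history; only the two middle ids appear in this issue.
history = ["c0", "c1", "2d4f61c", "ac37784", "c4", "c5"]

# Stand-in predicate. In practice this would check out the commit and
# run the failing example suite, treating a non-zero exit as "bad".
bad_from = history.index("ac37784")
culprit = first_bad_commit(history, lambda c: history.index(c) >= bad_from)
print(culprit)
```

`git bisect` automates the same search against a real repository, so this is only meant to show the reasoning behind "passes for 2d4f61c, fails for ac37784".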
Seeing a lot of
Looks like it is trying to get the Spark 3 job server from the Spark 2 directory: see
Sorry @Abacn @ahmedabu98 @chamikaramj, my bad! This slipped in with #23751 when migrating to the Spark 3 job-server. Fixed it here: #23936
Great 👍🏽 that solves the Spark issues. Still looking into why those BQ tests are not starting.
The affected tests have a step that writes to BQ with
The Spark examples run in @mosche's fix in #23936 (here) now show the same error:
Update: this fails when the pipeline writes with
Update #2: it's actually not the BigqueryMatcher; it happens when test args are passed into the Pipeline() instantiation (e.g. here)
The error is caused by this
This is not an issue for DirectRunner and DataflowRunner, but is caught by Flink and Spark.
Hmm, that's strange. I don't think Flatten requires the input PCollections to be non-empty, but there might be an existing Flink/Spark bug here.
#23954 is a workaround. The relevant BQ tests in that PR are passing now, though other tests in the Flink and Spark example suites started failing recently (just a day ago).
What happened?
It seems that the following Python examples are failing for Flink and Spark on the 2.43.0 release branch.
For example,
https://ci-beam.apache.org/job/beam_PostCommit_Python_Examples_Spark_PR/14/
https://ci-beam.apache.org/job/beam_PostCommit_Python_Examples_Flink_PR/12/
For bigquery_tornadoes, the error is as follows:
RuntimeError: Pipeline BeamApp-jenkins-1028002928-4325f8c5_d8258ff0-6727-4335-867f-56bc846d9f3e failed in state FAILED: java.lang.IllegalArgumentException: PCollectionNodes [PCollectionNode{id=ref_PCollection_PCollection_52, PCollection=unique_name: "61Write/BigQueryBatchFileLoads/TriggerLoadJobsWithoutTempTables.None"
coder_id: "ref_Coder_FastPrimitivesCoder_3"
is_bounded: BOUNDED
windowing_strategy_id: "ref_Windowing_Windowing_1"
}] were consumed but never produced
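As a rough illustration of what "consumed but never produced" means, here is a simplified Python sketch of that kind of graph check. It is not the runner's actual validation code: the dictionary layout, the function name and the PCollection ids are all hypothetical, and it assumes a pipeline can be modeled as transforms that each list their input and output PCollection ids.

```python
from typing import Dict, List, Set


def consumed_but_never_produced(
    transforms: Dict[str, Dict[str, List[str]]]
) -> Set[str]:
    """Return PCollection ids that some transform reads but none writes."""
    produced: Set[str] = set()
    consumed: Set[str] = set()
    for spec in transforms.values():
        produced.update(spec.get("outputs", []))
        consumed.update(spec.get("inputs", []))
    return consumed - produced


# Hypothetical pipeline: "Flatten" reads pc_orphan, but no transform
# in the graph lists pc_orphan as an output.
example_transforms = {
    "Read": {"inputs": [], "outputs": ["pc_rows"]},
    "Flatten": {"inputs": ["pc_rows", "pc_orphan"], "outputs": ["pc_flat"]},
    "Write": {"inputs": ["pc_flat"], "outputs": []},
}

missing = consumed_but_never_produced(example_transforms)
print(sorted(missing))
```

In this toy graph the check flags `pc_orphan`, which is the same shape of problem the exception reports for the `TriggerLoadJobsWithoutTempTables` PCollection.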
I found #21300, which probably explains some of the example failures, but it doesn't explain all of the failures above.
Valentyn, are there any known issues that explain these failures?
Issue Priority
Priority: 0
Issue Component
Component: examples-python