What needs to happen?
Motivation: Beam is currently on Parquet 1.12.0. The latest Parquet release, 1.13.1, contains bug fixes and new features, such as the In predicate and built-in logical type support for the parquet-avro binding. See more here: apache/parquet-java@apache-parquet-1.12.0...apache:apache-parquet-1.13.1
Risks: Parquet 1.13.1 contains a few dependency bumps, including:
Hadoop 2.10.1 -> 3.2.3
Jackson 2.11.4 -> 2.13.4
Avro 1.10.1 -> 1.11.1
From what I understand, the Avro upgrade should be safe now that Beam has modularized AvroIO and its subcomponents. I've also run test Beam/Scio jobs on Hadoop 3 without issues.
Are there any other blockers to upgrading Parquet? I recall that about a year ago there was a blocker involving the gcs-connector library upgrade, but that seems to have been resolved, and all bigdataoss libraries are now up to date.
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam YAML
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner