This repository has been archived by the owner on Nov 11, 2022. It is now read-only.

Version 1.5.0

@davorbonaci davorbonaci released this 15 Mar 03:09
· 710 commits to master since this release

With this release, we have begun preparing the Dataflow SDK for Java for an eventual move to Apache Beam (incubating). Specifically, we have refactored a number of internal APIs and removed classes from the SDK that are used only within the worker; the Google Cloud Dataflow service now provides these classes during job execution. This refactoring should not affect any user code.

Additionally, the 1.5.0 release includes the following changes:

  • Enabled an indexed side input format for batch pipelines executed on the Google Cloud Dataflow service. Indexed side inputs significantly increase performance for View.asList, View.asMap, View.asMultimap, and any non-globally-windowed PCollectionViews.
  • Upgraded to Protocol Buffers version 3.0.0-beta-1. If you use custom Protocol Buffers, you should recompile them with the corresponding version of the protoc compiler. You can continue using both version 2 and 3 of the Protocol Buffers syntax, and no user pipeline code needs to change.
  • Added ProtoCoder, which is a Coder for Protocol Buffers messages that supports both version 2 and 3 of the Protocol Buffers syntax. This coder can detect when messages can be encoded deterministically. Proto2Coder is now deprecated; we recommend that all users switch to ProtoCoder.
  • Added withoutResultFlattening to BigQueryIO.Read to disable flattening query results when reading from BigQuery.
  • Added BigtableIO, enabling support for reading from and writing to Google Cloud Bigtable.
  • Improved CompressedSource to detect the compression format from the file extension. Added support for reading .gz files that the underlying transport logic transparently decompresses.
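
The CompressedSource change above can be illustrated with a small, standalone sketch. This is not the SDK's implementation: the class `ExtensionBasedDecompression` and all of its methods are hypothetical names, and the code uses only the Java standard library. It assumes one plausible approach, namely choosing the format from the file name and then checking the two gzip magic bytes, so that a .gz file whose contents were already decompressed in transit is passed through unchanged.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for illustration only; not part of the Dataflow SDK. */
class ExtensionBasedDecompression {

  /** Compression formats this sketch distinguishes. */
  enum Mode { GZIP, UNCOMPRESSED }

  /** Picks a mode from the file name alone; no bytes are inspected here. */
  static Mode detect(String fileName) {
    return fileName.toLowerCase(Locale.ROOT).endsWith(".gz") ? Mode.GZIP : Mode.UNCOMPRESSED;
  }

  /**
   * Returns a stream of decompressed bytes. A .gz file that does not start
   * with the gzip magic bytes (0x1f 0x8b) is assumed to have been
   * decompressed already and is returned as-is.
   */
  static InputStream open(String fileName, InputStream raw) throws IOException {
    if (detect(fileName) != Mode.GZIP) {
      return raw;
    }
    BufferedInputStream buffered = new BufferedInputStream(raw);
    buffered.mark(2);            // remember the start so the peek can be undone
    int first = buffered.read();
    int second = buffered.read();
    buffered.reset();            // rewind to the first byte
    boolean hasGzipMagic = first == 0x1f && second == 0x8b;
    return hasGzipMagic ? new GZIPInputStream(buffered) : buffered;
  }

  /** Reads the whole stream as UTF-8 text. */
  static String readAll(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[4096];
    int n;
    while ((n = in.read(buffer)) != -1) {
      out.write(buffer, 0, n);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  /** Gzips a string in memory so the example needs no files on disk. */
  static byte[] gzip(String text) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return bytes.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] compressed = gzip("one line of input");
    byte[] alreadyPlain = "one line of input".getBytes(StandardCharsets.UTF_8);
    // Both calls yield the same text, whether or not the bytes are still gzipped.
    System.out.println(readAll(open("input.gz", new ByteArrayInputStream(compressed))));
    System.out.println(readAll(open("input.gz", new ByteArrayInputStream(alreadyPlain))));
  }
}
```

Peeking at the magic bytes is a design choice of this sketch; the release notes only state that such files are supported, not how the SDK recognizes them.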