[BEAM-86] Add CountingInput as a PTransform #10

Closed
tgroh wants to merge 2 commits from the counting_source_as_transform branch

Conversation

@tgroh (Member) commented Mar 2, 2016

This transform produces an unbounded PCollection containing longs based
on a CountingSource.

Deprecate methods producing a Source in CountingSource.
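
For context, a minimal usage sketch of the new transform (pipeline boilerplate and the exact package names are assumptions based on the Beam Java SDK layout, not taken from this PR's diff):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.CountingInput;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class CountingInputExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // A bounded PCollection containing 0, 1, ..., 99.
    PCollection<Long> bounded = p.apply(CountingInput.upTo(100L));

    // An unbounded PCollection of longs, backed by an unbounded CountingSource.
    PCollection<Long> unbounded = p.apply(CountingInput.unbounded());

    p.run();
  }
}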

@tgroh (Member, Author) commented Mar 2, 2016

R: @dhalperi

* Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
* from {@code 0} to {@code numElements - 1}.
*/
public static PTransform<PBegin, PCollection<Long>> upTo(long numElements) {

Inline review comment (Contributor):

return type -- BoundedCountingInput?

Reply from @tgroh (Member, Author):

Done.
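
For readers skimming the thread: the requested change presumably narrows the declared return type from the raw PTransform to the concrete BoundedCountingInput wrapper. A hypothetical sketch of the revised method (the checkArgument line is an assumption, not quoted from the diff; it mirrors the Guava Preconditions style used elsewhere in the SDK):

/**
 * Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
 * from {@code 0} to {@code numElements - 1}.
 */
public static BoundedCountingInput upTo(long numElements) {
  // Assumed validation: reject non-positive counts before constructing the transform.
  checkArgument(numElements > 0, "numElements (%s) must be positive", numElements);
  return new BoundedCountingInput(numElements);
}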

@tgroh (Member, Author) commented Mar 3, 2016

I haven't added a checkNotNull to the constructor, as it's private and all of the withers do the assertion directly. Can still add if required.

Other than that, done.
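
To make the point concrete, a small self-contained illustration of the pattern being described (class, field, and method names here are hypothetical, not from the PR): the constructor stays private and skips checkNotNull because every public wither validates its argument first.

import static com.google.common.base.Preconditions.checkNotNull;

import java.io.Serializable;

final class UnboundedInputSketch implements Serializable {
  private final Serializable timestampFn;

  // Private constructor: no checkNotNull here, since the withers below have
  // already validated their arguments before delegating.
  private UnboundedInputSketch(Serializable timestampFn) {
    this.timestampFn = timestampFn;
  }

  static UnboundedInputSketch create() {
    return new UnboundedInputSketch(null);
  }

  UnboundedInputSketch withTimestampFn(Serializable timestampFn) {
    checkNotNull(timestampFn, "timestampFn must not be null");
    return new UnboundedInputSketch(timestampFn);
  }
}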

* PTransform<PBegin, PCollection<Long>> producer = CountingInput.unbounded();
* // Or, to create an unbounded source that uses a provided function to set the element timestamp.
* PTransform<PBegin, PCollection<Long>> producer =
* CountingInput.unbounded().withTimestampFn(someFn);

Inline review comment (Contributor):

mirror the javadoc improvements here?

Reply from @tgroh (Member, Author):

Done
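
Mirroring the snippet above, a usage sketch that supplies a concrete timestamp function (imports omitted to match the snippet's style; the SerializableFunction<Long, Instant> signature and the function body are assumptions for illustration):

// Interpret each element's value as epoch milliseconds for its event timestamp.
PTransform<PBegin, PCollection<Long>> producer =
    CountingInput.unbounded()
        .withTimestampFn(
            new SerializableFunction<Long, Instant>() {
              @Override
              public Instant apply(Long input) {
                return new Instant(input);
              }
            });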

@dhalperi (Contributor) commented Mar 3, 2016

LGTM

@tgroh force-pushed the counting_source_as_transform branch from 7d1c881 to 91ee2da on March 3, 2016 at 17:45
@tgroh (Member, Author) commented Mar 3, 2016

Squashed together all of the commits.

@tgroh force-pushed the counting_source_as_transform branch from 91ee2da to de45ae3 on March 3, 2016 at 19:49
tgroh added 2 commits March 3, 2016 13:16
This transform produces an unbounded PCollection containing longs based
on a CountingSource.

Deprecate methods producing a Source in CountingSource.
@tgroh force-pushed the counting_source_as_transform branch from de45ae3 to f041893 on March 3, 2016 at 21:16
@tgroh (Member, Author) commented Mar 3, 2016

Closed by 5a7bd80

@tgroh closed this Mar 3, 2016
davorbonaci added a commit to GoogleCloudPlatform/DataflowJavaSDK that referenced this pull request Mar 4, 2016
cosmoskitten pushed a commit to cosmoskitten/beam that referenced this pull request Apr 10, 2017
aljoscha pushed a commit to aljoscha/beam that referenced this pull request Mar 14, 2018
charlesccychen pushed a commit to cosmoskitten/beam that referenced this pull request Apr 6, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
apache#10 Improve documentation around URI based data-sources/-sinks
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
tvalentyn pushed a commit to tvalentyn/beam that referenced this pull request May 15, 2018
kennknowles pushed a commit that referenced this pull request Oct 16, 2018
pabloem pushed a commit to pabloem/beam that referenced this pull request Feb 13, 2021
pabloem pushed a commit that referenced this pull request Feb 17, 2021
usingh83 added a commit to usingh83/beam that referenced this pull request May 7, 2021
usingh83 added a commit to usingh83/beam that referenced this pull request May 13, 2021
pabloem pushed a commit that referenced this pull request May 18, 2021
ajothomas referenced this pull request in ajothomas/beam Oct 12, 2021
Merge from Master to li_trunk
robertwb added a commit to robertwb/incubator-beam that referenced this pull request Jan 8, 2022
hengfengli referenced this pull request in hengfengli/beam Mar 21, 2022
sjvanrossum pushed a commit to sjvanrossum/beam that referenced this pull request May 17, 2023