[IO] Pass client builder to debezium database history #11293

Merged
merged 14 commits into from
Aug 20, 2021

Conversation

sijie
Member

@sijie sijie commented Jul 13, 2021

an alternative approach for #11251

Member

@nlu90 nlu90 left a comment

LGTM functionally.

I'm just a little concerned about:

  1. the newly exposed API.
  2. the security of passing this object string around.

If you can explain why it is secure to do so, that would be great. Specifically, what happens if a user gets the ClientBuilder and changes certain fields while keeping the sensitive auth fields the same? Could this cause unexpected access to another Pulsar cluster/tenant/topic?

@@ -61,26 +66,34 @@
.withValidation(Field::isRequired);

public static final Field SERVICE_URL = Field.create(CONFIGURATION_FIELD_PREFIX_STRING + "pulsar.service.url")
.withDisplayName("Pulsar broker addresses")
.withDisplayName("Pulsar service url")
.withType(Type.STRING)
.withWidth(Width.LONG)
.withImportance(Importance.HIGH)
.withDescription("Pulsar service url")
.withValidation(Field::isRequired);
Member

@nlu90 nlu90 Jul 13, 2021

We can set either this serviceUrl field or the following clientBuilder field, so they may no longer need to be marked as required.
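
A minimal sketch of the either/or check suggested here, assuming the connector configuration arrives as a plain string map. The class name, the `validate` helper, and the client-builder key are hypothetical; only `database.history.pulsar.service.url` appears in this thread, and this is not the validation the PR actually ships.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of the PR.
final class HistoryConfigCheck {
    // Key named in this thread.
    static final String SERVICE_URL = "database.history.pulsar.service.url";
    // Assumed key name for the serialized client builder.
    static final String CLIENT_BUILDER = "database.history.pulsar.client.builder";

    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Accept the config when at least one of the two fields is set; reject it otherwise.
    static void validate(Map<String, String> config) {
        if (isBlank(config.get(SERVICE_URL)) && isBlank(config.get(CLIENT_BUILDER))) {
            throw new IllegalArgumentException(
                    "Either " + SERVICE_URL + " or " + CLIENT_BUILDER + " must be set");
        }
    }
}
```

With a check of this shape, neither field needs `Field::isRequired` on its own; the requirement moves to the pair.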

Contributor

@eolivelli eolivelli left a comment

Overall the approach looks good to me, it is a great idea indeed.

One question related to security:
With this change, the Function code can easily access the authentication information, because it can serialize the builder and then read the credentials.
Before this change (with the previous approach), the Function could not access the credentials; it could only use the PulsarClient, so a JWT token, for instance, could not be stolen. Now, once the Function is deployed (the auth information is set by the admin who deploys the function, not by the author of the function), it can read the credentials and send them outside the cluster.

Is this something we should care about?

@nlu90
Member

nlu90 commented Jul 28, 2021

/pulsarbot run-failure-checks

@sijie sijie changed the title (WIP) [IO] Pass client builder to debezium database history [IO] Pass client builder to debezium database history Jul 28, 2021
@sijie sijie added this to the 2.9.0 milestone Jul 28, 2021
*
* @return the instance of pulsar client builder.
*/
default ClientBuilder getPulsarClientBuilder() {
Member Author

@nlu90 You need to implement this method in the sub-classes.

Member

Working on it.

@nlu90
Member

nlu90 commented Jul 29, 2021

/pulsarbot run-failure-checks

@nlu90
Member

nlu90 commented Jul 29, 2021

/pulsarbot run-failure-checks

@nlu90
Member

nlu90 commented Jul 30, 2021

/pulsarbot run-failure-checks

Contributor

@eolivelli eolivelli left a comment

I have reviewed the patch again.
I left two comments, PTAL

@nlu90
Member

nlu90 commented Aug 2, 2021

/pulsarbot run-failure-checks

@nlu90
Member

nlu90 commented Aug 3, 2021

/pulsarbot run-failure-checks

@nlu90 nlu90 force-pushed the wip_current_context branch from 3ce2769 to 028210f Compare August 3, 2021 20:53
@hangc0276
Contributor

/pulsarbot run-failure-checks

@hangc0276
Contributor

@eolivelli Could you please help review this PR again? Thanks.

@eolivelli
Contributor

eolivelli commented Aug 5, 2021

There are still some unaddressed comments of mine:

  • do not use commons io base64 but Java util
  • create a utility method to serialize/deserialize the ClientBuilder
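
The two requests above could be sketched roughly as follows: a small utility that Java-serializes an object and encodes it with `java.util.Base64`, so it can travel as a string config value. This assumes the builder is Java-serializable, as the review discussion implies. `SerDeSketch` and `DemoBuilder` are hypothetical stand-ins so the example runs without Pulsar on the classpath; this is not the utility the PR actually added.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper; names and shape are illustrative only.
final class SerDeSketch {

    // Stand-in for a serializable client builder (assumption for this sketch).
    static final class DemoBuilder implements Serializable {
        private static final long serialVersionUID = 1L;
        final String serviceUrl;

        DemoBuilder(String serviceUrl) {
            this.serviceUrl = serviceUrl;
        }
    }

    // Java-serialize the object, then Base64-encode the bytes into a string.
    static String serialize(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Base64-decode the string, then rebuild the object from the bytes.
    static Object deserialize(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Keeping both directions in one place means every caller uses the same encoding. Note that anything able to read the encoded string can rebuild the object, which is the security concern raised earlier in this review.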

@nlu90 nlu90 force-pushed the wip_current_context branch from 028210f to 2f4919d Compare August 10, 2021 03:01
Contributor

@eolivelli eolivelli left a comment

Looks great to me.

@eolivelli
Contributor

The Debezium integration tests failed; please check.

testDebeziumMySqlSourceAvro(org.apache.pulsar.tests.integration.io.sources.debezium.PulsarDebeziumSourcesTest)  Time elapsed: 56.866 s  <<< FAILURE!
java.lang.AssertionError: expected [1] but found [0]
	at org.testng.Assert.fail(Assert.java:99)
	at org.testng.Assert.failNotEquals(Assert.java:1037)
	at org.testng.Assert.assertEqualsImpl(Assert.java:140)
	at org.testng.Assert.assertEquals(Assert.java:122)
	at org.testng.Assert.assertEquals(Assert.java:907)
	at org.testng.Assert.assertEquals(Assert.java:917)
	at org.apache.pulsar.tests.integration.io.sources.PulsarIOSourceRunner.getSourceStatus(PulsarIOSourceRunner.java:210)
	at org.apache.pulsar.tests.integration.io.sources.debezium.PulsarIODebeziumSourceRunner.lambda$testSource$0(PulsarIODebeziumSourceRunner.java:76)
	at net.jodah.failsafe.Functions.lambda$toSupplier$8(Functions.java:236)
	at net.jodah.failsafe.Functions.lambda$get$0(Functions.java:47)
	at net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.jav

@hangc0276
Contributor

move to 2.8.2.

@eolivelli
Contributor

@hangc0276 This is a new API, so it probably won't be delivered in 2.8.x. I suggest removing the 2.8.2 label.

We should not add new APIs in point releases.

@nlu90
Member

nlu90 commented Aug 11, 2021

/pulsarbot run-failure-checks

@nlu90 nlu90 force-pushed the wip_current_context branch from cc2a4d6 to 04d72d8 Compare August 18, 2021 01:57
@@ -62,12 +62,6 @@
<version>${kafka-client.version}</version>
</dependency>

<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>pulsar-client-original</artifactId>
Contributor

This is very interesting.
It looks like we no longer need to import pulsar-client-original.

Contributor

@eolivelli eolivelli left a comment

LGTM

@freeznet freeznet force-pushed the wip_current_context branch from 8b60fe1 to 350950a Compare August 19, 2021 16:08
@freeznet freeznet force-pushed the wip_current_context branch from 350950a to bcbcabf Compare August 20, 2021 01:05
@codelipenghui codelipenghui merged commit 12f6566 into apache:master Aug 20, 2021
codelipenghui pushed a commit that referenced this pull request Jan 25, 2022
Cherry pick #11293 

## Motivation
Debezium requires a Pulsar service URL for its database history.

In #11056, the service.url field of PulsarKafkaWorkerConfig is no longer available, and the value was also deleted from multiple YAML config files in that commit. This causes the integration test for the Debezium connector to fail.

Based on the Debezium paradigm, all configuration should be passed as strings. There is no easy way to inject a PulsarClient via configuration.

We need to ask the user to provide the Pulsar URL explicitly, and probably the auth info as well.

## Modifications
- Make the database.history.pulsar.service.url field required
- Add the config value back to the example YAML files
- Update the integration test config
michaeljmarshall pushed a commit that referenced this pull request Feb 11, 2022
…m connector (#12145) (#14040)

# Conflicts:
#	pulsar-io/debezium/core/src/main/java/org/apache/pulsar/io/debezium/DebeziumSource.java
#	pulsar-io/debezium/core/src/main/java/org/apache/pulsar/io/debezium/PulsarDatabaseHistory.java
#	tests/integration/src/test/java/org/apache/pulsar/tests/integration/io/sources/debezium/DebeziumMySqlSourceTester.java
#	tests/integration/src/test/java/org/apache/pulsar/tests/integration/io/sources/debezium/PulsarDebeziumSourcesTest.java

### Motivation

#11293 allows passing a client builder to the Debezium database history, but it still requires passing `database.history.pulsar.service.url` as well. With the client builder, `database.history.pulsar.service.url` is no longer used.
This PR fixes the logic and passes only the client builder, with no `database.history.pulsar.service.url` provided.

Cherry-pick #12145 into branch-2.8
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
7 participants