Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kafka-14748: Relax non-null FK left-join requirement #14107

Merged
merged 8 commits into from
Dec 6, 2023

Conversation

florin-akermann
Copy link
Contributor

Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@florin-akermann florin-akermann marked this pull request as draft July 26, 2023 21:36
@florin-akermann florin-akermann marked this pull request as ready for review July 26, 2023 21:36
@florin-akermann florin-akermann marked this pull request as draft July 26, 2023 21:40
@florin-akermann florin-akermann marked this pull request as ready for review July 26, 2023 21:44
@mjsax mjsax added the streams label Jul 26, 2023
@florin-akermann florin-akermann marked this pull request as draft August 11, 2023 19:05
@florin-akermann florin-akermann force-pushed the KAFKA-14748 branch 3 times, most recently from 19dfb4b to fdf411c Compare August 17, 2023 21:13
@florin-akermann florin-akermann force-pushed the KAFKA-14748 branch 3 times, most recently from e51c5cc to 47a84fc Compare November 11, 2023 22:56
@florin-akermann florin-akermann marked this pull request as ready for review November 11, 2023 23:00
@wcarlson5
Copy link
Contributor

I'm going to start my review of this today. Hopefully can be actionable sometime tomorrow.

Copy link
Contributor

@wcarlson5 wcarlson5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments

@@ -198,6 +184,21 @@ <h3><a id="streams_api_changes_360" href="#streams_api_changes_360">Streams API
</code>
</pre>
</p>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for remembering the docs update!


private void forward(final Record<K, Change<V>> record, final KO foreignKey, final Instruction deleteKeyNoPropagate) {
final SubscriptionWrapper<K> wrapper = new SubscriptionWrapper<>(
hash(record),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many times is this record rehased?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are branches where it is hashed twice. I puhsed a change to cache the hash value.

instruction = PROPAGATE_ONLY_IF_FK_VAL_AVAILABLE;
}
final KO newForeignKey = foreignKeyExtractor.apply(record.value().newValue);
final KO newForeignKey = record.value().newValue == null ? null : foreignKeyExtractor.apply(record.value().newValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems redundant with line 165. That check the newValue is not null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed!, thanks, removed.

@florin-akermann
Copy link
Contributor Author

Just a couple of comments

Thank you @wcarlson5 for the comments.
I pushed according changes.

While I have your attention could we revisit the adjutments done to the optimization 'drop null key records on repartitioning' as part of #14174 (comment).
To me it seems odd to apply the optimization in some cases and not in others.
E.g. as described in the linked comment, there could be this confusing case where null-key records don't get propagated because of a explicit repartition call at the end of the topology.

I was wondering whether it would make sense to adjust the optimization from 'filter null-key records in repartition nodes if no left or outer-join is downstream' to 'filter null-key records in repartition nodes if no left or outer-join node is part of this branch of the topology (aka no left or outer-join is up- or downstream of this repartition node)'

@wcarlson5
Copy link
Contributor

I'm not sure, either option for the optimization is fine with me as long as it's well documented.

I'm good with how the PR is for now. I'm going to merge it to get it in before feature freeze for 3.7

@wcarlson5 wcarlson5 merged commit 4a958c6 into apache:trunk Dec 6, 2023
1 check failed
ex172000 pushed a commit to ex172000/kafka that referenced this pull request Dec 15, 2023
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <[email protected]>
gaurav-narula pushed a commit to gaurav-narula/kafka that referenced this pull request Jan 24, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <[email protected]>
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <[email protected]>
AnatolyPopov pushed a commit to aiven/kafka that referenced this pull request Feb 16, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <[email protected]>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
Relax non-null FK left-join requirement.

Testing Strategy: Inject extractor which returns null on first or second element.

Reviewers: Walker Carlson <[email protected]>
if (oldForeignKey != null && !Arrays.equals(serialize(newForeignKey), serialize(oldForeignKey))) {
forward(record, oldForeignKey, DELETE_KEY_AND_PROPAGATE);
}
forward(record, newForeignKey, PROPAGATE_NULL_IF_NO_FK_VAL_AVAILABLE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@florin-akermann florin-akermann May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mjsax omg, thanks for the flag! Looks like @AyoubOm is addressing it in https://issues.apache.org/jira/browse/KAFKA-16394 already? Else i'll adress it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants