Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cdc): auto schema change for mysql cdc #17876

Merged
merged 93 commits into from
Aug 20, 2024
Merged

Conversation

StrikeW
Copy link
Contributor

@StrikeW StrikeW commented Jul 31, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR finished the end to end process of auto schema change for mysql:

  • Source parser sends the schema change event to source executor, and waiting for notification from the executor
  • Source executor sends the new version table schema to Meta
  • Meta will randomly pick a frontend node and send RPC to get the ReplaceTablePlan for the table
  • Meta initiates and finishes the schema change procedure and response to the source executor
  • Source executor notify the completion of schema change to parser, so that parser can continuous parsing following messages
  • Add auto.schema.change to source option to allow user enable the feature

TODO:
- Add a WITH option to allow user enable this feature on demand
- License check

  • integration test
  • Fill the newly add cdc_table_name field in Table catalog
  • Note down the failed schema event into event log

#17643

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

Support replicating DDL for MySQL CDC source. User can enable the feature by setting auto.schema.change = 'true' to the source option.

  • currently only support ALTER TABLE ADD/DROP column

@StrikeW
Copy link
Contributor Author

StrikeW commented Aug 16, 2024

Hi @hzxa21 @BugenZhao @yezizp2012, PTAL. Let's merge the functionality first and I have some followup to enhance the observability later.

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM.

There are also a few follow-ups:

  1. improve observability
  2. handle the race mentioned in this comment
  3. more sophistic test and load test with all kinds of schema change events issued from upstream (including ALTER ADD/DROP COLUMN, ALTER RENAME, CREATE TABLE, DROP TABLE)

1 will be addressed by #18060. Any plans for 2 and 3?

proto/common.proto Outdated Show resolved Hide resolved
src/stream/src/task/stream_manager.rs Outdated Show resolved Hide resolved
src/meta/src/controller/catalog.rs Outdated Show resolved Hide resolved
@StrikeW
Copy link
Contributor Author

StrikeW commented Aug 19, 2024

Rest LGTM.

There are also a few follow-ups:

  1. improve observability
  2. handle the race mentioned in this comment
  3. more sophistic test and load test with all kinds of schema change events issued from upstream (including ALTER ADD/DROP COLUMN, ALTER RENAME, CREATE TABLE, DROP TABLE)

1 will be addressed by #18060. Any plans for 2 and 3?

2 will also be addressed by #18060, which assigns a unique cdc_table_id to Table. More test cases will be added after #18060 get merge.

@StrikeW StrikeW enabled auto-merge August 19, 2024 13:36
@StrikeW
Copy link
Contributor Author

StrikeW commented Aug 20, 2024

After merge main, the risingwave_simulation::integration_tests sink::scale::test_sink_scale case fails. I find that if I revert #18061 in the branch, the case can pass. @BugenZhao could you review this PR again?

@BugenZhao
Copy link
Member

After merge main, the risingwave_simulation::integration_tests sink::scale::test_sink_scale case fails. I find that if I revert #18061 in the branch, the case can pass. @BugenZhao could you review this PR again?

I don't think we can confirm they are related. Any changes on the codebase can affect the randomness or reproducibility of an issue under madsim.

@StrikeW StrikeW disabled auto-merge August 20, 2024 06:35
@StrikeW StrikeW enabled auto-merge August 20, 2024 07:05
@StrikeW StrikeW force-pushed the siyuan/cdc-handle-schema-event branch from 384a41f to e51b6a8 Compare August 20, 2024 07:21
@StrikeW StrikeW disabled auto-merge August 20, 2024 07:22
@StrikeW StrikeW force-pushed the siyuan/cdc-handle-schema-event branch from e51b6a8 to d003d2b Compare August 20, 2024 07:23
@StrikeW StrikeW added this pull request to the merge queue Aug 20, 2024
Merged via the queue into main with commit 26a5721 Aug 20, 2024
31 of 34 checks passed
@StrikeW StrikeW deleted the siyuan/cdc-handle-schema-event branch August 20, 2024 09:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants