PingCAP Special Week 2019 Q4: Dumpling / Mydumper replacement for DM integration and Lightning performance #122

Closed
1 of 13 tasks
kennytm opened this issue Dec 6, 2019 · 1 comment

kennytm commented Dec 6, 2019

Dumpling / Mydumper replacement for DM integration and Lightning performance

Full RFC at #123.

Abstract

We propose introducing Dumpling, a library to replace Mydumper. It is optimized for TiDB Lightning and is usable as a library/plugin inside DM and TiDB, as well as a standalone program.

Problem statement

Mydumper is a third-party tool that dumps MySQL databases to the local filesystem as SQL dumps. TiDB Lightning relies on Mydumper's output for importing into TiDB, and DM embeds Mydumper to quickly extract data from upstream.

Using Mydumper in the TiDB ecosystem has the following problems:

  • as a third-party tool, it does not match our development pace
  • Mydumper is licensed under GPLv3, which is not compatible with TiDB's license (Apache 2.0)

Therefore, we would like to replace Mydumper with our own tool and develop new features on top of it, such as:

  • create a custom output format to reduce parsing effort and speed up Lightning
  • support dumping directly to cloud storage

Success criteria

  1. Replacement. Create a Go module with a CLI front-end which supports the Mydumper features required by DM and Lightning.

    • Resulting data are sorted by primary key
    • SQL files are split into sizes close to the configured value
    • Single tables can be dumped in parallel, if a primary key or unique btree key exists
    • Consistency: dumping a snapshot instead of live data (either by acquiring a read lock or by ignoring new updates)
    • E2E test succeeds
    • Performance matching Mydumper
  2. Extension. Implement features which further help the ecosystem.

    • Support an easy-to-decode output format
    • Support directly writing to cloud storage
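
To make the "single tables can be dumped in parallel" criterion concrete, here is a minimal sketch of one possible approach: split an integer primary-key span into contiguous ranges and issue one ordered SELECT per range. This is not taken from the RFC. The names `chunkRange`, `splitByKey`, and `chunkQuery`, the table `t`, and the column `id` are all hypothetical, and the sketch assumes a single integer key with roughly uniform distribution.

```go
package main

import "fmt"

// chunkRange is a half-open primary-key interval [Lo, Hi).
// Hypothetical type, used only for this illustration.
type chunkRange struct {
	Lo, Hi int64
}

// splitByKey divides the inclusive key span [minKey, maxKey] into at most n
// contiguous half-open ranges of near-equal width. It assumes the span is
// small enough that maxKey-minKey+1 does not overflow int64.
func splitByKey(minKey, maxKey int64, n int) []chunkRange {
	if n < 1 || maxKey < minKey {
		return nil
	}
	total := maxKey - minKey + 1
	if int64(n) > total {
		n = int(total) // never produce empty ranges
	}
	step := total / int64(n)
	rem := total % int64(n)
	ranges := make([]chunkRange, 0, n)
	lo := minKey
	for i := 0; i < n; i++ {
		width := step
		if int64(i) < rem {
			width++ // spread the remainder over the first ranges
		}
		ranges = append(ranges, chunkRange{Lo: lo, Hi: lo + width})
		lo += width
	}
	return ranges
}

// chunkQuery renders the SELECT for one range. Ordering by the key keeps each
// output chunk sorted, so concatenating chunks in order stays sorted overall.
func chunkQuery(table, key string, r chunkRange) string {
	return fmt.Sprintf(
		"SELECT * FROM `%s` WHERE `%s` >= %d AND `%s` < %d ORDER BY `%s`",
		table, key, r.Lo, key, r.Hi, key,
	)
}

func main() {
	// Each query could be handed to a separate worker goroutine.
	for _, r := range splitByKey(1, 10, 3) {
		fmt.Println(chunkQuery("t", "id", r))
	}
}
```

Because the ranges are contiguous and non-overlapping, every row is dumped exactly once, provided all workers read from the same consistent snapshot. A skewed key distribution would produce uneven chunks, so a real implementation would likely need a different splitting strategy.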

TODO list

Phase 1: Essentials — 8000 points total

Phase 2: Features — 2000 points total

Difficulty

  • (Mixed)

Score

  • 10000

Mentor(s)

Recommended skills

  • Go language
  • Software architecture (structuring and API design)
  • Task scheduling strategy for parallel programs

@tisonkun

Closed as passed.
