rfc: add dumpling, a data exporting tool #123
### Name

The initial motivation of this tool is to supplement Lightning.
We call the new tool "Dumpling" as a portmanteau of "dump" + "Lightning".
TiDumpling might help with both searchability and understanding. There are already open source projects named dumpling.
Naming is hard 🙃. We could also specify the official name as TiDB Dumpling (like TiDB Lightning and TiDB Binlog).
### Programming language

We'd like to embed Dumpling into TiDB (as an `EXPORT` statement) and DM
Once the new physical backup is released, it seems that the only use case for using this against TiDB is to export data to another database (MySQL). In the case of an immediate restore into MySQL the export from TiDB probably won't add convenience because one already has to run an external tool to load the data into MySQL.
The TiDB node that runs this might essentially need to be considered offline if the backup process uses up all its resources. The value proposition would then be that it is easier to deploy an additional TiDB node than to deploy a new tool. That only holds if the resource requirements of the backup are the same as TiDB's. If they are bigger, this won't work. If they are smaller and the TiDB node will still serve requests, wrapping the tool as a subprocess could still be a good idea to better isolate the backup workload.
Unintelligent load balancing between TiDB nodes could easily cause the node doing the backup to become overworked.
In contrast, the new physical BR tool run from inside TiDB would only perform meta operations from TiDB, with most of the actual backup work done from TiKV: this should leave most resources still available for the TiDB node.
This paragraph explains why we chose Go over other languages. Even if we don't want `EXPORT`, we still need integration with DM.
And given that we're going to have `IMPORT` with Lightning, it is natural to support `EXPORT` as well.
The `EXPORT` statement is not meant to replace BR. BR will be given its own `BACKUP` and `RESTORE` statements.
Import with Lightning will have some similar deployment issues since it will use a great deal of CPU. However, one of the main use cases is to import when a cluster is first created and no useful queries can be run until the import is complete.
True. The `IMPORT` and `EXPORT` statements allow DBAs to manage logical backups via the SQL interface for familiarity. The individual executables are still available though.
Anyway these are getting off-topic.
LGTM
LGTM
This RFC is written for the PingCAP Special Week 2019 Q4 ("Tools Matter") item "MySQL Full-Export Tool Dumpling (replacing Mydumper), Integrating to DM". Tracking issue is #122.
🖼 Rendered