BR restore txn report ErrRestoreInvalidRange error, 'startKey > endKey, endKey 0000000000000000f7', when restore region's endKey is "" #56228
Comments
Please use LTS version |
I guess the problem is caused by tidb/br/pkg/task/restore_txn.go Line 91 in 0e7c06c
Here |
PTAL @Leavrth |
Thanks for your suggestion. We will upgrade as soon as we can.
I commented out this line and things work fine for me: it successfully restored the txn data. But I do not know the potential negative impact of removing the split code. Could there be any adverse effects? |
This line splits regions so that no single region's data size grows too large. If this line is commented out, the regions won't be split, and the data will be restored into one region. If the restored data size is not large, you can wait until the regions are split automatically. It's better to skip the empty tidb/br/pkg/task/restore_raw.go Lines 174 to 180 in 9785cdd
|
Hi @SonglinLife do you have time to fix this problem? |
Really sorry for the late reply. Yes, I do want to resolve this bug. Does it only need to ignore the last key? Recently, I dug into the BR project and read the code, and it is hard for me to understand. I also see some other bugs in the BR project, for example it retries the backup but doesn't reset the progress bar. Line 355 in 119e765
It really confused me at the beginning, because I saw the backup progress bar reach 100% but the backup didn't stop (it started a new round). I also struggle to figure out why the BR backup restarts rounds infinitely (starting 5 rounds). Can you give me some hints? I found in the BR code that it checks whether there is an incomplete range. If there is none, the main loop stops. Lines 216 to 221 in 119e765
Before starting the backup, it gets the range information via listdb. Lines 750 to 755 in 9dff38b
But I use TiKV and PD only, not with TiDB. So the first incomplete range is <"", "">. The BR tree data structure fills in the incomplete range when TiKV backs up a region successfully. So if the first successfully backed-up region is <a,b>, then the incomplete ranges will be <"", a> and <b, "">.
If TiKV has a gap between two adjacent regions, like I am a totally new user of TiDB, and it is really hard without your help. I really do want to improve the BR tools. |
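The incomplete-range bookkeeping described in the comment above can be sketched as follows. This is only an illustration and not BR's actual tree structure: keyRange and incompleteRanges are hypothetical names, keys are plain strings, an empty End means "unbounded", and every completed range is assumed to lie inside the full range.

```go
package main

import (
	"fmt"
	"sort"
)

// keyRange is a hypothetical half-open range [Start, End).
// An empty End means the range is unbounded on the right.
type keyRange struct{ Start, End string }

// incompleteRanges returns the parts of full that are not covered by done.
// Assumption: every range in done lies inside full.
func incompleteRanges(full keyRange, done []keyRange) []keyRange {
	sorted := append([]keyRange(nil), done...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })

	var gaps []keyRange
	cursor := full.Start // everything before cursor is already covered
	reachedEnd := false
	for _, r := range sorted {
		if r.Start > cursor {
			gaps = append(gaps, keyRange{cursor, r.Start})
		}
		if r.End == "" { // this range covers everything to the right
			reachedEnd = true
			break
		}
		if r.End > cursor {
			cursor = r.End
		}
	}
	if !reachedEnd && (full.End == "" || cursor < full.End) {
		gaps = append(gaps, keyRange{cursor, full.End})
	}
	return gaps
}

func main() {
	full := keyRange{"", ""}
	// Nothing backed up yet: the only incomplete range is <"", "">.
	fmt.Printf("%q\n", incompleteRanges(full, nil))
	// After <a,b> succeeds, <"", a> and <b, ""> remain.
	fmt.Printf("%q\n", incompleteRanges(full, []keyRange{{"a", "b"}}))
	// A hole between adjacent regions stays incomplete in every round.
	fmt.Printf("%q\n", incompleteRanges(full, []keyRange{{"", "a"}, {"b", ""}}))
}
```

In this sketch, a key span that no region ever reports as backed up keeps reappearing as an incomplete range, so a loop that stops only when the list is empty would keep starting new rounds.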
Only ignore the empty key (zero-length key). BR wants to split regions based on the For the remaining problems, please open separate issues. |
Thanks for your reply. Before I open a new issue, I will read the code to understand more details. In practice, we force the br txn backup to stop at round 2 and then restore txn, and it works fine on a prod TiKV cluster, but there must be something unusual. Yes, let's discuss it in another issue. I have also opened a new request for this issue, based on this discussion. |
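A minimal sketch of the suggestion above, assuming the split keys are held as a slice of byte slices. dropEmptyKeys is a hypothetical helper written for illustration, not a function from the BR code base.

```go
package main

import "fmt"

// dropEmptyKeys returns the split keys without zero-length entries.
// Hypothetical helper: an empty key stands for "unbounded" and is not a
// usable region boundary, so it is skipped before any encoding happens.
func dropEmptyKeys(keys [][]byte) [][]byte {
	kept := make([][]byte, 0, len(keys))
	for _, k := range keys {
		if len(k) == 0 {
			continue
		}
		kept = append(kept, k)
	}
	return kept
}

func main() {
	// The last range ends at "", as in the region reported in this issue.
	keys := [][]byte{[]byte("a"), []byte("m"), {}}
	for _, k := range dropEmptyKeys(keys) {
		fmt.Printf("%q\n", k)
	}
}
```

Filtering before encoding matters in this sketch because an encoded empty key is no longer zero-length and cannot be told apart by a length check afterwards.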
Yes, we use each range's end key as the new regions' boundary.
Actually, it retries based on the result of the backed-up ranges. In each round, it filters out the complete ranges. Lines 214 to 222 in 8bacf9c
The progress bar is imprecise: it is based on an approximate number of regions.
Some incomplete ranges are still not backed up. Maybe there is a lock or other reasons.
The region can be generated by
Therefore, there must be the region |
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
2. What did you expect to see? (Required)
restore txn always succeeds.
3. What did you see instead (Required)
restore txn failed with an error reporting startKey > endKey, where endKey was 0000000000000000f7. Due to the TiKV encoding rule, an empty byte slice is encoded as 0000000000000000f7. I also checked my TiKV cluster, and it did have a region whose endKey was "".
The same issue was also seen in "Restore txn kv fails and reports ErrRestoreInvalidRange" #52574. Although that issue was resolved by PR 80d4dec, I find that the br function SplitKeysAndScatter in br/pkg/restore/split/client.go (https://github.com/pingcap/tidb/blob/master/br/pkg/restore/split/client.go#L536-L566) also encodes the lastKey without checking whether the lastKey is an empty slice. It then calls PaginateScanRegion, which throws the error. I guess this is the same kind of issue as #52574.
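To illustrate why the encoded empty key trips the range check, here is a small sketch. encodeBytes is a simplified re-implementation of the memcomparable byte encoding written for this example (8-byte groups, zero padding, and a marker byte of 0xFF minus the pad count). It is not the BR code itself, and the key "a" is an arbitrary sample start key.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

const (
	groupSize = 8
	marker    = byte(0xFF)
)

// encodeBytes cuts data into 8-byte groups, pads the last group with
// zeros, and appends a marker byte (0xFF - padCount) after each group.
func encodeBytes(data []byte) []byte {
	out := make([]byte, 0, (len(data)/groupSize+1)*(groupSize+1))
	for idx := 0; idx <= len(data); idx += groupSize {
		remain := len(data) - idx
		padCount := 0
		if remain >= groupSize {
			out = append(out, data[idx:idx+groupSize]...)
		} else {
			padCount = groupSize - remain
			out = append(out, data[idx:]...)
			out = append(out, make([]byte, padCount)...)
		}
		out = append(out, marker-byte(padCount))
	}
	return out
}

func main() {
	start := encodeBytes([]byte("a"))
	end := encodeBytes([]byte{}) // empty end key, meant as "unbounded"

	fmt.Println(hex.EncodeToString(end)) // prints 0000000000000000f7

	// Once encoded, the end key is a real 9-byte key that sorts before
	// almost every other key, so a plain comparison sees start > end.
	fmt.Println(bytes.Compare(start, end) > 0) // prints true
}
```

An unencoded empty end key can be recognized with a length check, while its encoded form cannot, which is why the check has to happen before encoding.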
report error:
4. What is your TiDB version? (Required)
Only TiKV and PD (v5.0.6)
br (v8.4.0-nightly)