admin check failed with “Error 8223 (HY000): data inconsistency in table” after kill pd leader during adding index #55488

Closed
Lily2025 opened this issue Aug 19, 2024 · 2 comments · Fixed by #55506
Labels: affects-8.3, component/ddl, severity/critical, type/bug

Comments

@Lily2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

1. Set tidb_enable_dist_task='off'.
2. Run sysbench.
3. Add an index to one of the tables.
4. Kill the PD leader while the index is being added.
5. Run admin check index after the add index job finishes.

operatorLogs:
[2024-08-18 19:11:46] ###### start adding index
ALTER TABLE sbtest1 ADD INDEX index_test_1723979506277(c)
[2024-08-18 19:11:46] ###### wait for ddl job finish
[2024-08-18 19:16:46] ###### ddl job finished
select job_id, db_name, table_name, job_type, create_time, start_time, end_time, state, query from information_schema.ddl_jobs where query = 'ALTER TABLE sbtest1 ADD INDEX index_test_1723979506277(c)'
jobId: 381, job type: add index /* ingest */, state: synced
add index done, it takes: 5m0.343907361s
[2024-08-18 19:16:46] ###### start admin check
admin check index sbtest1 index_test_1723979506277

2. What did you expect to see? (Required)

admin check succeeds

3. What did you see instead (Required)

Data inconsistency after killing the PD leader while the index was being added:

admin check failed
Error 8223 (HY000): data inconsistency in table: sbtest1, index: index_test_1723979506277, handle: 53654296, index-values:"" != record-values:"handle: 53654296, values: [KindString 29259679273-41443820494-67699624372-21652956621-69679967231-62641758853-72167802605-31092810200-79525733868-93118816928]"

4. What is your TiDB version? (Required)

./tidb-server -V
Release Version: v8.3.0
Edition: Community
Git Commit Hash: 6eba67e
Git Branch: HEAD
UTC Build Time: 2024-08-16 10:02:25
GoVersion: go1.21.10
Race Enabled: false
Check Table Before Drop: false
Store: unistore
2024-08-18T19:11:36.561+0800

@Lily2025 added the type/bug label Aug 19, 2024
@Lily2025
Author

/severity critical

@tangenta
Contributor

Introduced by #54149.

2024-08-18 19:13:17
{"pod":"tc-tidb-1","log":"[backfilling.go:461] [\"load table ranges from PD\"] [category=ddl] [physicalTableID=245] [\"start key\"=7480000000000000f55f72800000000332b318] [\"end key\"=7480000000000000f55f72800000000445676800]","container":"tidb","namespace":"endless-ha-test-add-index-tps-7552122-1-530","level":"INFO"}
2024-08-18 19:15:08
{"pod":"tc-tidb-1","log":"[backfilling.go:461] [\"load table ranges from PD\"] [category=ddl] [physicalTableID=245] [\"start key\"=7480000000000000f55f72800000000332b31800] [\"end key\"=7480000000000000f55f72800000000445676800]","container":"tidb","namespace":"endless-ha-test-add-index-tps-7552122-1-530","level":"INFO"}
2024-08-18 19:15:08
{"pod":"tc-tidb-1","log":"[backfilling.go:482] [\"load table ranges from PD done\"] [category=ddl] [physicalTableID=245] [\"range start\"=7480000000000000f55f72800000000332b31800] [\"range end\"=7480000000000000f55f72800000000445676800] [\"range count\"=43]","container":"tidb","namespace":"endless-ha-test-add-index-tps-7552122-1-530","level":"INFO"}

The log shows that "load table ranges from PD" failed once. Normally each load is followed by a "done" entry, but the first load at 19:13:17 has none. The range start also changed between the two loads, which indicates that the second load was not a simple retry.

On the next DDL job retry, the start key changed from 7480000000000000f55f72800000000332b318 to 7480000000000000f55f72800000000332b31800 (with 00 appended).
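To make the 00 suffix concrete: in bytewise key ordering, appending a single zero byte to a key yields the smallest key that sorts strictly after it, which is a common way to turn an already-processed key into an exclusive resume point. The Go sketch below checks that the second start key in the log is exactly that successor of the first. `nextKey` is a hypothetical helper written for this illustration; it is not taken from the TiDB code base.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// nextKey is a hypothetical helper: it returns k with one 0x00 byte appended,
// which is the smallest key that sorts strictly after k in bytewise order.
func nextKey(k []byte) []byte {
	out := make([]byte, len(k)+1) // the extra trailing byte is already zero
	copy(out, k)
	return out
}

func main() {
	// Start keys copied from the two "load table ranges from PD" log entries.
	first, err := hex.DecodeString("7480000000000000f55f72800000000332b318")
	if err != nil {
		panic(err)
	}
	second, err := hex.DecodeString("7480000000000000f55f72800000000332b31800")
	if err != nil {
		panic(err)
	}

	// The second start key is the immediate successor of the first one.
	fmt.Println(bytes.Equal(nextKey(first), second)) // prints: true

	// A range starting at the second key therefore excludes the first key itself.
	fmt.Println(bytes.Compare(first, second) < 0) // prints: true
}
```

A scan that starts at the second key therefore never visits the row stored at the first key.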

{"pod":"tc-tidb-1","log":"[checkpoint.go:357] [\"resume checkpoint\"] [category=ddl-ingest] [jobID=381] [indexIDs=\"[16]\"] [\"flushed key low watermark\"=7480000000000000f55f72800000000332b318] [\"imported key low watermark\"=7480000000000000f55f72800000000332b318] [\"physical table ID\"=245] [\"previous instance\"=tc-tidb-1.tc-tidb-peer.endless-ha-test-add-index-tps-7552122-1-530.svc:4000:/var/lib/tidb-data] [\"current instance\"=tc-tidb-1.tc-tidb-peer.endless-ha-test-add-index-tps-7552122-1-530.svc:4000:/var/lib/tidb-data] [\"folder is empty\"=false]","container":"tidb","namespace":"endless-ha-test-add-index-tps-7552122-1-530","level":"INFO"}

When resuming, the checkpoint manager sets the next key to a wrong value.
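The watermark key in the "resume checkpoint" entry can be tied back to the admin check error by decoding it. The following Go sketch assumes TiDB's documented record key layout for integer handles, `t{tableID}_r{handle}`, where both integers are stored as 8 big-endian bytes with the sign bit flipped. `decodeRowKey` is a hypothetical helper for this illustration, not a TiDB function.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// signMask flips the sign bit, undoing the order-preserving int64 encoding
// assumed for table IDs and integer handles.
const signMask = uint64(1) << 63

// decodeRowKey is a hypothetical helper that parses a hex-encoded record key
// of the assumed form 't' + tableID(8 bytes) + "_r" + handle(8 bytes).
// Any bytes after the handle (such as an appended 0x00) are ignored.
func decodeRowKey(hexKey string) (tableID, handle int64, err error) {
	b, err := hex.DecodeString(hexKey)
	if err != nil {
		return 0, 0, err
	}
	if len(b) < 19 || b[0] != 't' || b[9] != '_' || b[10] != 'r' {
		return 0, 0, fmt.Errorf("not an integer-handle record key: %s", hexKey)
	}
	tableID = int64(binary.BigEndian.Uint64(b[1:9]) ^ signMask)
	handle = int64(binary.BigEndian.Uint64(b[11:19]) ^ signMask)
	return tableID, handle, nil
}

func main() {
	// "flushed key low watermark" from the resume checkpoint log entry.
	tableID, handle, err := decodeRowKey("7480000000000000f55f72800000000332b318")
	if err != nil {
		panic(err)
	}
	fmt.Println(tableID, handle) // prints: 245 53654296
}
```

The decoded table ID matches physicalTableID=245 in the logs, and the decoded handle, 53654296, is the same handle that admin check reports as missing from the index.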
