TTL Job keeps running if the TTL is disabled after losing heartbeat #57404

Closed
YangKeao opened this issue Nov 15, 2024 · 1 comment · Fixed by #57452
Labels
affects-6.5: This bug affects the 6.5.x (LTS) versions.
affects-7.1: This bug affects the 7.1.x (LTS) versions.
affects-7.5: This bug affects the 7.5.x (LTS) versions.
affects-8.1: This bug affects the 8.1.x (LTS) versions.
affects-8.5: This bug affects the 8.5.x (LTS) versions.
severity/major
sig/sql-infra: SIG: SQL Infra
type/bug: The issue is confirmed as a bug.

Comments

@YangKeao
Member

YangKeao commented Nov 15, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Assign a TTL Job to a TiDB node.
  2. Restart the TiDB node.
  3. Before a new owner is assigned, stop the TTL job by running set global tidb_ttl_job_enable = 'OFF'.

Then this TTL job will keep running.

2. What did you expect to see? (Required)

The TTL job can be cancelled.

3. What did you see instead (Required)

The TTL job status stays at running, and the job cannot be cancelled.

4. What is your TiDB version? (Required)

3c70a28

@YangKeao YangKeao added the type/bug The issue is confirmed as a bug. label Nov 15, 2024
@YangKeao
Member Author

YangKeao commented Nov 15, 2024

Here is a test case:

func TestDisableTTLAfterLoseHeartbeat(t *testing.T) {
	store, dom := testkit.CreateMockStoreAndDomain(t)
	waitAndStopTTLManager(t, dom)
	tk := testkit.NewTestKit(t, store)

	sessionFactory := sessionFactory(t, store)
	se := sessionFactory()

	tk.MustExec("use test")
	tk.MustExec("CREATE TABLE t (id INT PRIMARY KEY, created_at DATETIME) TTL = created_at + INTERVAL 1 HOUR")
	testTable, err := dom.InfoSchema().TableByName(context.Background(), pmodel.NewCIStr("test"), pmodel.NewCIStr("t"))
	require.NoError(t, err)

	ctx := context.Background()
	m1 := ttlworker.NewJobManager("test-ttl-job-manager-1", nil, store, nil, nil)
	require.NoError(t, m1.InfoSchemaCache().Update(se))
	require.NoError(t, m1.TableStatusCache().Update(ctx, se))

	now := se.Now()
	_, err = m1.LockJob(context.Background(), se, m1.InfoSchemaCache().Tables[testTable.Meta().ID], now, uuid.NewString(), false)
	require.NoError(t, err)
	tk.MustQuery("select current_job_status from mysql.tidb_ttl_table_status").Check(testkit.Rows("running"))

	// Lose the heartbeat: simulate m1 not updating the heartbeat for 8 hours.
	now = now.Add(time.Hour * 8)

	// stop the tidb_ttl_job_enable
	tk.MustExec("set global tidb_ttl_job_enable = 'OFF'")
	defer tk.MustExec("set global tidb_ttl_job_enable = 'ON'")

	// reschedule and try to get the job
	m2 := ttlworker.NewJobManager("test-ttl-job-manager-2", nil, store, nil, nil)
	// refresh m2's caches (not m1's) so that m2 sees the table and the job left behind by m1
	require.NoError(t, m2.InfoSchemaCache().Update(se))
	require.NoError(t, m2.TableStatusCache().Update(ctx, se))
	m2.RescheduleJobs(se, now)

	// the job should have been cancelled
	tk.MustQuery("select current_job_status from mysql.tidb_ttl_table_status").Check(testkit.Rows("<nil>"))
}

@YangKeao YangKeao self-assigned this Nov 15, 2024
@YangKeao YangKeao added sig/sql-infra SIG: SQL Infra severity/major affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. labels Nov 15, 2024
@ti-chi-bot ti-chi-bot bot added may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Nov 15, 2024
@YangKeao YangKeao added affects-6.5 This bug affects the 6.5.x(LTS) versions. and removed may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Nov 15, 2024