bug: compaction task stuck for hours #10209

Closed
Tracked by #6640
hzxa21 opened this issue Jun 7, 2023 · 3 comments

@hzxa21
Collaborator

hzxa21 commented Jun 7, 2023

Describe the bug

Recently we have seen a small cg3 (MV compaction group) L0->L0 compaction task get stuck for hours, which blocks L0->base level compaction for the corresponding compaction group. Some findings:

  • The stuck compaction task contains 60 SSTs with a total size of ~13MB, so it is a relatively small task.
  • The compactor is alive and can handle tasks from both the same and different compaction groups as usual.
  • The compactor logged "Ready to handle compaction task" but never "Finished compaction task", so it received the task but did not finish it.
  • The task progress and heartbeat of the stuck task are continuously reported to the meta node. Because a guard clears the task progress whenever the task finishes or errors out, the task must indeed be stuck on the compactor side.
  • The "Compacting SSTable Count" metric reported by the meta node stays at 60 for cg3 level 0, which means meta never received a ReportCompactionTasksRequest from the compactor for this task.
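
To illustrate the guard reasoning in the fourth finding, here is a minimal sketch of a drop guard that clears a task's progress entry whenever the task body returns, errors out, or unwinds. All names here (`ProgressMap`, `TaskProgressGuard`, `run_task`) are hypothetical and are not taken from the RisingWave code base; the point is only that, with such a guard, a progress entry that keeps being reported implies the task body has neither returned nor been dropped.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical shared table of in-flight task progress, keyed by task id.
type ProgressMap = Arc<Mutex<HashMap<u64, u64>>>;

/// Hypothetical guard: registers a progress entry on creation and removes it on drop.
struct TaskProgressGuard {
    task_id: u64,
    progress: ProgressMap,
}

impl TaskProgressGuard {
    fn new(task_id: u64, progress: ProgressMap) -> Self {
        progress.lock().unwrap().insert(task_id, 0);
        Self { task_id, progress }
    }
}

impl Drop for TaskProgressGuard {
    fn drop(&mut self) {
        // Runs on normal return, on early error return, and during panic unwinding.
        // A poisoned lock is ignored so that drop itself never panics.
        if let Ok(mut map) = self.progress.lock() {
            map.remove(&self.task_id);
        }
    }
}

/// Stand-in for a compaction task body; `fail` simulates the error path.
fn run_task(task_id: u64, progress: ProgressMap, fail: bool) -> Result<(), String> {
    let _guard = TaskProgressGuard::new(task_id, progress.clone());
    if fail {
        return Err("compaction failed".to_string());
    }
    // Record some progress (for example, the number of SSTs processed).
    *progress.lock().unwrap().get_mut(&task_id).unwrap() += 60;
    Ok(())
}
```

In this sketch the entry for a task disappears as soon as `run_task` exits on any path, so an entry that stays present for hours can only belong to a task whose body is still live.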

To Reproduce

No response

Expected behavior

No response

Additional context

image
image
image

log.csv

@hzxa21 hzxa21 added the type/bug Something isn't working label Jun 7, 2023
@github-actions github-actions bot added this to the release-0.20 milestone Jun 7, 2023
@hzxa21
Collaborator Author

hzxa21 commented Jun 7, 2023

cc @Li0k @Little-Wallace

@Little-Wallace
Contributor

We have temporarily fixed this issue by cancelling long-running tasks in #10183,
but we have not yet found why the task hung.
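
As a rough illustration of cancelling long-running tasks, the sketch below selects tasks whose wall-clock runtime exceeds a bound. It is not taken from #10183; `TaskState`, `tasks_to_cancel`, and `max_runtime` are hypothetical names. A runtime bound is used here because, as described in this issue, the stuck task kept sending heartbeats, so a heartbeat timeout alone would not have caught it.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical bookkeeping for one in-flight compaction task.
struct TaskState {
    started_at: Instant,
}

/// Returns the ids of tasks that have run longer than `max_runtime` as of `now`.
/// The result is sorted so the output does not depend on map iteration order.
fn tasks_to_cancel(
    tasks: &HashMap<u64, TaskState>,
    now: Instant,
    max_runtime: Duration,
) -> Vec<u64> {
    let mut expired: Vec<u64> = tasks
        .iter()
        // `saturating_duration_since` yields zero if `started_at` is after `now`.
        .filter(|(_, t)| now.saturating_duration_since(t.started_at) > max_runtime)
        .map(|(id, _)| *id)
        .collect();
    expired.sort_unstable();
    expired
}
```

A periodic checker could call `tasks_to_cancel` with the current time and cancel each returned task, which unblocks the compaction group even when the root cause of the hang is unknown.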

@Li0k
Contributor

Li0k commented Jul 19, 2023

We have temporarily fixed this issue by cancelling long-running tasks in #10183, but we have not yet found why the task hung.

Might this be fixed by #10584?

@Li0k Li0k closed this as completed Aug 8, 2023