Tiering issue, hitting the bucket_alloc_failed tracepoint #768
bcachefs fs usage output? |
I think I need more context - is copygc making progress? |
Here is fs usage:
I think copygc is stuck: "Pending rebalance work" has been stable for a long time and across reboots. It is also strange that most of the SSD data shows up as user rather than cached. Here is the rebalance_work btree; keys and bfloat-failed appear empty:
|
Or is it just the accounting that's stuck? I don't think pending_rebalance was reset the last time I ran fsck, though. I also have broken keys in that accounting btree (#756). |
copygc does trigger on the same device:
|
bcachefs:rebalance_extent never triggers
|
I have worked around this by running bcachefs device evacuate and then bcachefs device set-state rw again (I needed the filesystem). Tiering still needs to be able to self-heal on its own, but the immediate issue is gone for me. It should be reproducible by filling a foreground target with data before adding the background target, although here the imbalance arose organically. fs usage:
|
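For anyone hitting the same state, the workaround above (evacuate the affected device, then set it back to rw) could be sketched roughly as follows. This is only an illustration, not a tested procedure: the helper name `evacuate_and_readd`, the `RUN` switch and the `/dev/sdX` placeholder are made up for this example, and the argument order for `set-state` is an assumption that should be checked against `bcachefs device set-state --help` for your bcachefs-tools version. By default the helper only prints the commands.

```shell
# Hypothetical helper (not part of bcachefs-tools): prints the two workaround
# commands by default; set RUN=1 to actually execute them.
# Assumption: set-state takes "<new-state> <device>"; verify with --help.
evacuate_and_readd() {
    dev="$1"
    for cmd in "bcachefs device evacuate $dev" "bcachefs device set-state rw $dev"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Word-splitting of $cmd is intentional: it is a simple command line.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

# /dev/sdX is a placeholder for the affected foreground (SSD) device.
evacuate_and_readd /dev/sdX
```

Running with `RUN=1` stops at the first command that fails, so the device is not set back to rw if the evacuate step did not complete.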
Here is the tracepoint output, showing issues writing to the promote device
(from perf trace -e 'bcachefs:*')
(perf top -g is confirming this is when trying to cache)
bcachefs show-super:
Counters (this mount):
Counters (since creation):