[6.12] FS fails to mount due to "filesystem marked as clean but have deleted inode" #786

Closed
dominikpaulus opened this issue Nov 25, 2024 · 9 comments

Comments

@dominikpaulus

dominikpaulus commented Nov 25, 2024

Since I upgraded to 6.12, my bcachefs (root) filesystem repeatedly fails to mount at system bootup because of "filesystem marked as clean but have deleted inode" errors at mount time (see below for details). The failures correlate very closely with the update to 6.12.

I then reboot into a separate system (also with kernel 6.12), run fsck, unmount the FS again, and can then successfully boot into my root-on-bcachefs system. However, if I then use the system for some time and reboot, the same issue happens again and I see the same failures at mount time.

All of this is after a clean shutdown/umount.

The setup is rather simple, nothing too funky: a notebook running NixOS on a single-device bcachefs root (on LVM + dm-crypt). I enabled snapper for automatic hourly snapshots in the background. Speculation: this may only occur after snapper has created a new snapshot in the background. Is it worth ruling that out as a potential root cause?

Below is one instance of this happening. I used my bcachefs filesystem for some time, then rebooted into a different system. I then dumped the journal, tried to mount the FS (it did indeed fail to mount), and then ran fsck:

bcachefs list_journal /dev/root/nixos:

starting version 1.13: inode_has_child_snapshots opts=ro,errors=continue,nopromote_whole_extents,degraded,very_degraded,fix_errors=yes,nochanges,norecovery,noexcl,read_only
recovering from clean shutdown, journal seq 3899427
journal read done, replaying entries 3899427-3899427
journal entry     3899427
  version         1037
  last seq        3899427
  flush           1
  written at      0:553:471 (sector 288215)
    btree_keys: 
    btree_root: btree=extents l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 604de214efdff9c3 written 503 min_key POS_MIN durability: 1 ptr: 0:124993:0 gen 6
    btree_root: btree=inodes l=2 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq f4356f0f771000f6 written 352 min_key POS_MIN durability: 1 ptr: 0:105104:0 gen 13
    btree_root: btree=dirents l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 89e240f307cc7077 written 357 min_key POS_MIN durability: 1 ptr: 0:124849:0 gen 8
    btree_root: btree=xattrs l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 9d43522a6e2a7d7b written 264 min_key POS_MIN durability: 1 ptr: 0:51889:0 gen 15
    btree_root: btree=alloc l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq c341b3c8ef6a5ec5 written 229 min_key POS_MIN durability: 1 ptr: 0:125047:0 gen 6
    btree_root: btree=reflink l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 28de2645c4d84ad3 written 189 min_key POS_MIN durability: 1 ptr: 0:114925:0 gen 7
    btree_root: btree=subvolumes l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 50c7593e54cb2379 written 204 min_key POS_MIN durability: 1 ptr: 0:2573:0 gen 0
    btree_root: btree=snapshots l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq e293b0eb71597f3c written 18 min_key POS_MIN durability: 1 ptr: 0:117374:0 gen 9
    btree_root: btree=lru l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 56060b5849e80a30 written 181 min_key POS_MIN durability: 1 ptr: 0:120511:0 gen 5
    btree_root: btree=freespace l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 5693fa7ba8cca1d7 written 102 min_key POS_MIN durability: 1 ptr: 0:116814:0 gen 9
    btree_root: btree=need_discard l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 1d13f70a167cce19 written 370 min_key POS_MIN durability: 1 ptr: 0:102552:0 gen 11
    btree_root: btree=backpointers l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 926c2eb97d1ea31d written 386 min_key POS_MIN durability: 1 ptr: 0:101652:0 gen 14
    btree_root: btree=bucket_gens l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 26f7a15177fd2733 written 301 min_key POS_MIN durability: 1 ptr: 0:114564:0 gen 7
    btree_root: btree=snapshot_trees l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 3ffb292fe1001826 written 26 min_key POS_MIN durability: 1 ptr: 0:2571:0 gen 0
    btree_root: btree=deleted_inodes l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq bf6e0eb311c75f51 written 59 min_key POS_MIN durability: 1 ptr: 0:86681:0 gen 13
    btree_root: btree=logged_ops l=1 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq b1475d5952e610a6 written 273 min_key POS_MIN durability: 1 ptr: 0:120826:0 gen 8
    btree_root: btree=subvolume_children l=0 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq ded962d61463b9 written 195 min_key POS_MIN durability: 1 ptr: 0:20493:0 gen 14
    btree_root: btree=accounting l=2 u64s 11 type btree_ptr_v2 SPOS_MAX len 0 ver 0: seq 2b8c0a4a16e8c9e5 written 48 min_key POS_MIN durability: 1 ptr: 0:99647:0 gen 15
    datetime: Sun Nov 24 16:10:54 2024
    usage: type=key_version v=2592382130330996
    clock: read=1758680135
    clock: write=2280251175

mount /dev/root/nixos /mnt/nix:

[   58.064544] bcachefs (dm-6): starting version 1.13: inode_has_child_snapshots opts=nopromote_whole_extents
[   58.064559] bcachefs (dm-6): recovering from clean shutdown, journal seq 3899427
[   58.093956] bcachefs (dm-6): accounting_read... done
[   61.430026] bcachefs (dm-6): alloc_read... done
[   61.432533] bcachefs (dm-6): stripes_read... done
[   61.432538] bcachefs (dm-6): snapshots_read... done
[   61.439527] bcachefs (dm-6): journal_replay... done
[   61.439532] bcachefs (dm-6): resume_logged_ops... done
[   61.440145] bcachefs (dm-6): delete_dead_inodes...
[   61.442413] filesystem marked as clean but have deleted inode 9934335:4294966745, fixing
[   61.443891] bcachefs (dm-6): bch2_fs_recovery(): error erofs_trans_commit
[   61.443894] bcachefs (dm-6): bch2_fs_start(): error starting filesystem erofs_trans_commit
[   61.808628] bcachefs: bch2_fs_get_tree() error: erofs_trans_commit

mount -o fsck,fix_errors=yes /dev/root/nixos /mnt/nix/:

[  113.017392] bcachefs (dm-6): starting version 1.13: inode_has_child_snapshots opts=nopromote_whole_extents,fsck,fix_errors=yes
[  113.017442] bcachefs (dm-6): recovering from clean shutdown, journal seq 3899427
[  113.065050] bcachefs (dm-6): accounting_read... done
[  116.415316] bcachefs (dm-6): alloc_read... done
[  116.417928] bcachefs (dm-6): stripes_read... done
[  116.417933] bcachefs (dm-6): snapshots_read... done
[  116.418162] bcachefs (dm-6): check_allocations...
[  141.690841] bcachefs (dm-6): going read-write
[  141.701495] bcachefs (dm-6): journal_replay... done
[  141.701502] bcachefs (dm-6): check_alloc_info... done
[  142.732862] bcachefs (dm-6): check_lrus... done
[  142.799431] bcachefs (dm-6): check_btree_backpointers... done
[  145.060140] bcachefs (dm-6): check_backpointers_to_extents... done
[  147.558253] bcachefs (dm-6): check_extents_to_backpointers... done
[  151.289971] bcachefs (dm-6): check_alloc_to_lru_refs... done
[  151.608185] bcachefs (dm-6): check_snapshot_trees... done
[  151.608396] bcachefs (dm-6): check_snapshots... done
[  151.609252] bcachefs (dm-6): check_subvols... done
[  151.610035] bcachefs (dm-6): check_subvol_children... done
[  151.610111] bcachefs (dm-6): delete_dead_snapshots... done
[  151.610112] bcachefs (dm-6): check_inodes... done
[  173.685904] bcachefs (dm-6): check_extents... done
[  180.707828] bcachefs (dm-6): check_indirect_extents... done
[  180.793369] bcachefs (dm-6): check_dirents... done
[  186.711326] bcachefs (dm-6): check_xattrs... done
[  186.712066] bcachefs (dm-6): check_root... done
[  186.712468] bcachefs (dm-6): check_unreachable_inodes... done
[  203.317417] bcachefs (dm-6): check_subvolume_structure... done
[  203.317604] bcachefs (dm-6): check_directory_structure... done
[  223.976540] bcachefs (dm-6): check_nlinks... done
[  257.011359] bcachefs (dm-6): resume_logged_ops... done
[  257.011364] bcachefs (dm-6): delete_dead_inodes...
[  257.011898] filesystem marked as clean but have deleted inode 9934335:4294966745, fixing
[  257.012587] filesystem marked as clean but have deleted inode 9994679:4294966749, fixing
[  257.012989] filesystem marked as clean but have deleted inode 9994679:4294966751, fixing
[  257.013254] filesystem marked as clean but have deleted inode 9994761:4294966743, fixing
[  257.013685] filesystem marked as clean but have deleted inode 9994761:4294966745, fixing
[  257.014346] filesystem marked as clean but have deleted inode 9994761:4294966749, fixing
[  257.014962] filesystem marked as clean but have deleted inode 9994761:4294966751, fixing
[  257.015604] filesystem marked as clean but have deleted inode 9994772:4294966749, fixing
[  257.015695] filesystem marked as clean but have deleted inode 9994772:4294966751, fixing
[  257.015809] filesystem marked as clean but have deleted inode 9995657:4294966747, fixing
[  257.036485] filesystem marked as clean but have deleted inode 9996041:4294966745, fixing
[  257.036489] bcachefs (dm-6): Ratelimiting new instances of previous error
[  257.059846]  done
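
When comparing logs like the ones above, it can help to pull the two numbers out of each "deleted inode" line. The following is a rough sketch rather than anything taken from bcachefs itself: the function name `parse_deleted_inode` is made up, and reading the pair as an inode number followed by a snapshot ID is an assumption based on the message format and on the second values sitting just below 2^32.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of bcachefs or bcachefs-tools): extract the
 * "<first>:<second>" pair from a kernel log line of the form
 *   "... have deleted inode 9934335:4294966745, fixing"
 * Assumption: <first> is the inode number and <second> a 32-bit snapshot ID.
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_deleted_inode(const char *line,
			unsigned long long *inum, unsigned int *second)
{
	static const char marker[] = "have deleted inode ";
	const char *p = strstr(line, marker);

	if (!p)
		return -1;

	/* Skip past the marker text to the first digit. */
	p += sizeof(marker) - 1;

	if (sscanf(p, "%llu:%u", inum, second) != 2)
		return -1;

	return 0;
}
```

Feeding every matching line of a dmesg dump through this function and grouping on the second value gives a quick way to check whether the affected entries cluster around a few recently created snapshots.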
@nitinkmr333

I ran into this error today as well.
However, I have not updated the kernel (Debian 12, 6.12-rc6) for a few weeks, so I am not sure why it appeared seemingly at random after a reboot. I am not using bcachefs as my root filesystem.
Running fsck fixed it, so I cannot run further tests on it, but I was able to dump the journal and metadata before running fsck:

sudo bcachefs list_journal /dev/sdb1 > journal.0.log
sudo bcachefs list_journal /dev/sda1 > journal.1.log
sudo bcachefs list_journal /dev/sdc1 > journal.3.log
sudo bcachefs dump /dev/sda1 /dev/sdb1 /dev/sdc1 -o ~/metadata

Link to files: https://drive.google.com/drive/folders/1MI330A27LLChW3hTr8EvELCfFGlX45RV?usp=sharing
Hope it helps.

I don't have the full logs from the fsck run, but I do have the last few lines:

  bi_subvol=0
  bi_parent_subvol=0
  bi_nocow=0, fixing
 done
check_subvolume_structure... done
check_directory_structure... done
check_nlinks... done
resume_logged_ops... done
delete_dead_inodes...filesystem marked as clean but have deleted inode 116243:4294963293, fixing
deleting unlinked inode 116243:4294963293
filesystem marked as clean but have deleted inode 116258:4294963293, fixing
deleting unlinked inode 116258:4294963293
filesystem marked as clean but have deleted inode 116272:4294963293, fixing
deleting unlinked inode 116272:4294963293
filesystem marked as clean but have deleted inode 116276:4294963293, fixing
deleting unlinked inode 116276:4294963293
filesystem marked as clean but have deleted inode 116717:4294963293, fixing
deleting unlinked inode 116717:4294963293
filesystem marked as clean but have deleted inode 116720:4294963293, fixing
deleting unlinked inode 116720:4294963293
filesystem marked as clean but have deleted inode 116769:4294963293, fixing
deleting unlinked inode 116769:4294963293
filesystem marked as clean but have deleted inode 116772:4294963293, fixing
deleting unlinked inode 116772:4294963293
filesystem marked as clean but have deleted inode 117712:4294963285, fixing
deleting unlinked inode 117712:4294963285
filesystem marked as clean but have deleted inode 536980068:4294963285, fixing
deleting unlinked inode 536980068:4294963285
filesystem marked as clean but have deleted inode 536980077:4294963293, fixing
filesystem marked as clean but have deleted inode 536981746:4294963285, fixing
filesystem marked as clean but have deleted inode 536981749:4294963285, fixing
filesystem marked as clean but have deleted inode 536981821:4294963285, fixing
filesystem marked as clean but have deleted inode 1073848323:4294963293, fixing
filesystem marked as clean but have deleted inode 1073848326:4294963293, fixing
filesystem marked as clean but have deleted inode 1073849045:4294963285, fixing
filesystem marked as clean but have deleted inode 1073849048:4294963285, fixing
filesystem marked as clean but have deleted inode 1073849114:4294963285, fixing
filesystem marked as clean but have deleted inode 1073849117:4294963285, fixing
filesystem marked as clean but have deleted inode 1073849214:4294963285, fixing
filesystem marked as clean but have deleted inode 1610706972:4294963293, fixing
filesystem marked as clean but have deleted inode 1610706973:4294963285, fixing
filesystem marked as clean but have deleted inode 1610707328:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707331:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707455:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707458:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707614:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707617:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707665:4294963293, fixing
filesystem marked as clean but have deleted inode 1610707668:4294963293, fixing
filesystem marked as clean but have deleted inode 1610708168:4294963285, fixing
 done
going read-only
finished waiting for writes to stop
flushing journal and stopping allocators, journal seq 8509788
flushing journal and stopping allocators complete, journal seq 8509794
shutdown complete, journal seq 8509795
marking filesystem clean
done starting filesystem
e18ee4c8-86bd-48f5-8b2f-e5cdaf32dd7c: errors fixed
shutting down
shutdown complete

These seem to be the same errors as the ones posted by @dominikpaulus.

show-super (after running fsck):

❯ sudo bcachefs show-super /dev/sda1
Device:                                     (unknown device)
External UUID:                             e18ee4c8-86bd-48f5-8b2f-e5cdaf32dd7c
Internal UUID:                             18b84341-3a47-4b7c-837a-8fdf390f8056
Magic number:                              c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                              1
Label:                                     BACKUP
Version:                                   1.13: inode_has_child_snapshots
Version upgrade complete:                  1.13: inode_has_child_snapshots
Oldest version on disk:                    1.12: rebalance_work_acct_fix
Created:                                   Tue Sep 17 20:55:11 2024
Sequence number:                           501
Time of last write:                        Thu Nov 28 21:28:01 2024
Superblock size:                           5.76 KiB/1.00 MiB
Clean:                                     0
Devices:                                   3
Sections:                                  members_v1,crypt,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features:                                  lz4,zstd,journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                           alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                              4.00 KiB
  btree_node_size:                         256 KiB
  errors:                                  continue [fix_safe] panic ro
  metadata_replicas:                       2
  data_replicas:                           1
  metadata_replicas_required:              1
  data_replicas_required:                  1
  encoded_extent_max:                      64.0 KiB
  metadata_checksum:                       none [crc32c] crc64 xxhash
  data_checksum:                           none [crc32c] crc64 xxhash
  compression:                             none
  background_compression:                  zstd:15
  str_hash:                                crc32c crc64 [siphash]
  metadata_target:                         none
  foreground_target:                       ssd
  background_target:                       hdd
  promote_target:                          ssd
  erasure_code:                            0
  inodes_32bit:                            1
  shard_inode_numbers:                     1
  inodes_use_key_cache:                    1
  gc_reserve_percent:                      8
  gc_reserve_bytes:                        0 B
  root_reserve_percent:                    0
  wide_macs:                               0
  promote_whole_extents:                   1
  acl:                                     1
  usrquota:                                0
  grpquota:                                0
  prjquota:                                0
  journal_flush_delay:                     1000
  journal_flush_disabled:                  0
  journal_reclaim_delay:                   100
  journal_transaction_names:               1
  allocator_stuck_timeout:                 30
  version_upgrade:                         [compatible] incompatible none
  nocow:                                   0

members_v2 (size 592):
Device:                                    0
  Label:                                   wd_passport (10)
  UUID:                                    562d68a2-f982-4123-8708-86e332a7feb5
  Size:                                    931 GiB
  read errors:                             1
  write errors:                            29
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 3815340
  Last mount:                              Thu Nov 28 21:28:00 2024
  Last superblock write:                   501
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user,cached
  Btree allocated bitmap blocksize:        32.0 MiB
  Btree allocated bitmap:                  0000000000000000000000000001000000000001011001101100000111101111
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    1
  Label:                                   wd_black (8)
  UUID:                                    54913cb0-990d-459d-a03e-0e876aad92d1
  Size:                                    3.64 TiB
  read errors:                             1179417
  write errors:                            402
  checksum errors:                         8
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 15261652
  Last mount:                              Thu Nov 28 21:28:00 2024
  Last superblock write:                   501
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user,cached
  Btree allocated bitmap blocksize:        64.0 MiB
  Btree allocated bitmap:                  0000000000010000010000000000000000000000000000000001110000001011
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    3
  Label:                                   adata (7)
  UUID:                                    858f7f3f-5d6d-48d6-ba36-699f8691d484
  Size:                                    238 GiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 976788
  Last mount:                              Thu Nov 28 21:28:00 2024
  Last superblock write:                   501
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user,cached
  Btree allocated bitmap blocksize:        8.00 MiB
  Btree allocated bitmap:                  0000000000000000010000000000000111111111111111111111111111111111
  Durability:                              1
  Discard:                                 1
  Freespace initialized:                   1

errors (size 104):
journal_entry_replicas_not_marked           1               Sat Oct  5 11:04:04 2024
btree_node_data_missing                     36              Thu Oct 10 17:15:04 2024
snapshot_tree_to_missing_subvol             1               Wed Nov 13 13:19:48 2024
inode_unreachable                           35              Thu Nov 28 21:17:39 2024
deleted_inode_but_clean                     32              Thu Nov 28 21:17:55 2024
accounting_mismatch                         13              Thu Nov 28 20:19:22 2024

(I am fairly certain deleted_inode_but_clean was not present before running fsck)

Filesystem usage (after running fsck):

❯ sudo bcachefs fs usage /mounted_drives/BACKUP -h
Filesystem: e18ee4c8-86bd-48f5-8b2f-e5cdaf32dd7c
Size:                       4.40 TiB
Used:                       4.36 TiB
Online reserved:            1.38 MiB

Data type       Required/total  Durability    Devices
reserved:       1/1                [] 3.03 MiB
btree:          1/2             2             [sdb1 sdc1]         16.5 GiB
btree:          1/2             2             [sda1 sdc1]         33.2 GiB
user:           1/1             1             [sdb1]               877 GiB
user:           1/1             1             [sda1]              3.46 TiB
user:           1/1             1             [sdc1]               439 MiB
cached:         1/1             1             [sdb1]              3.72 GiB
cached:         1/1             1             [sda1]              5.49 GiB
cached:         1/1             1             [sdc1]               201 GiB

Compression:
type              compressed    uncompressed     average extent size
lz4                 2.28 GiB        3.67 GiB                60.2 KiB
zstd                 270 GiB         384 GiB                63.7 KiB
incompressible      4.34 TiB        4.34 TiB                50.7 KiB

Btree usage:
extents:            26.1 GiB
inodes:              395 MiB
dirents:            67.0 MiB
xattrs:              512 KiB
alloc:              5.17 GiB
quotas:              512 KiB
stripes:             512 KiB
reflink:            3.11 GiB
subvolumes:          512 KiB
snapshots:           512 KiB
lru:                88.5 MiB
freespace:          6.50 MiB
need_discard:       1.00 MiB
backpointers:       14.7 GiB
bucket_gens:        30.5 MiB
snapshot_trees:      512 KiB
deleted_inodes:      512 KiB
logged_ops:         1.00 MiB
rebalance_work:     1.50 MiB
subvolume_children:  512 KiB
accounting:         56.0 MiB

Pending rebalance work:
881 MiB

hdd.smr.wd_passport (device 0): sdb1              rw
                                data         buckets    fragmented
  free:                     36.4 GiB          149143
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                    8.24 GiB           33734
  user:                      877 GiB         3606160      3.09 GiB
  cached:                   3.69 GiB           18098       742 MiB
  parity:                        0 B               0
  stripe:                        0 B               0
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  unstriped:                     0 B               0
  capacity:                  931 GiB         3815340

hdd.wd_black (device 1):        sda1              rw
                                data         buckets    fragmented
  free:                      146 GiB          596332
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                    16.6 GiB           67962
  user:                     3.46 TiB        14545907      12.0 GiB
  cached:                   5.45 GiB           43246      5.10 GiB
  parity:                        0 B               0
  stripe:                        0 B               0
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  unstriped:                     0 B               0
  capacity:                 3.64 TiB        15261652

ssd.adata (device 3):           sdc1              rw
                                data         buckets    fragmented
  free:                     9.33 GiB           38227
  sb:                       3.00 MiB              13       252 KiB
  journal:                  1.86 GiB            7631
  btree:                    24.8 GiB          101696
  user:                      439 MiB            1814      14.9 MiB
  cached:                    201 GiB          827399       589 MiB
  parity:                        0 B               0
  stripe:                        0 B               0
  need_gc_gens:                  0 B               0
  need_discard:             2.00 MiB               8
  unstriped:                     0 B               0
  capacity:                  238 GiB          976788

Host device:

Device- Raspberry Pi 5 (8GB)
Kernel- Linux rpl 6.12.0-rc6-v8-16k+ #1 SMP PREEMPT Wed Nov  6 19:27:58 IST 2024 aarch64 GNU/Linux
bcachefs-tools version- 1.13.0

@dominikpaulus
Author

@nitinkmr333 Are you using snapshots?

I can confirm that this problem is almost definitely related to snapshot usage. It's not just about read-only snapshots, but about any snapshots.

It seems that this occurs if a snapshot is taken while there are files on the FS that have been unlink()ed, but are still around, as there are open file handles. I was able to reproduce this on a clean/empty filesystem like this:

$ dd if=/dev/zero of=image bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.105883 s, 5.0 GB/s
$ mkfs.bcachefs image
External UUID:                             a83f1e1f-2db2-4d94-8cdf-3d4b76f8b57a
Internal UUID:                             717ebc07-2e3f-4967-b840-a48efc4627a1
Magic number:                              c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                              0
Label:                                     (none)
Version:                                   1.13: inode_has_child_snapshots
Version upgrade complete:                  0.0: (unknown version)
Oldest version on disk:                    1.13: inode_has_child_snapshots
Created:                                   Thu Nov 28 19:31:45 2024
Sequence number:                           0
Time of last write:                        Thu Jan  1 01:00:00 1970
Superblock size:                           976 B/1.00 MiB
Clean:                                     0
Devices:                                   1
Sections:                                  members_v1,members_v2
Features:                                  new_siphash,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                           

Options:
  block_size:                              4.00 KiB
  btree_node_size:                         128 KiB
  errors:                                  continue [fix_safe] panic ro 
  metadata_replicas:                       1
  data_replicas:                           1
  metadata_replicas_required:              1
  data_replicas_required:                  1
  encoded_extent_max:                      64.0 KiB
  metadata_checksum:                       none [crc32c] crc64 xxhash 
  data_checksum:                           none [crc32c] crc64 xxhash 
  compression:                             none
  background_compression:                  none
  str_hash:                                crc32c crc64 [siphash] 
  metadata_target:                         none
  foreground_target:                       none
  background_target:                       none
  promote_target:                          none
  erasure_code:                            0
  inodes_32bit:                            1
  shard_inode_numbers:                     1
  inodes_use_key_cache:                    1
  gc_reserve_percent:                      8
  gc_reserve_bytes:                        0 B
  root_reserve_percent:                    0
  wide_macs:                               0
  promote_whole_extents:                   1
  acl:                                     1
  usrquota:                                0
  grpquota:                                0
  prjquota:                                0
  journal_flush_delay:                     1000
  journal_flush_disabled:                  0
  journal_reclaim_delay:                   100
  journal_transaction_names:               1
  allocator_stuck_timeout:                 30
  version_upgrade:                         [compatible] incompatible none 
  nocow:                                   0

members_v2 (size 160):
Device:                                    0
  Label:                                   (none)
  UUID:                                    47087681-66d9-4f1c-bce6-cf7af91f0173
  Size:                                    500 MiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             128 KiB
  First bucket:                            0
  Buckets:                                 4000
  Last mount:                              (never)
  Last superblock write:                   0
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                (none)
  Btree allocated bitmap blocksize:        1.00 B
  Btree allocated bitmap:                  0000000000000000000000000000000000000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   0
starting version 1.13: inode_has_child_snapshots
initializing new filesystem
going read-write
initializing freespace
shutdown complete, journal seq 9
$

Then, use this little C program:

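#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char** argv) {
	/* The path of the file to create is the only argument. */
	if (argc < 2) {
		fprintf(stderr, "usage: %s <path>\n", argv[0]);
		return 1;
	}

	printf("open()\n");
	/* O_CREAT requires an explicit mode argument. */
	int fd = open(argv[1], O_RDONLY | O_CREAT, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Remove the name; the inode stays alive through the open FD. */
	printf("unlink\n");
	if (unlink(argv[1]) < 0)
		perror("unlink");

	/* Keep the FD open until Enter is pressed, then close it. */
	printf("close\n");
	fgetc(stdin);
	close(fd);

	return 0;
}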

Mount the filesystem, then compile and run the C program above:

$ mount image /mnt/bcachefs
$ make foo
cc     foo.c   -o foo
$ ./foo /mnt/bcachefs/foobar
open()
unlink
close

In another shell:

bcachefs subvolume snapshot -r /mnt/bcachefs /mnt/bcachefs/$(date +%H%M%S)

Now press Enter in the first shell to actually close the FD.

Afterwards,

umount /mnt/bcachefs && mount image /mnt/bcachefs/

will fail due to broken inodes:

[ 2385.315244] bcachefs (loop0): starting version 1.13: inode_has_child_snapshots
[ 2385.315284] bcachefs (loop0): recovering from clean shutdown, journal seq 48
[ 2385.319998] bcachefs (loop0): accounting_read... done
[ 2385.320264] bcachefs (loop0): alloc_read... done
[ 2385.320280] bcachefs (loop0): stripes_read... done
[ 2385.320290] bcachefs (loop0): snapshots_read... done
[ 2385.321669] bcachefs (loop0): journal_replay... done
[ 2385.321683] bcachefs (loop0): resume_logged_ops... done
[ 2385.321694] bcachefs (loop0): delete_dead_inodes...
[ 2385.324129] filesystem marked as clean but have deleted inode 1073741825:4294967287, fixing
[ 2385.326727] bcachefs (loop0): bch2_fs_recovery(): error erofs_trans_commit
[ 2385.326735] bcachefs (loop0): bch2_fs_start(): error starting filesystem erofs_trans_commit
[ 2385.653919] bcachefs: bch2_fs_get_tree() error: erofs_trans_commit
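
The reproducer depends on the file being unlinked while a descriptor to it is still open, so that the snapshot is taken while the inode has a link count of zero but is still alive. A minimal sketch of a helper that makes this state visible is below; the name open_then_unlink and its interface are made up for illustration, and it only relies on standard POSIX calls, nothing bcachefs-specific:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original reproducer).
 *
 * Creates `path`, unlinks it while the descriptor is still open, and
 * returns the link count reported by fstat() on that descriptor.
 * On success the still-open descriptor is stored in *fd_out and the
 * caller is responsible for closing it. Returns -1 on any error.
 */
long open_then_unlink(const char *path, int *fd_out)
{
	struct stat st;

	int fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	/* Remove the only name; the inode lives on through fd. */
	if (unlink(path) < 0) {
		perror("unlink");
		close(fd);
		return -1;
	}

	/* Query the inode through the descriptor, not the (gone) path. */
	if (fstat(fd, &st) < 0) {
		perror("fstat");
		close(fd);
		return -1;
	}

	*fd_out = fd;
	return (long)st.st_nlink;
}
```

Calling this with a path on the mounted filesystem and then taking the snapshot before closing the returned descriptor should correspond to the window used in the steps above. On a typical Linux filesystem the returned link count is expected to be 0 while the descriptor remains usable.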

@nitinkmr333

nitinkmr333 commented Nov 30, 2024

@dominikpaulus Great find!

Yes, I have set up cron jobs to take a read-only snapshot of all my subvolumes at midnight every day. I tested with your script and got the same error on kernel 6.12.1 (Debian).

I also tried this on another system with kernel 6.11.5 (NixOS). I am not getting any errors while mounting (and I am able to mount the image without any errors in dmesg) but I am getting errors while running fsck.

Kernel 6.11:
1st fsck run (after unmount):

❯ sudo bcachefs fsck -v -y -p /dev/loop0
libkmod: kmod_config_parse: /etc/modprobe.d/nixos.conf line 6: ignoring bad line starting with 'The'
fsck binary is version 1.13: inode_has_child_snapshots but filesystem is 1.12: rebalance_work_acct_fix and kernel is 1.12: rebalance_work_acct_fix, using kernel fsck
Running in-kernel offline fsck
bcachefs (loop0): starting version 1.12: rebalance_work_acct_fix opts=ro,degraded,verbose,fsck,fix_errors=yes,read_only
bcachefs (loop0): recovering from clean shutdown, journal seq 24
bcachefs (loop0): accounting_read... done
bcachefs (loop0): alloc_read... done
bcachefs (loop0): stripes_read... done
bcachefs (loop0): snapshots_read... done
bcachefs (loop0): check_allocations... done
bcachefs (loop0): going read-write
bcachefs (loop0): journal_replay... done
bcachefs (loop0): check_alloc_info... done
bcachefs (loop0): check_lrus... done
bcachefs (loop0): check_btree_backpointers... done
bcachefs (loop0): check_backpointers_to_extents... done
bcachefs (loop0): check_extents_to_backpointers... done
bcachefs (loop0): check_alloc_to_lru_refs... done
bcachefs (loop0): check_snapshot_trees... done
bcachefs (loop0): check_snapshots... done
bcachefs (loop0): check_subvols... done
bcachefs (loop0): check_subvol_children... done
bcachefs (loop0): delete_dead_snapshots... done
bcachefs (loop0): check_inodes... done
bcachefs (loop0): check_extents... done
bcachefs (loop0): check_indirect_extents... done
bcachefs (loop0): check_dirents... done
bcachefs (loop0): check_xattrs... done
bcachefs (loop0): check_root... done
bcachefs (loop0): check_subvolume_structure... done
bcachefs (loop0): check_directory_structure... done
bcachefs (loop0): check_nlinks... done
bcachefs (loop0): resume_logged_ops... done
bcachefs (loop0): delete_dead_inodes...bcachefs (loop0): deleting unlinked inode 536870912:4294967294
bcachefs (loop0): deleting unlinked inode 1207959552:4294967292
 done
bcachefs (loop0): going read-only
bcachefs (loop0): finished waiting for writes to stop
bcachefs (loop0): flushing journal and stopping allocators, journal seq 26
bcachefs (loop0): flushing journal and stopping allocators complete, journal seq 28
bcachefs (loop0): shutdown complete, journal seq 29
bcachefs (loop0): marking filesystem clean
bcachefs (loop0): done starting filesystem
bcachefs (loop0): shutting down
bcachefs (loop0): shutdown complete

2nd fsck run (running again after previous fsck):

❯ sudo bcachefs fsck -v -y -p /dev/loop0
libkmod: kmod_config_parse: /etc/modprobe.d/nixos.conf line 6: ignoring bad line starting with 'The'
fsck binary is version 1.13: inode_has_child_snapshots but filesystem is 1.12: rebalance_work_acct_fix and kernel is 1.12: rebalance_work_acct_fix, using kernel fsck
Running in-kernel offline fsck
bcachefs (loop0): starting version 1.12: rebalance_work_acct_fix opts=ro,degraded,verbose,fsck,fix_errors=yes,read_only
bcachefs (loop0): recovering from clean shutdown, journal seq 29
bcachefs (loop0): accounting_read... done
bcachefs (loop0): alloc_read... done
bcachefs (loop0): stripes_read... done
bcachefs (loop0): snapshots_read... done
bcachefs (loop0): check_allocations... done
bcachefs (loop0): going read-write
bcachefs (loop0): journal_replay... done
bcachefs (loop0): check_alloc_info... done
bcachefs (loop0): check_lrus... done
bcachefs (loop0): check_btree_backpointers... done
bcachefs (loop0): check_backpointers_to_extents... done
bcachefs (loop0): check_extents_to_backpointers... done
bcachefs (loop0): check_alloc_to_lru_refs... done
bcachefs (loop0): check_snapshot_trees... done
bcachefs (loop0): check_snapshots... done
bcachefs (loop0): check_subvols... done
bcachefs (loop0): check_subvol_children... done
bcachefs (loop0): delete_dead_snapshots... done
bcachefs (loop0): check_inodes... done
bcachefs (loop0): check_extents... done
bcachefs (loop0): check_indirect_extents... done
bcachefs (loop0): check_dirents... done
bcachefs (loop0): check_xattrs... done
bcachefs (loop0): check_root... done
bcachefs (loop0): check_subvolume_structure... done
bcachefs (loop0): check_directory_structure...unreachable inode
u64s 15 type inode_v3 0:536870912:U32_MAX len 0 ver 0: 
  mode=100000
  flags=(4300000)
  journal_seq=26
  bi_size=0
  bi_sectors=0
  bi_version=0
  bi_atime=92734966592
  bi_ctime=92734966592
  bi_mtime=92734966592
  bi_otime=92734966592
  bi_uid=0
  bi_gid=0
  bi_nlink=0
  bi_generation=0
  bi_dev=0
  bi_data_checksum=0
  bi_compression=0
  bi_project=0
  bi_background_compression=0
  bi_data_replicas=0
  bi_promote_target=0
  bi_foreground_target=0
  bi_background_target=0
  bi_erasure_code=0
  bi_fields_set=0
  bi_dir=0
  bi_dir_offset=0
  bi_subvol=0
  bi_parent_subvol=0
  bi_nocow=0, fixing
unreachable inode
u64s 15 type inode_v3 0:1207959552:4294967293 len 0 ver 0: 
  mode=100000
  flags=(4300000)
  journal_seq=26
  bi_size=0
  bi_sectors=0
  bi_version=0
  bi_atime=322467854904
  bi_ctime=322467854904
  bi_mtime=322467854904
  bi_otime=322467854904
  bi_uid=0
  bi_gid=0
  bi_nlink=0
  bi_generation=0
  bi_dev=0
  bi_data_checksum=0
  bi_compression=0
  bi_project=0
  bi_background_compression=0
  bi_data_replicas=0
  bi_promote_target=0
  bi_foreground_target=0
  bi_background_target=0
  bi_erasure_code=0
  bi_fields_set=0
  bi_dir=0
  bi_dir_offset=0
  bi_subvol=0
  bi_parent_subvol=0
  bi_nocow=0, fixing
 done
bcachefs (loop0): check_nlinks... done
bcachefs (loop0): resume_logged_ops... done
bcachefs (loop0): delete_dead_inodes... done
bcachefs (loop0): going read-only
bcachefs (loop0): finished waiting for writes to stop
bcachefs (loop0): flushing journal and stopping allocators, journal seq 32
bcachefs (loop0): flushing journal and stopping allocators complete, journal seq 32
bcachefs (loop0): shutdown complete, journal seq 33
bcachefs (loop0): marking filesystem clean
bcachefs (loop0): done starting filesystem
loop0: errors fixed
bcachefs (loop0): shutting down
bcachefs (loop0): shutdown complete

3rd fsck run (after previous):

❯ sudo bcachefs fsck -v -y -p /dev/loop0
libkmod: kmod_config_parse: /etc/modprobe.d/nixos.conf line 6: ignoring bad line starting with 'The'
fsck binary is version 1.13: inode_has_child_snapshots but filesystem is 1.12: rebalance_work_acct_fix and kernel is 1.12: rebalance_work_acct_fix, using kernel fsck
Running in-kernel offline fsck
bcachefs (loop0): starting version 1.12: rebalance_work_acct_fix opts=ro,degraded,verbose,fsck,fix_errors=yes,read_only
bcachefs (loop0): recovering from clean shutdown, journal seq 33
bcachefs (loop0): accounting_read... done
bcachefs (loop0): alloc_read... done
bcachefs (loop0): stripes_read... done
bcachefs (loop0): snapshots_read... done
bcachefs (loop0): check_allocations... done
bcachefs (loop0): going read-write
bcachefs (loop0): journal_replay... done
bcachefs (loop0): check_alloc_info... done
bcachefs (loop0): check_lrus... done
bcachefs (loop0): check_btree_backpointers... done
bcachefs (loop0): check_backpointers_to_extents... done
bcachefs (loop0): check_extents_to_backpointers... done
bcachefs (loop0): check_alloc_to_lru_refs... done
bcachefs (loop0): check_snapshot_trees... done
bcachefs (loop0): check_snapshots... done
bcachefs (loop0): check_subvols... done
bcachefs (loop0): check_subvol_children... done
bcachefs (loop0): delete_dead_snapshots... done
bcachefs (loop0): check_inodes... done
bcachefs (loop0): check_extents... done
bcachefs (loop0): check_indirect_extents... done
bcachefs (loop0): check_dirents...bcachefs (loop0): have key for inode 4097:4294967293 but have inode in ancestor snapshot 4294967295
unexpected because we should always update the inode when we update a key in that inode
u64s 8 type dirent 4097:1351207871402680140:4294967293 len 0 ver 0: 1207959552 -> 1207959552 type reg
 done
bcachefs (loop0): check_xattrs... done
bcachefs (loop0): check_root... done
bcachefs (loop0): check_subvolume_structure... done
bcachefs (loop0): check_directory_structure... done
bcachefs (loop0): check_nlinks... done
bcachefs (loop0): resume_logged_ops... done
bcachefs (loop0): delete_dead_inodes... done
bcachefs (loop0): going read-only
bcachefs (loop0): finished waiting for writes to stop
bcachefs (loop0): flushing journal and stopping allocators, journal seq 33
bcachefs (loop0): flushing journal and stopping allocators complete, journal seq 33
bcachefs (loop0): shutdown complete, journal seq 34
bcachefs (loop0): marking filesystem clean
bcachefs (loop0): done starting filesystem
bcachefs (loop0): shutting down
bcachefs (loop0): shutdown complete

Subsequent fsck runs give the same error as the 3rd one. dmesg does not show any errors.

In my case, I think Docker is the trigger. I recently set up Docker volumes (bind mounts) in one of the subvolumes of the filesystem, so the Docker containers have open files inside the FS when the snapshot is taken.

After running fsck (kernel 6.12), I also saw errors in the lost+found folder of the subvolume that holds the Docker bind mounts. I checked the last few days of subvolume snapshots, and the errors appear in the snapshots' lost+found folders starting the day after I set up the Docker bind mounts.

Unfortunately, I cannot show the exact error (I deleted those subvolumes), but running ls -al in lost+found listed some folders and printed a No such file or directory error for each folder name. Since deleting the subvolumes with errors, I can no longer fix the filesystem with fsck. I will post an update.

@nitinkmr333

I created another issue about it: #790

@dominikpaulus, can you check whether the lost+found folder in your FS has any files (after running fsck)? In my case, there were some files in the lost+found folder of one of the subvolumes, but running ls -al on it caused the FS to go read-only. I deleted that subvolume (and some older snapshots with the same lost+found issue), but now fsck cannot fix the FS anymore.

@lluchs

lluchs commented Dec 7, 2024

I think I have the same issue on my system (also NixOS, kernel 6.12, single-device bcachefs as root, daily snapshots for backups that are immediately deleted again). The system boots with kernel 6.11 and I can indeed see broken files in the lost+found folder:

> sudo ls /lost+found/ -la
ls: cannot access '/lost+found/1811947368': No such file or directory
ls: cannot access '/lost+found/1677752337': No such file or directory
ls: cannot access '/lost+found/1744923619': No such file or directory
ls: cannot access '/lost+found/805335263': No such file or directory
ls: cannot access '/lost+found/604014774': No such file or directory
ls: cannot access '/lost+found/256349': No such file or directory
ls: cannot access '/lost+found/1744923620': No such file or directory
ls: cannot access '/lost+found/604014776': No such file or directory
ls: cannot access '/lost+found/671213664': No such file or directory
ls: cannot access '/lost+found/876174826': No such file or directory
ls: cannot access '/lost+found/268596849': No such file or directory
ls: cannot access '/lost+found/1812109423': No such file or directory
ls: cannot access '/lost+found/671213665': No such file or directory
ls: cannot access '/lost+found/268596851': No such file or directory
ls: cannot access '/lost+found/1610639741': No such file or directory
ls: cannot access '/lost+found/1409348542': No such file or directory
ls: cannot access '/lost+found/469944792': No such file or directory
ls: cannot access '/lost+found/738207008': No such file or directory
ls: cannot access '/lost+found/2209585': No such file or directory
ls: cannot access '/lost+found/738322324': No such file or directory
ls: cannot access '/lost+found/1812109426': No such file or directory
ls: cannot access '/lost+found/1208130985': No such file or directory
ls: cannot access '/lost+found/201333304': No such file or directory
ls: cannot access '/lost+found/1141114673': No such file or directory
ls: cannot access '/lost+found/1812109427': No such file or directory
total 0
drwx------  2 root root 0 27. Jun 20:28 .
drwxr-xr-x 19 root root 0  6. Dez 23:02 ..
-?????????  ? ?    ?    ?             ? 1141114673
-?????????  ? ?    ?    ?             ? 1208130985
-?????????  ? ?    ?    ?             ? 1409348542
-?????????  ? ?    ?    ?             ? 1610639741
-?????????  ? ?    ?    ?             ? 1677752337
-?????????  ? ?    ?    ?             ? 1744923619
-?????????  ? ?    ?    ?             ? 1744923620
-?????????  ? ?    ?    ?             ? 1811947368
-?????????  ? ?    ?    ?             ? 1812109423
-?????????  ? ?    ?    ?             ? 1812109426
-?????????  ? ?    ?    ?             ? 1812109427
-?????????  ? ?    ?    ?             ? 201333304
-?????????  ? ?    ?    ?             ? 2209585
-?????????  ? ?    ?    ?             ? 256349
-?????????  ? ?    ?    ?             ? 268596849
-?????????  ? ?    ?    ?             ? 268596851
-?????????  ? ?    ?    ?             ? 469944792
-?????????  ? ?    ?    ?             ? 604014774
-?????????  ? ?    ?    ?             ? 604014776
-?????????  ? ?    ?    ?             ? 671213664
-?????????  ? ?    ?    ?             ? 671213665
-?????????  ? ?    ?    ?             ? 738207008
-?????????  ? ?    ?    ?             ? 738322324
-?????????  ? ?    ?    ?             ? 805335263
-?????????  ? ?    ?    ?             ? 876174826

And corresponding dmesg output from that ls:

[ 7078.034462] bcachefs (nvme1n1p3): dirent to missing inode:                                                                                                                                                                                                          
                 u64s 8 type dirent 1610612736:21709322764569511:U32_MAX len 0 ver 0: 1811947368 -> 1811947368 type reg                                                                                                                                                
[ 7078.034470] bcachefs (nvme1n1p3): inconsistency detected - emergency read only at journal seq 3108867                                                                                                                                                               
[ 7078.034566] bcachefs (nvme1n1p3): dirent to missing inode:                                                                                                                                                                                                          
                 u64s 8 type dirent 1610612736:42035942192868925:U32_MAX len 0 ver 0: 1677752337 -> 1677752337 type reg                                                                                                                                                
[ 7078.034595] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:116802478601691249:4294967205 len 0 ver 0: 1744923619 -> 1744923619 type reg        
[ 7078.034620] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:954609162356257106:U32_MAX len 0 ver 0: 805335263 -> 805335263 type reg             
[ 7078.034641] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:1554061651236272046:U32_MAX len 0 ver 0: 604014774 -> 604014774 type reg
[ 7078.034660] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 7 type dirent 1610612736:1567752694174946381:U32_MAX len 0 ver 0: 256349 -> 256349 type reg
[ 7078.034678] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:2025430930275213396:4294967205 len 0 ver 0: 1744923620 -> 1744923620 type reg
[ 7078.034702] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:2054110222763415958:U32_MAX len 0 ver 0: 604014776 -> 604014776 type reg
[ 7078.034721] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:2412794933319148936:4294967205 len 0 ver 0: 671213664 -> 671213664 type reg
[ 7078.034741] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:2572239565891454348:4294967205 len 0 ver 0: 876174826 -> 876174826 type reg
[ 7078.034759] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:2653226844246671626:4294967205 len 0 ver 0: 268596849 -> 268596849 type reg
[ 7078.034778] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:3482407016307601677:4294967205 len 0 ver 0: 1812109423 -> 1812109423 type reg
[ 7078.034796] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:3633701331481355469:4294967205 len 0 ver 0: 671213665 -> 671213665 type reg
[ 7078.034813] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:4366839325661870661:4294967205 len 0 ver 0: 268596851 -> 268596851 type reg
[ 7078.034832] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:4837539340973740053:U32_MAX len 0 ver 0: 1610639741 -> 1610639741 type reg
[ 7078.034851] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:6480188744257040462:4294967205 len 0 ver 0: 1409348542 -> 1409348542 type reg
[ 7078.034870] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:7126977066883694127:4294967205 len 0 ver 0: 469944792 -> 469944792 type reg
[ 7078.034889] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:7139732273141676789:U32_MAX len 0 ver 0: 738207008 -> 738207008 type reg
[ 7078.034908] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 7 type dirent 1610612736:7233509466316788245:4294967205 len 0 ver 0: 2209585 -> 2209585 type reg
[ 7078.034927] bcachefs (nvme1n1p3): dirent to missing inode:                                                                      
                 u64s 8 type dirent 1610612736:7628042569536364882:4294967205 len 0 ver 0: 738322324 -> 738322324 type reg                                                                                                                                             
[ 7078.034945] bcachefs (nvme1n1p3): dirent to missing inode:                                                                                                                                                                                                          
                 u64s 8 type dirent 1610612736:7822682373772565163:4294967205 len 0 ver 0: 1812109426 -> 1812109426 type reg
[ 7078.034966] bcachefs (nvme1n1p3): dirent to missing inode:
                 u64s 8 type dirent 1610612736:8041213979563869918:4294967205 len 0 ver 0: 1208130985 -> 1208130985 type reg
[ 7078.034985] bcachefs (nvme1n1p3): dirent to missing inode:
                 u64s 8 type dirent 1610612736:8278636669720406534:U32_MAX len 0 ver 0: 201333304 -> 201333304 type reg
[ 7078.035006] bcachefs (nvme1n1p3): dirent to missing inode:
                 u64s 8 type dirent 1610612736:8298868916955950037:U32_MAX len 0 ver 0: 1141114673 -> 1141114673 type reg
[ 7078.035024] bcachefs (nvme1n1p3): dirent to missing inode:
                 u64s 8 type dirent 1610612736:8619545732512356501:4294967205 len 0 ver 0: 1812109427 -> 1812109427 type reg
[ 7078.054215] bcachefs (nvme1n1p3): unshutdown complete, journal seq 3108867

@koverstreet
Owner

I just reproduced this on 6.12; it turns out this is already fixed in the master branch by

4814218 bcachefs: Use separate rhltable for bch2_inode_or_descendents_is_open()

6.11 had more bugs with snapshots and unlinked files; you'll all definitely want to upgrade to my master branch.

@michaeladler

@koverstreet: FYI, I tried the master branch (dd7d7f2), i.e. I applied the patches on top of 6.12.4, but it hangs indefinitely when trying to mount my encrypted bcachefs filesystem. Is it safe to just cherry-pick commit 4814218 on top of the latest 6.12 kernel?

@koverstreet
Owner

Yes.

FYI, it's probably not "hung", just taking a while to upgrade. I'm adding back a progress indicator and I have a bit more performance optimization to do on the upgrade, but it's going to be an expensive one (should be the last expensive forced upgrade, though).

@michaeladler

michaeladler commented Dec 11, 2024

Thanks! After waiting 10 minutes, I shut down my computer, and the filesystem still works (with the stock kernel), so the upgrade seems power-cut safe :)

EDIT: Tried again but gave up after 14 hours...
