Storing qemu images on bcachefs breaks xfs in VM #791

Open
jpf91 opened this issue Nov 30, 2024 · 0 comments

Hi there,

Thanks for fixing #717 so quickly. I only recently upgraded my kernel, and I have not seen that issue since 👍

Now for this bug report: even with the latest mainline kernel, I can still reproduce my issue with storing VM images on bcachefs. This is probably a fringe use case, but I guess bcachefs should support it in order to be a full-featured FS.

Summary: The Proxmox hypervisor currently has no native storage driver for bcachefs, but it would still be nice to use the normal QEMU file-based storage on a bcachefs filesystem. I set this up in Proxmox and tried to install a CentOS VM onto bcachefs storage, but the installation failed. I then reduced this to the slightly simpler reproduction case described here.

VM Host

OS: Proxmox VE 8.2.7 / Debian 12 Bookworm
Kernel: Ubuntu Mainline PPA 6.12.1 (6.12.1-061201-generic)
Bcachefs Tools: v1.13.0 tag build from source
bcachefs show-super:

Device:                                     HGST HDN728080AL
External UUID:                             cca5bc65-fe77-409d-a9fa-465a6e7f4eae
Internal UUID:                             ca668445-d05c-47f8-8b05-92c30245a167
Magic number:                              c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                              0
Label:                                     NAS_DATA
Version:                                   1.13: inode_has_child_snapshots
Version upgrade complete:                  1.13: inode_has_child_snapshots
Oldest version on disk:                    1.4: member_seq
Created:                                   Fri Jul  5 14:09:12 2024
Sequence number:                           128
Time of last write:                        Sat Nov 30 20:34:55 2024
Superblock size:                           7.45 KiB/1.00 MiB
Clean:                                     0
Devices:                                   5
Sections:                                  members_v1,crypt,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features:                                  zstd,journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                           alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                              4.00 KiB
  btree_node_size:                         256 KiB
  errors:                                  continue [fix_safe] panic ro 
  metadata_replicas:                       2
  data_replicas:                           2
  metadata_replicas_required:              1
  data_replicas_required:                  1
  encoded_extent_max:                      64.0 KiB
  metadata_checksum:                       none [crc32c] crc64 xxhash 
  data_checksum:                           none [crc32c] crc64 xxhash 
  compression:                             zstd
  background_compression:                  none
  str_hash:                                crc32c crc64 [siphash] 
  metadata_target:                         none
  foreground_target:                       ssd
  background_target:                       hdd
  promote_target:                          ssd
  erasure_code:                            0
  inodes_32bit:                            1
  shard_inode_numbers:                     1
  inodes_use_key_cache:                    1
  gc_reserve_percent:                      8
  gc_reserve_bytes:                        0 B
  root_reserve_percent:                    0
  wide_macs:                               0
  promote_whole_extents:                   1
  acl:                                     1
  usrquota:                                0
  grpquota:                                0
  prjquota:                                0
  journal_flush_delay:                     1000
  journal_flush_disabled:                  0
  journal_reclaim_delay:                   100
  journal_transaction_names:               1
  allocator_stuck_timeout:                 30
  version_upgrade:                         [compatible] incompatible none 
  nocow:                                   0

members_v2 (size 736):
Device:                                    0
  Label:                                   hdd1 (1)
  UUID:                                    141032c8-2583-4306-b4c1-412696d46be5
  Size:                                    7.28 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 30523541
  Last mount:                              Sat Nov 30 20:27:20 2024
  Last superblock write:                   128
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,user,cached
  Btree allocated bitmap blocksize:        1.00 B
  Btree allocated bitmap:                  0000000000000000000000000000000000000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    1
  Label:                                   hdd2 (2)
  UUID:                                    d038124b-d4a5-4deb-bdd1-eb423c9189c8
  Size:                                    7.33 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 30758228
  Last mount:                              Sat Nov 30 20:27:20 2024
  Last superblock write:                   128
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,user,cached
  Btree allocated bitmap blocksize:        1.00 B
  Btree allocated bitmap:                  0000000000000000000000000000000000000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    2
  Label:                                   hdd3 (3)
  UUID:                                    09811319-852f-4ac1-a1a9-8aef619df346
  Size:                                    7.28 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 30523541
  Last mount:                              Sat Nov 30 20:27:20 2024
  Last superblock write:                   128
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,user,cached
  Btree allocated bitmap blocksize:        1.00 B
  Btree allocated bitmap:                  0000000000000000000000000000000000000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    3
  Label:                                   ssd1 (5)
  UUID:                                    074844ac-70c4-4cd7-a302-fa1946985849
  Size:                                    631 GiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 2582576
  Last mount:                              Sat Nov 30 20:27:20 2024
  Last superblock write:                   128
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user,cached
  Btree allocated bitmap blocksize:        32.0 MiB
  Btree allocated bitmap:                  0000000000000000000000001111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 1
  Freespace initialized:                   1
Device:                                    4
  Label:                                   ssd2 (6)
  UUID:                                    4dd47f69-b955-4de5-b9b9-2a6dc60ca16c
  Size:                                    165 GiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             256 KiB
  First bucket:                            0
  Buckets:                                 674860
  Last mount:                              Sat Nov 30 20:27:20 2024
  Last superblock write:                   128
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree,user,cached
  Btree allocated bitmap blocksize:        8.00 MiB
  Btree allocated bitmap:                  0000000000000000000000111111111111111111111111111111111111111111
  Durability:                              1
  Discard:                                 1
  Freespace initialized:                   1

errors (size 40):
fs_usage_cached_wrong                       1               Mon Oct  7 16:09:57 2024
fs_usage_replicas_wrong                     2               Mon Oct  7 16:09:57 2024

The VM was created in Proxmox using the default settings for storage and image. Proxmox starts QEMU/KVM with the following command line:

/usr/bin/kvm -id 103 -name test,debug-threads=on -no-shutdown -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server=on,wait=off -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/103.pid -daemonize -smbios type=1,uuid=073b59e0-198d-4896-afae-9e1982164f4a -smp 4,sockets=1,cores=4,maxcpus=4 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vnc unix:/var/run/qemu-server/103.vnc,password=on -cpu qemu64,+aes,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+pni,+popcnt,+sse4.1,+sse4.2,+ssse3 -m 2048 -object iothread,id=iothread-virtioscsi0 -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5 -device vmgenid,guid=3160b218-4ba2-42e6-bfb7-5ef0e4df3131 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device VGA,id=vga,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on -iscsi initiator-name=iqn.1993-08.org.debian:01:907ae15e667 -drive file=/var/lib/pve/local-btrfs/template/iso/Fedora-Workstation-Live-x86_64-41-1.4.iso,if=none,id=drive-ide2,media=cdrom,aio=io_uring -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100 -device virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0 -drive file=/mnt/data/services/pve//images/103/vm-103-disk-0.qcow2,if=none,id=drive-scsi0,format=qcow2,cache=none,aio=io_uring,detect-zeroes=on -device scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=101 -netdev type=tap,id=net0,ifname=tap103i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on -device 
virtio-net-pci,mac=BC:24:11:74:0B:4F,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=102 -machine type=pc+pve0
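The disk-related settings are easy to miss in a command line of this length. The following minimal Python sketch pulls them out of the `-drive` argument for the VM disk (string copied from the command above). The helper name `parse_drive_options` is made up for illustration, and the parser assumes no escaped commas inside values, which holds for this string. Note that `cache=none` makes QEMU open the image with O_DIRECT, so guest writes bypass the host page cache and go straight to bcachefs.

```python
# The -drive argument for the VM disk, copied from the kvm command line.
DRIVE_ARG = (
    "file=/mnt/data/services/pve//images/103/vm-103-disk-0.qcow2,"
    "if=none,id=drive-scsi0,format=qcow2,cache=none,aio=io_uring,detect-zeroes=on"
)


def parse_drive_options(arg: str) -> dict[str, str]:
    """Split a comma-separated key=value option string into a dict.

    Hypothetical helper: assumes values contain no escaped commas (",,").
    """
    options = {}
    for field in arg.split(","):
        key, _, value = field.partition("=")
        options[key] = value
    return options


opts = parse_drive_options(DRIVE_ARG)
for key in ("file", "format", "cache", "aio", "detect-zeroes"):
    print(f"{key:14} {opts[key]}")
```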

VM

OS: Fedora 41 workstation Live CD
Kernel: 6.11.4-301.fc41.x86_64

  1. Partition /dev/sda using fdisk and create a single partition. This works fine.
  2. Run mkfs.xfs on the new partition. This fails:
root@localhost-live:~# mkfs.xfs /dev/sda1
meta-data=/dev/sda1              isize=512    agcount=4, agsize=2097024 blks
       =                       sectsz=512   attr=2, projid32bit=1
       =                       crc=1        finobt=1, sparse=1, rmapbt=1
       =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
data     =                       bsize=4096   blocks=8388096, imaxpct=25
       =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
       =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_sb bno 0x0/0x1, err=121
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on (unknown) bno 0x1fff838/0x2, err=121
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_sb bno 0x0/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agf bno 0x1/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agfl bno 0x3/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agi bno 0x2/0x1, err=121
mkfs.xfs: writing AG headers failed, err=121
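As a quick cross-check, `err=121` and the "Remote I/O error" text in the output above refer to the same errno. A small sketch, assuming it runs on Linux (errno numbering differs on other platforms):

```python
import errno
import os

ERR = 121  # value reported by libxfs_bwrite in the mkfs.xfs output

# errno.errorcode maps numeric values to symbolic names for this platform.
name = errno.errorcode.get(ERR, "unknown")
print(ERR, name, os.strerror(ERR))
```

On Linux this resolves to `EREMOTEIO`, which is the errno the block layer uses for the "critical target error" status seen in the guest kernel log.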

After this, the following errors can be found in the VM dmesg:

[  740.241536] sd 2:0:0:0: [sda] tag#212 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.241540] sd 2:0:0:0: [sda] tag#212 Sense Key : Illegal Request [current] 
[  740.241542] sd 2:0:0:0: [sda] tag#212 Add. Sense: Invalid field in cdb
[  740.241544] sd 2:0:0:0: [sda] tag#212 CDB: Write(10) 2a 00 00 00 08 00 00 00 01 00
[  740.241545] critical target error, dev sda, sector 2048 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[  740.242534] sd 2:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.242538] sd 2:0:0:0: [sda] tag#62 Sense Key : Illegal Request [current] 
[  740.242540] sd 2:0:0:0: [sda] tag#62 Add. Sense: Invalid field in cdb
[  740.242542] sd 2:0:0:0: [sda] tag#62 CDB: Write(10) 2a 00 02 00 00 38 00 00 02 00
[  740.242543] critical target error, dev sda, sector 33554488 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[  740.242740] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.242742] sd 2:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current] 
[  740.242752] sd 2:0:0:0: [sda] tag#0 Add. Sense: Invalid field in cdb
[  740.242754] sd 2:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 00 00 08 00 00 00 01 00
[  740.242755] critical target error, dev sda, sector 2048 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[  740.244685] sd 2:0:0:0: [sda] tag#214 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.244687] sd 2:0:0:0: [sda] tag#214 Sense Key : Illegal Request [current] 
[  740.244689] sd 2:0:0:0: [sda] tag#214 Add. Sense: Invalid field in cdb
[  740.244690] sd 2:0:0:0: [sda] tag#214 CDB: Write(10) 2a 00 00 00 08 01 00 00 01 00
[  740.244691] critical target error, dev sda, sector 2049 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[  740.244842] sd 2:0:0:0: [sda] tag#215 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.244843] sd 2:0:0:0: [sda] tag#215 Sense Key : Illegal Request [current] 
[  740.244844] sd 2:0:0:0: [sda] tag#215 Add. Sense: Invalid field in cdb
[  740.244845] sd 2:0:0:0: [sda] tag#215 CDB: Write(10) 2a 00 00 00 08 03 00 00 01 00
[  740.244846] critical target error, dev sda, sector 2051 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[  740.244980] sd 2:0:0:0: [sda] tag#216 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[  740.244981] sd 2:0:0:0: [sda] tag#216 Sense Key : Illegal Request [current] 
[  740.244983] sd 2:0:0:0: [sda] tag#216 Add. Sense: Invalid field in cdb
[  740.244984] sd 2:0:0:0: [sda] tag#216 CDB: Write(10) 2a 00 00 00 08 02 00 00 01 00
[  740.244984] critical target error, dev sda, sector 2050 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
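The failed commands can be decoded to see what the guest actually tried to write. The sketch below parses the WRITE(10) CDBs from the log (opcode 0x2a, big-endian LBA in bytes 2 to 5, transfer length in bytes 7 to 8) and checks each request against the host's bcachefs `block_size` of 4 KiB. Assumptions: the guest disk uses 512-byte logical sectors (mkfs.xfs reports `sectsz=512`) and /dev/sda1 starts at sector 2048, the fdisk default, which is consistent with the superblock write landing on sector 2048. The function name `decode_write10` is made up for illustration.

```python
SECTOR_SIZE = 512        # guest logical sector size (sectsz=512 in mkfs.xfs output)
HOST_BLOCK_SIZE = 4096   # bcachefs block_size from show-super
PARTITION_START = 2048   # assumed first sector of /dev/sda1 (fdisk default)


def decode_write10(cdb_hex: str) -> tuple[int, int]:
    """Return (LBA, transfer length in blocks) from a WRITE(10) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


# CDBs copied from the guest dmesg (the duplicate for sector 2048 is omitted).
FAILED_CDBS = [
    "2a 00 00 00 08 00 00 00 01 00",
    "2a 00 02 00 00 38 00 00 02 00",
    "2a 00 00 00 08 01 00 00 01 00",
    "2a 00 00 00 08 03 00 00 01 00",
    "2a 00 00 00 08 02 00 00 01 00",
]

for cdb in FAILED_CDBS:
    lba, blocks = decode_write10(cdb)
    offset = lba * SECTOR_SIZE
    length = blocks * SECTOR_SIZE
    aligned = offset % HOST_BLOCK_SIZE == 0 and length % HOST_BLOCK_SIZE == 0
    print(
        f"LBA {lba:>9}  len {length:>5} B  "
        f"xfs bno {lba - PARTITION_START:#x}  4K-aligned: {aligned}"
    )
```

The decoded block numbers (0x0, 0x1fff838, 0x1, 0x3, 0x2) line up with the buffers mkfs.xfs reported as failed, and every failed request is 512 or 1024 bytes long, so none is a multiple of 4 KiB. Combined with `cache=none` on the drive, this might point to sub-block-size direct writes being rejected somewhere between QEMU and bcachefs, but that is only a hypothesis and is not confirmed by anything in this report.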

There are no related messages in the VM host's dmesg.

Interestingly, other filesystems seem to work better. Manually creating an ext4 filesystem, mounting it, and creating and deleting files worked. I then tried a default Fedora installation, which uses only btrfs and ext4. This works as well: it installs fine and the installed OS boots. I did not do any further testing, though.

So this seems to be somewhat XFS-specific. If there is any additional info that could help debug this further, please let me know.
