booting with bcachefs compression on is broken #309388

Closed
codebam opened this issue May 5, 2024 · 22 comments · Fixed by #310504
Labels
0.kind: bug · Something is broken
0.kind: regression · Something that worked before working no longer
2.status: wait-for-upstream · Waiting for upstream fix (or their other action).
9.needs: upstream fix · This PR needs upstream to change something

Comments

@codebam
Contributor

codebam commented May 5, 2024

Describe the bug

I'm using bcachefs and it fails to boot on 05/05 (latest as of right now). Stage 1 fails. Other generations boot fine.

Steps To Reproduce

Steps to reproduce the behavior:

  1. update nixos

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 6.9.0-rc6, NixOS, 24.05 (Uakari), 24.05.20240503.e96601e`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.2`
 - channels(root): `"nixos"`
 - nixpkgs: `/nix/store/3gcgx970vsz0h89kjm3nwixb74b03g7r-source`


@codebam codebam added the 0.kind: bug Something is broken label May 5, 2024
@codebam
Contributor Author

codebam commented May 5, 2024

❯ nix store diff-closures /nix/var/nix/profiles/system-72-link /nix/var/nix/profiles/system-75-link
bcachefs-tools: 1.4.1 → 1.7.0, -1554.5 KiB
btrfs-progs: 6.8 → 6.8.1
crun: 1.14.4 → 1.15, +14.5 KiB
extra: -903.2 KiB
fwupd: 1.9.18 → 1.9.19
initrd-linux: -345.6 KiB
javascript-grammar: ∅ → 0.0.0+rev=fff4560, +635.9 KiB
lua: ∅ → 5.1.5, +1004.3 KiB
lua5.1-luassert: ∅ → 1.9.0-1, +118.5 KiB
lua5.1-say-scm: ∅ → 1, +10.7 KiB
nix-grammar: ∅ → 0.0.0+rev=b3cda61, +106.2 KiB
nixos-system-nixos: 24.05.20240503.e96601e → 24.05.20240505.9f5a6d7
notmuch: +105.5 KiB
notmuch-vim: ∅ → ε
openfortivpn: 1.21.0 → 1.22.0
ruby3.1-date: ∅ → 3.3.4, +937.5 KiB
ruby3.1-mail: ∅ → 2.8.1, +3731.7 KiB
ruby3.1-mini_mime: ∅ → 1.1.5, +241.1 KiB
ruby3.1-net-imap: ∅ → 0.4.10, +594.4 KiB
ruby3.1-net-pop: ∅ → 0.1.2, +40.7 KiB
ruby3.1-net-protocol: ∅ → 0.2.2, +25.6 KiB
ruby3.1-net-smtp: ∅ → 0.5.0, +55.7 KiB
ruby3.1-timeout: ∅ → 0.4.1, +15.8 KiB
source: +121.4 KiB
typescript-grammar: ∅ → 0.0.0+rev=b00b8eb, +2285.3 KiB
vimplugin-gen.nvim: ∅ → 2024-05-03, +27.1 KiB
vimplugin-lua5.1-plenary.nvim-scm: ∅ → 1-unstable-2024-03-25, +443.6 KiB
vimplugin-nvim-treesitter: ∅ → 2024-04-20, +1860.1 KiB
vimplugin-treesitter-grammar: ∅ → ε
vte: 0.76.0 → 0.76.1
zfs-user: 2.2.3 → 2.2.4, +13.7 KiB

@codebam
Contributor Author

codebam commented May 5, 2024

I believe it's an issue with the new bcachefs-tools

@codebam
Contributor Author

codebam commented May 5, 2024

@JohnRTitor
Contributor

Are you using systemd in initrd? (boot.initrd.systemd.enable)

Bcachefs as root won't boot with that.

Please provide logs so we can help further.

@codebam
Contributor Author

codebam commented May 6, 2024

I'm not using boot.initrd.systemd.enable. Where do you want logs from?

@JohnRTitor
Contributor

The bcachefs error messages you see on the screen are normal though. You can see them by booting in a "working" generation, and running journalctl -b | grep bcachefs | less.
Also does this happen on 6.8 kernel?

@codebam
Contributor Author

codebam commented May 6, 2024

I get the same error message on 6.8.9 about stage 1 failing

@JohnRTitor
Contributor

JohnRTitor commented May 6, 2024

I am on 6.8.9 kernel and I have no trouble booting. Could be related to your specific configuration.

First, test whether it's actually coming from bcachefs-tools. I'd recommend pinning only bcachefs-tools to the previous version like this, using an overlay to fetch the system bcachefs-tools from that nixpkgs commit.

@codebam
Contributor Author

codebam commented May 6, 2024

I'm using encryption on bcachefs so that could be it. I'll try using an overlay from that commit and see if it boots or not

@codebam
Contributor Author

codebam commented May 6, 2024

I pinned bcachefs-tools to nixpkgs/668834f72c7a082bdb823d1367a9abea15ebfcad and it boots. Using an overlay like you showed.
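
For anyone following along, here is a rough sketch of what such a pin could look like as a NixOS module. It is not the exact snippet that was linked; it only assumes the nixpkgs revision mentioned above and a non-flake (impure) evaluation, since no `sha256` is given for the tarball. The binding name `pinnedNixpkgs` is made up for illustration.

```nix
# Sketch only: take bcachefs-tools from an older nixpkgs revision via an overlay.
{ ... }:
{
  nixpkgs.overlays = [
    (final: prev:
      let
        # Hypothetical name; the revision is the one reported as booting in this thread.
        pinnedNixpkgs = import (builtins.fetchTarball {
          url = "https://github.com/NixOS/nixpkgs/archive/668834f72c7a082bdb823d1367a9abea15ebfcad.tar.gz";
          # Assumption: impure evaluation is acceptable. Add a sha256 for reproducibility.
        }) { system = prev.stdenv.hostPlatform.system; };
      in
      {
        # Replace only this one package; everything else stays on the current channel.
        bcachefs-tools = pinnedNixpkgs.bcachefs-tools;
      })
  ];
}
```

Because only `bcachefs-tools` is swapped, a successful boot with this overlay points at the tools update rather than at the kernel or the rest of the closure.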

@JohnRTitor
Contributor

A few tests I'd like you to run: does it work using the latest commit from https://github.com/koverstreet/bcachefs-tools?

It has a flake, so you can use bcachefs-tools.packages.x86_64-linux.bcachefs after adding the flake in your flake.nix.
If it works, I'd like you to test the v1.7.0 branch.
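
A minimal sketch of how that could be wired up in a `flake.nix`, assuming the package attribute is exactly the one quoted above (`packages.x86_64-linux.bcachefs`; the upstream flake may expose a different name at other revisions). The host name `myhost` and the `./configuration.nix` path are placeholders.

```nix
# Sketch only: use the upstream bcachefs-tools flake as the system's bcachefs-tools.
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    bcachefs-tools.url = "github:koverstreet/bcachefs-tools";
    # To test the 1.7.0 release instead, a ref can be appended (assumption: the ref is named v1.7.0):
    # bcachefs-tools.url = "github:koverstreet/bcachefs-tools/v1.7.0";
  };

  outputs = { self, nixpkgs, bcachefs-tools, ... }: {
    # "myhost" is a placeholder host name.
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ./configuration.nix
        {
          nixpkgs.overlays = [
            (final: prev: {
              # Attribute path as quoted in the comment above.
              bcachefs-tools = bcachefs-tools.packages.x86_64-linux.bcachefs;
            })
          ];
        }
      ];
    };
  };
}
```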

@codebam
Contributor Author

codebam commented May 6, 2024

The latest commit from master doesn't boot. I'll try v1.7.0. Edit: v1.7.0 doesn't boot either

@JohnRTitor
Contributor

JohnRTitor commented May 6, 2024

Could you share part of your configuration related to disks and encryption?
Just so we are clear, have you been using the latest rc kernel between these tests?

Pinging @koverstreet for his input.

EDIT: Kent, they are using encryption with compression in their configuration, could that be it?

@codebam
Contributor Author

codebam commented May 6, 2024

Yes I've been using rc6 for these tests. This is my entire configuration as of right now: https://github.com/codebam/nixos

@JohnRTitor
Contributor

JohnRTitor commented May 6, 2024

This doesn't seem to be a packaging problem, and I don't see anything weird per se in your NixOS configuration.

But I'll have to redirect you to their GitHub or r/bcachefs as Kent is pretty active over there on Reddit.

Keeping this issue open for the time being.

@chaosbiber

I can confirm this happening on my machine updating today. Sadly lacking time to debug further but gonna watch this and possible upstream issues for when I've got more time.

Updated the kernel from 6.8.8 to 6.8.9 on nixos-unstable, also with a bcachefs root fs with encryption enabled.

Possibly relevant configuration:

boot.kernelPackages = pkgs.linuxPackages_latest;

fileSystems."/" = {
  device = "UUID=...";
  fsType = "bcachefs";
  options = [ "compression=lz4" ];
};

@JohnRTitor
Contributor

JohnRTitor commented May 6, 2024

Note: The issue title should be updated to something like "booting with bcachefs compression on is broken".

EDIT: @codebam, please mention in the issue description that you are using both compression and encryption, and add a code snippet like the one above from your hardware configuration.

@JohnRTitor JohnRTitor added 2.status: wait-for-upstream Waiting for upstream fix (or their other action). 9.needs: upstream fix This PR needs upstream to change something labels May 6, 2024
@codebam codebam changed the title from "Boot failure on 05/05" to "booting with bcachefs compression on is broken" May 6, 2024
@JohnRTitor JohnRTitor added the 0.kind: regression Something that worked before working no longer label May 6, 2024
@pimeys
Contributor

pimeys commented May 9, 2024

Having the same issue here:

fileSystems."/" = {
  device = "UUID=5f910790-3f93-4e9e-baf4-13b69719dc6a";
  fsType = "bcachefs";
  options = [
    "compression=lz4"
    "fix_errors=yes"
    "nojournal_transaction_names"
    "relatime"
    "discard"
    "background_compression=lz4"
  ];
};

It looks like an issue with bcachefs-tools from version 1.6.4 onward. Would it make sense to either downgrade the nixpkgs version to 1.6.3 or mark the package broken?

@JohnRTitor
Contributor

JohnRTitor commented May 9, 2024

I am inclined to downgrade the version to 1.6.3 for the 24.05 release unless a fix is provided upstream within the NixOS 24.05 merge window (4th week of May).

That said, this only affects the small group of users who are using this experimental filesystem with encryption on.

Those who are facing this bug, please report it, with logs, in koverstreet/bcachefs-tools#261. This will help Kent fix it faster.

@reedriley
Contributor

reedriley commented May 9, 2024

@JohnRTitor: What kernel version is 24.05 expected to ship with? I ask because the 1.6.x series of bcachefs-tools should match bcachefs disk format 1.6 (kernel 6.8.x); whereas 1.7.x matches the disk format in 6.9.x.

I know there's been thought put into making the tools cross-compatible so everything should be fine with version mismatches? But at the same time, unless we're planning to take a 6.9 series kernel I'm not sure there's a pressing need to bump the tools to 1.7.x quite yet.

(Also - I'm hitting this issue without compression enabled; so this issue is probably incorrectly named...)

@JohnRTitor
Contributor

> @JohnRTitor: What kernel version is 24.05 expected to ship with?

koverstreet/bcachefs-tools#261 (comment)

TLDR: 6.9, I'm tracking this, and will downgrade at the end of the merge window for 24.05, if it comes to it.

@sakuumomo

Why wasn't this caught by bcachefsEncrypted?

bcachefsEncrypted = makeInstallerTest "bcachefs-encrypted" {
  extraInstallerConfig = {
    boot.supportedFilesystems = [ "bcachefs" ];
    # disable zfs so we can support latest kernel if needed
    imports = [ no-zfs-module ];
    environment.systemPackages = with pkgs; [ keyutils ];
  };
  extraConfig = ''
    boot.kernelParams = lib.mkAfter [ "console=tty0" ];
  '';
  enableOCR = true;
  postBootCommands = ''
    # Enter it wrong once
    target.wait_for_text("enter passphrase for ")
    target.send_chars("wrong\n")
    # Then enter it right.
    target.wait_for_text("enter passphrase for ")
    target.send_chars("password\n")
  '';
  createPartitions = ''
    installer.succeed(
      "flock /dev/vda parted --script /dev/vda -- mklabel msdos"
      + " mkpart primary ext2 1M 100MB"  # /boot
      + " mkpart primary linux-swap 100M 1024M"  # swap
      + " mkpart primary 1024M -1s",  # /
      "udevadm settle",
      "mkswap /dev/vda2 -L swap",
      "swapon -L swap",
      "echo password | mkfs.bcachefs -L root --encrypted /dev/vda3",
      "echo password | bcachefs unlock -k session /dev/vda3",
      "echo password | mount -t bcachefs /dev/vda3 /mnt",
      "mkfs.ext3 -L boot /dev/vda1",
      "mkdir -p /mnt/boot",
      "mount /dev/vda1 /mnt/boot",
    )
  '';
};

JohnRTitor added a commit to JohnRTitor/nixpkgs that referenced this issue May 22, 2024
Moved temporarily to unstable to fix NixOS#313350

Also vendor the updated patch for NixOS#309388
from koverstreet/bcachefs-tools#263
alyssais pushed a commit that referenced this issue May 26, 2024
Moved temporarily to unstable to fix #313350

Also vendor the updated patch for #309388
from koverstreet/bcachefs-tools#263
github-actions bot pushed a commit that referenced this issue May 26, 2024
Moved temporarily to unstable to fix #313350

Also vendor the updated patch for #309388
from koverstreet/bcachefs-tools#263

(cherry picked from commit 1037866)
Sobte pushed a commit to Sobte/nixpkgs that referenced this issue May 28, 2024
Moved temporarily to unstable to fix NixOS#313350

Also vendor the updated patch for NixOS#309388
from koverstreet/bcachefs-tools#263
@JohnRTitor JohnRTitor moved this to Done in Bcachefs Jun 23, 2024
Projects
Status: Done
6 participants