Identifying devices #801

Open
andrewbaxter opened this issue Dec 15, 2024 · 5 comments

andrewbaxter commented Dec 15, 2024

I'm testing with some loopback devices: 2 devices, replicas=2, metadata_replicas_required=2, data_replicas_required=2. (What does the last one do? There's no documentation for it in the manpage.) One device failed (I detached it with losetup -d), and now the array won't mount (probably #703).

This isn't about mounting specifically, though. My question is about finding which disk corresponds to which label, etc. Basically, assuming I could mount, how do I figure out which disk is missing and therefore which disk I need to remove?

I'm mounting via UUID because I don't expect device nodes to be stable, so after formatting I don't keep a list of device nodes for the mount.

  • bcachefs show-super lists devices, but not the device path (I guess it only looks at superblock info and doesn't scan devices). The only identifying information is the label (text label + device index?) and the UUID. AFAICT neither of these is externally visible, so the device UUID doesn't appear in /dev/disk/*.
  • When mounting, the identified disks are listed in dmesg by device node (/dev/loop2), but that doesn't seem useful for removal: it could provide a list of present devices, but the device node that bcachefs would need in order to remove the disk can't be determined (it may have been shadowed during this boot). Similarly, with just the device node I can't connect it to the labels/UUIDs in the show-super output, so I can't determine the missing labels by elimination either.
  • AFAICT there are no other commands for showing information about a disk that I could, e.g., run on all disks to build a mapping.

Offhand, is there a guide to the expected procedure for recovering from a failure? The procedure is unclear to me at this point; I see lots of SO posts and threads about recovery, but none with a clear resolution or answer.

andrewbaxter commented Dec 15, 2024

Ah! It looks like the device UUID is available in udev:

ID_FS_UUID_SUB=2ecde329-a682-433d-88a3-68e164347d07
ID_FS_UUID_SUB_ENC=2ecde329-a682-433d-88a3-68e164347d07

It's not linked on my system. Is it used in udev rules on common distros?
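
For anyone following along, here is a minimal sketch of pulling that property out of udev for a single device. It relies on `udevadm info --query=property` printing KEY=VALUE lines; the helper name `uuid_sub_from_props` is made up for illustration, and /dev/loop2 is just the example node mentioned earlier in this thread.

```shell
# Hypothetical helper: read udev-style KEY=VALUE lines on stdin and print
# the value of ID_FS_UUID_SUB, if present. The pattern requires "=" right
# after the key, so ID_FS_UUID_SUB_ENC is not matched.
uuid_sub_from_props() {
    sed -n 's/^ID_FS_UUID_SUB=//p'
}
```

Something like `udevadm info --query=property --name=/dev/loop2 | uuid_sub_from_props` should then print the device UUID that show-super reports for that member.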

andrewbaxter commented Dec 15, 2024

I see it referenced in 69-md-clustered-confirm-device.rules:

PROGRAM="/usr/bin/blkid -o device -t UUID_SUB=$env{DEVICE_UUID}", ENV{.md.newdevice} = "$result"

ENV{.md.newdevice}!="", RUN+="/usr/bin/mdadm --manage $env{DEVNAME} --cluster-confirm $env{RAID_DISK}:$env{.md.newdevice}"
ENV{.md.newdevice}=="", RUN+="/usr/bin/mdadm --manage $env{DEVNAME} --cluster-confirm $env{RAID_DISK}:missing"

Unfortunately that's pretty specific and doesn't link anything.

Edit: I tried

$ cat 61-disk-uuid-sub.rules 
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_SUB_ENC}=="?*", SYMLINK+="disk/by-uuid-sub/$env{ID_FS_UUID_SUB_ENC}"

and that seemed to work fine. Then you can correlate device nodes and show-super device output, identify the present nodes, and then remove the missing nodes.
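
A rough sketch of that correlation step, assuming the rule above is installed somewhere udev reads (for example /etc/udev/rules.d/, which is my assumption, not something stated here) and that `readlink -f` is available as in GNU coreutils. The helper name `list_uuid_sub_links` is hypothetical.

```shell
# Hypothetical helper: print "<device UUID> <resolved device node>" for every
# symlink in the directory populated by the rule above.
# Usage: list_uuid_sub_links /dev/disk/by-uuid-sub
list_uuid_sub_links() {
    link_dir=$1
    for link in "$link_dir"/*; do
        # Skip the unexpanded glob (empty directory) and non-symlinks.
        [ -L "$link" ] || continue
        printf '%s %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

After adding the rule, `udevadm control --reload` followed by `udevadm trigger --subsystem-match=block` should make udev re-evaluate the existing block devices. Each line printed by the helper can then be matched against the device UUIDs in the show-super output.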

@andrewbaxter

So I guess in summary:

  1. Is this the expected way of identifying drives?
  2. Could these be proposed as recommended udev rules?

@alexminder

You can find the mapping between device index and device name in sysfs:

# bcachefs show-super /dev/sda |grep '^Device:'
...
Device:                                    0
Device:                                    1
Device:                                    3

# ls -l /sys/fs/bcachefs/647f0af5*/dev-?/block
lrwxrwxrwx 1 root root 0 Dec 19 14:07 /sys/fs/bcachefs/647f0af5-81b2-4497-b829-382730d87b2c/dev-0/block -> ../../../../devices/pci0000:00/0000:00:11.0/ata2/host1/target1:0:0/1:0:0:0/block/sdc
lrwxrwxrwx 1 root root 0 Dec 19 14:07 /sys/fs/bcachefs/647f0af5-81b2-4497-b829-382730d87b2c/dev-1/block -> ../../../../devices/pci0000:00/0000:00:11.0/ata1/host0/target0:1:0/0:1:0:0/block/sda
lrwxrwxrwx 1 root root 0 Dec 19 14:07 /sys/fs/bcachefs/647f0af5-81b2-4497-b829-382730d87b2c/dev-3/block -> ../../../../devices/pci0000:00/0000:00:11.0/ata1/host0/target0:2:0/0:2:0:0/block/sdb
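
The same listing can be condensed into an index-to-name table. This is only a sketch built on the sysfs layout shown above; the function name `bcachefs_dev_map` is made up, and it uses `dev-*` rather than `dev-?` so that multi-digit indices would also be picked up.

```shell
# Hypothetical helper: print "dev-N <block device name>" for each member
# device that currently has a block link under the given sysfs directory.
# Usage: bcachefs_dev_map /sys/fs/bcachefs/<filesystem UUID>
bcachefs_dev_map() {
    fs_dir=$1
    for link in "$fs_dir"/dev-*/block; do
        # Skip the unexpanded glob (no devices) and anything that is not a symlink.
        [ -L "$link" ] || continue
        dev_dir=${link%/block}
        # The last path component of the link target is the kernel device name.
        printf '%s %s\n' "${dev_dir##*/}" "$(basename "$(readlink "$link")")"
    done
}
```

Called on the filesystem directory from the listing above, it would print one line per present device, pairing each dev-N entry with the name at the end of its block link.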

@andrewbaxter

Ah cool, and I guess the missing devices wouldn't have entries in sysfs? That's easier than setting up udev rules.
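
If that guess holds (nothing in this thread confirms it), finding the missing devices by elimination could look like the sketch below. It treats an index as missing when there is no `dev-N/block` link, whether or not the `dev-N` directory itself exists. The function name `bcachefs_missing_devs` is hypothetical.

```shell
# Hypothetical helper: given a sysfs filesystem directory and the device
# indices reported by show-super, print the indices that have no block link.
# ASSUMPTION (unconfirmed): a missing device has no dev-N/block symlink.
# Usage: bcachefs_missing_devs /sys/fs/bcachefs/<filesystem UUID> 0 1 3
bcachefs_missing_devs() {
    fs_dir=$1
    shift
    for idx in "$@"; do
        [ -L "$fs_dir/dev-$idx/block" ] || printf '%s\n' "$idx"
    done
}
```

The indices to pass in would come from the `Device:` lines of `bcachefs show-super`, as in the grep shown earlier.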
