Try to improve ceph pool creation reliability #12250

Merged
merged 6 commits into from
Sep 14, 2023

Member

@simondeziel simondeziel commented Sep 13, 2023

No description provided.

@simondeziel simondeziel force-pushed the ceph-pool branch 3 times, most recently from 99fb30f to 2493696 Compare September 13, 2023 20:32
This is an attempt to improve the reliability of the test
suite that often times out during the ceph pool setup:

> 2023-09-12T17:14:09.3105279Z + timeout --foreground 120 /home/runner/go/bin/lxc storage create lxdtest-gOl ceph volume.size=25MiB ceph.osd.pg_num=16 --verbose
> 2023-09-12T17:16:09.3132429Z + cleanup

Signed-off-by: Simon Deziel <[email protected]>
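
The two timestamps in the log above are exactly 120 seconds apart, which matches the `timeout --foreground 120` wrapper giving up on `lxc storage create`. As a minimal sketch (not taken from the test suite), GNU `timeout` exits with status 124 in that case, so a wrapper can tell a hang apart from an ordinary failure. The `run_with_timeout` helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command under GNU timeout and report
# when it was stopped for exceeding the time limit.
run_with_timeout() {
    limit="$1"
    shift
    rc=0
    timeout --foreground "${limit}" "$@" || rc=$?
    if [ "${rc}" -eq 124 ]; then
        # GNU timeout exits with 124 when the time limit is reached.
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "${rc}"
}
```

Called as `run_with_timeout 120 lxc storage create ...`, this would return 124 for the hang shown above and pass through any other exit status unchanged.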
`mdl` depends on `core18` and `microceph` on `core22`

Signed-off-by: Simon Deziel <[email protected]>
This avoids saving the 404 XML error when the commit hash is wrong or not available:
> $ cat gotip.tar.gz
> <?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>
> The specified key does not exist.</Message><Details>No such object:
> go-build-snap/go/linux-amd64/<commit-hash>.tar.gz</Details></Error>

As that would later on be fed to `tar`/`gzip`:

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Signed-off-by: Simon Deziel <[email protected]>
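
The commit's diff is not shown on this page, so the following is only a sketch of one way to guard against the failure above: check that the downloaded file really is gzip data before handing it to `tar`. The `is_gzip` and `extract_if_valid` helpers are hypothetical. `gzip -t` tests a file's integrity and exits non-zero for non-gzip input.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file is valid gzip data.
is_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Hypothetical helper: extract archive $1 into directory $2, but
# refuse anything that is not a real gzip file (e.g. a saved XML error).
extract_if_valid() {
    if is_gzip "$1"; then
        tar -xzf "$1" -C "$2"
    else
        echo "refusing to extract $1: not in gzip format" >&2
        return 1
    fi
}
```

When the download is done with curl, its `--fail` flag is another common guard: it makes curl exit non-zero on HTTP errors such as 404 instead of writing the error body to the output.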
@simondeziel simondeziel marked this pull request as ready for review September 14, 2023 00:08
sudo microceph disk add --wipe /dev/sdia
ephemeral_disk="$(findmnt --noheadings --output SOURCE --target /mnt | sed 's/[0-9]\+$//')"
sudo umount /mnt
sudo microceph disk add --wipe "${ephemeral_disk}"
Member

@tomponline tomponline Sep 14, 2023

@simondeziel oh, did the microceph folks fix the apparmor issue that was preventing the use of partitions?

Sometimes (not always) the GitHub runner's ephemeral disk isn't a whole disk but a partition.

Member Author

@tomponline yeah, I too noticed the ephemeral disk was not always set up the same way. Sometimes /mnt was an NTFS partition, other times it was ext4. Sometimes it would be /dev/sdb1, other times /dev/sda1. I've seen NTFS partitions at least twice, and NTFS is likely not a good backing for a loop device. See /dev/sdb1 in blkid's output:

# From https://github.com/canonical/lxd/actions/runs/6177130891/job/16767605386
# but https://github.com/canonical/lxd/actions/runs/6177049637/job/16767324493 also was NTFS
...
+ swapon -s
Filename				Type		Size		Used		Priority
/mnt/swapfile                           file		4194300		0		-2
+ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
loop0     7:0    0 63.4M  1 loop /snap/core20/1974
loop1     7:1    0 73.9M  1 loop /snap/core22/864
loop2     7:2    0 53.3M  1 loop /snap/snapd/19457
loop3     7:3    0 90.2M  1 loop /snap/microceph/585
sda       8:0    0   86G  0 disk 
├─sda1    8:1    0 85.9G  0 part /
├─sda14   8:14   0    4M  0 part 
+ blkid
+ sudo swapoff /mnt/swapfile
└─sda15   8:15   0  106M  0 part /boot/efi
sdb       8:16   0   14G  0 disk 
└─sdb1    8:17   0   14G  0 part /mnt
/dev/sdb1: LABEL="Temporary Storage" BLOCK_SIZE="512" UUID="24E0FC11E0FBE748" TYPE="ntfs" PARTUUID="f05c5304-01"
/dev/sda15: LABEL_FATBOOT="UEFI" LABEL="UEFI" UUID="D659-5A9F" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="59204cef-8bed-4e09-876e-f620523d8884"
/dev/sda1: LABEL="cloudimg-rootfs" UUID="d4004e88-d528-486d-b994-90c70f9322d4" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="886ad3da-e4ec-442e-b223-03e615c2b78d"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/loop0: TYPE="squashfs"
++ findmnt --noheadings --output SOURCE --target /mnt
++ sed 's/[0-9]\+$//'
+ ephemeral_disk=/dev/sdb
+ sudo umount /mnt
+ sudo microceph disk add --wipe /dev/sdb

As for partition vs whole disk, I didn't think of AppArmor, but it could well be the cause. My only thought about AppArmor was the immediate benefit of no longer needing to masquerade a loop device as sdia :) That is moot now that we use /dev/sda or /dev/sdb.
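
To illustrate what the `sed` expression in the trace above does, here is a minimal sketch. The `strip_partition_suffix` function name is hypothetical; the expression itself is the one from the diff. It only strips trailing digits, which suits `sdX`-style names. A name such as `/dev/nvme0n1p1` would be reduced to `/dev/nvme0n1p`, which is not the parent disk.

```shell
#!/bin/sh
# Hypothetical helper wrapping the sed expression from the diff:
# drop the trailing partition number from a device path.
# Note: \+ is a GNU sed extension to basic regular expressions.
strip_partition_suffix() {
    printf '%s\n' "$1" | sed 's/[0-9]\+$//'
}

strip_partition_suffix /dev/sdb1    # prints /dev/sdb
strip_partition_suffix /dev/sda15   # prints /dev/sda
```

One alternative that avoids parsing the path is `lsblk --noheadings --output PKNAME <partition>`, which prints the parent kernel device name.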

Member

Let's see if it works. Previously, microceph's apparmor profile prevented using partitions as disks, but hopefully they fixed that.

Member Author

I don't think they fixed it:

+ sudo microceph disk add --wipe /dev/sda1
Error: Failed adding new disk: Failed to wipe the device: Failed to run: dd if=/dev/zero of=/dev/disk/by-id/wwn-0x600224800af2a3230d891ce3c7debbb5-part1 bs=4M count=10 status=none: exit status 1 (dd: failed to open '/dev/disk/by-id/wwn-0x600224800af2a3230d891ce3c7debbb5-part1': Permission denied)

@tomponline tomponline merged commit 288dac4 into canonical:main Sep 14, 2023
25 checks passed
@simondeziel simondeziel deleted the ceph-pool branch September 14, 2023 12:48