Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate manifests using tooling and configs from osbuild/images #4183

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

achilleas-k
Copy link
Member

This PR:

  1. Adds a new script, ./tools/gen-manifests that clones osbuild/images (at the version specified in go.mod) and runs ./cmd/gen-manifests from there.
  2. Updates the test README to describe the new script.
  3. Removes ./cmd/gen-manifests and the configs in ./tools/test-case-generators.
  4. Updates the existing manifests to the corresponding ones generated in osbuild/images. Some configs are different, but the image type and customizations coverage should be the same.

@achilleas-k achilleas-k requested review from atodorov and thozza June 3, 2024 13:57
@mvo5 mvo5 self-requested a review June 3, 2024 14:00
@achilleas-k achilleas-k force-pushed the images/gen-manifests branch 2 times, most recently from c230e49 to ef8b162 Compare June 3, 2024 14:02
Copy link
Contributor

@mvo5 mvo5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, one small inline comment (that has a hard time getting rendered so I will repeat here). Maybe just

go run -tags=exclude_graphdriver_btrfs  github.com/osbuild/images/cmd/gen-manifests@latest

in tools/gen-manifests ? This would avoid the explicit tmpdir handling (or is there another reason for the clone)?

tools/gen-manifests Outdated Show resolved Hide resolved
thozza
thozza previously approved these changes Jun 3, 2024
Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

bcl
bcl previously approved these changes Jun 3, 2024
Copy link
Contributor

@bcl bcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Copy link
Contributor

@atodorov atodorov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image Tests are failing and they aren't failing on the latest nightly pipeline so it looks like the failure is related to this PR.

@achilleas-k achilleas-k force-pushed the images/gen-manifests branch from ef8b162 to 4e7c179 Compare June 4, 2024 12:53
@achilleas-k
Copy link
Member Author

Did a rebase to drop those CS8 unit tests and I'll have a closer look in a bit.

@achilleas-k
Copy link
Member Author

achilleas-k commented Jun 5, 2024

Nightly image tests are failing with:

stdout:
{
  "type": "https://osbuild.org/validation-error",
  "title": "JSON Schema validation failed",
  "success": false,
  "errors": [
    {
      "message": "[{'key': 'NM_CLOUD_SETUP_AZURE', 'value': 'yes'}] is not of type 'string'",
      "path": [
        "pipelines",
        1,
        "stages",
        14,
        "options",
        "config",
        "Service",
        "Environment"
      ]
    }
  ]
}
stderr:
    main_test.go:584: 
        	Error Trace:	/builddir/build/BUILD/osbuild-composer-4e7c179f8d72ce1b26414d0e912b88654ca1f7b3/_build/src/github.com/osbuild/osbuild-composer/cmd/osbuild-image-tests/main_test.go:584
        	            				/builddir/build/BUILD/osbuild-composer-4e7c179f8d72ce1b26414d0e912b88654ca1f7b3/_build/src/github.com/osbuild/osbuild-composer/cmd/osbuild-image-tests/main_test.go:639
        	Error:      	Received unexpected error:
        	            	running osbuild failed: exit status 2
        	Test:       	TestImages/rhel_9.4-aarch64-azure_rhui-empty.json
--- FAIL: TestImages (0.60s)
    --- FAIL: TestImages/rhel_9.4-aarch64-azure_rhui-empty.json (0.60s)
FAIL

This new format for the Environment option was introduced in osbuild v115 (osbuild/osbuild#1684), and RHEL 9.4 has osbuild v110 so the nightly tests wont work with newer manifests.

Perhaps we need to be smarter about this and generate manifests for the nightly pipelines with the version of the gen-manifests tool that matches the version of osbuild-composer that we're testing, otherwise we'll be running into this issue all the time.

It's a bit strange that we're running test data from main (or branch HEADs) against test code from older versions, isn't it? That doesn't seem like a stable test setup and I'm not sure if what it's verifying (that new test data works with osbuild-composer from RHEL 9.4) is useful.

@atodorov
Copy link
Contributor

atodorov commented Jun 5, 2024

@achilleas-k Question:

rename from test/data/manifests/rhel_9.4-aarch64-azure_rhui-boot.json
rename to test/data/manifests/rhel_9.4-aarch64-azure_rhui-empty.json

So files are renamed from their -boot suffix to -empty. What does that mean and why is it different, why is it necessary ? Also does it mean that osbuild-image-tests will not attempt to boot images created from these manifests ?

@atodorov
Copy link
Contributor

atodorov commented Jun 5, 2024

Perhaps we need to be smarter about this and generate manifests for the nightly pipelines with the version of the gen-manifests tool that matches the version of osbuild-composer that we're testing, otherwise we'll be running into this issue all the time.

So basically vendor-in the version of gen-manifests, which is what we're doing RN. It is hard to know which particular version to use b/c downstream is a moving target and changes over the course of a release.

I think so far the way we've dealth with that fact is because manifests are updated very rarely, usually once per minor release. Or maybe because versions have been fairly compatible till recently.

For the purposes of getting this PR out of the door you can use a gen-manifests version that matches 9.4 (or 9.5) but we'll probably have to merge with failing CI jobs until the rest of the 9.5 work is ready. (or alternatively regenerate the manifesta once more for 9.5). Up to you IMO.

It's a bit strange that we're running test data from main (or branch HEADs) against test code from older versions, isn't it? That doesn't seem like a stable test setup and I'm not sure if what it's verifying (that new test data works with osbuild-composer from RHEL 9.4) is useful.

Maybe "a bit" but the setup has been rather stable over time. This question comes up every time when a similar situation occurs so to give you an answer why it is like that:

  1. The entire test suite uses version conditions to skip different steps which aren't relevant to older versions. Whenever the downstream version catches up what was being skipped before will automatically become exercised in nightly CI tests. Whenever such "catch-up" occurs downstream into RHEL we expect no surprises and as-long as the conditionals are done properly there would be none!

  2. To avoid the situation where upstream changes such as this one introduce a downstream incompatibility we execute the 9.4 nightly pipeline on pull requests too which caught the current issue.

  3. Doing it this way allows us to patch & improve the test suite whenever necessary without waiting for approvals on downstream tickets and without having to work with downstream branches. Makes the overall experience from the QE POV easier with less friction.

  4. Allows us to have 99% identical test suite b/w upstream and downstream (with the exception of version conditions) for most of the time. Image tests is an outlier here b/c it's been disabled from the upstream pipeline due to the existing duplications and also b/c it is more relevant as a downstream test job.

@achilleas-k
Copy link
Member Author

@achilleas-k Question:

rename from test/data/manifests/rhel_9.4-aarch64-azure_rhui-boot.json rename to test/data/manifests/rhel_9.4-aarch64-azure_rhui-empty.json

So files are renamed from their -boot suffix to -empty. What does that mean and why is it different, why is it necessary?

The manifest generation in osbuild/images adds the config name to the filename and drops the -boot suffix. Before, we had config names that were usually the image type name with some extra words to differentiate if there were two for an image type, like fedora_39-x86_64-qcow2-boot.json and fedora_39-x86_64-qcow2_customize-boot.json. The new gen-manifests instead adds a configuration name to each filename so it's always {distro}-{arch}-{image_type}-{config_name}.json. empty is just an empty blueprint so it's just the base image without customizations.

Also does it mean that osbuild-image-tests will not attempt to boot images created from these manifests ?

The boot suffix has, for a long time, been added to every manifest filename. I think the purpose of the suffix got lost a long time ago. Seems like as far back as Sep. 2021 (008dfcc), the only value for test_type has been boot (that commit removes the call to generate_test_case() that specifiescustomize instead of boot), so it's basically meaningless now.

@achilleas-k
Copy link
Member Author

achilleas-k commented Jun 5, 2024

For the purposes of getting this PR out of the door you can use a gen-manifests version that matches 9.4 (or 9.5) but we'll probably have to merge with failing CI jobs until the rest of the 9.5 work is ready. (or alternatively regenerate the manifesta once more for 9.5). Up to you IMO.

I see a few options here:

  1. Generate manifests specifically for the nightly pipeline. Checkout osbuild/images for the version that's compatible with osbuild-composer in RHEL 9.4 and put manifests in test/data/manifests/nightly (or replace the existing ones?).
  2. Generate manifests in the pipeline. In other words, we can generate the manifests dynamically every time the pipeline is run and (like with option 1) use a version of osbuild/images that matches the version of osbuild-composer that we're testing.
  3. Checkout the manifests from the osbuild-composer repo at the version that matches the osbuild-composer version in the nightly pipeline (for nightly only, so we still have the main tests with new pipelines).

1 has the benefit that we can clearly differentiate manifests for the main and the nightly pipelines, but it adds a tonne of large files to the repo.
2 is a nice cleanup I think. It adds a bit of work to the pipeline (generating the manifests can take a few mins) but it removes the chore of updating manifests from PR authors.
3 saves us from adding all the extra bits in 1.

@bcl bcl self-requested a review June 5, 2024 18:30
@thozza
Copy link
Member

thozza commented Jun 6, 2024

To be honest, testing on 9.4 nightly has IMHO zero value. 9.4 is already GA and new upstream releases are not landing in 9.4 any more. There is very little value in making main tests and manifests to run on 9.4.

@achilleas-k
Copy link
Member Author

To be honest, testing on 9.4 nightly has IMHO zero value. 9.4 is already GA and new upstream releases are not landing in 9.4 any more. There is very little value in making main tests and manifests to run on 9.4.

Yeah, I was also thinking this but assumed someone thought there was some value and I was missing something.

@atodorov
Copy link
Contributor

atodorov commented Jun 7, 2024

  1. Generate manifests specifically for the nightly pipeline. Checkout osbuild/images for the version that's compatible with osbuild-composer in RHEL 9.4 and put manifests in test/data/manifests/nightly (or replace the existing ones?).

That should be 9.5 nightly after a rebase so it should be easier IMO.

Copy link

github-actions bot commented Jul 8, 2024

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 8, 2024
@achilleas-k achilleas-k removed the Stale label Jul 8, 2024
Copy link

github-actions bot commented Aug 8, 2024

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 8, 2024
@atodorov
Copy link
Contributor

atodorov commented Aug 8, 2024

This PR needs a rebase and there are lots of conflicts to be resolved.

@thozza thozza removed the Stale label Aug 8, 2024
A new script to replace ./cmd/gen-manifests.  The script queries the go
dependencies to get the version of osbuild/images that osbuild-composer
currently depends on, clones the osbuild/images repository at that
version, and uses the ./cmd/gen-manifests tool from there to generate
new manifests.

This is meant to replace ./cmd/gen-manifests so that we only have one
command and set of configurations for generating manifests, to avoid
drift between the two implementations.
The manifests generated by osbuild/images use the property name
'build-request' instead of 'compose-request'.
Support arguments for filtering and pass them straight through to
gen-manifests (multiple values and globs are supported).

`set -x` before running the gen-manifests tool to print the full
command.
This is required for building the binary on RHEL.  It has no effect on
manifest generation, so we can use it unconditionally.
Update the test README with information about the new script.
Remove the format-request-map and repos that were used for the old
manifest generation.  The new manifest generator uses the configs and
repos defined in osbuild/images.
Remove the gen-manifests go cmd since we now only rely on the one in
osbuild/images.
Updated only manifests for image types that were already in the
repository.

This commit also adds Fedora 40 manifests.

Generated with:

    rm ./test/data/manifests/fedora*
    ./tools/gen-manifests 'fedora-*' 'x86_64,aarch64' 'iot-commit,iot-container,iot-installer,iot-raw-image,container,live-installer,minimal-raw,oci,openstack,qcow2,vmdk'
Updated only manifests for image types that were already in the
repository.

This commit also adds CentOS 10 manifests.

Generated with:

    rm ./test/data/manifests/centos*
    ./tools/gen-manifests 'centos-*' '*' 'minimal_raw,openstack,qcow2,tar,oci,vmdk'
Updated only manifests for image types that were already in the
repository.

The commit also adds RHEL 10 manifests and removes manifests for distros
without minor version, rhel-7, rhel-8, and rhel-9.

Generated with:

    rm ./test/data/manifests/rhel*
    ./tools/gen-manifests 'rhel-*' '*' 'azure-rhui,gce-rhui,minimal-raw,openstack,qcow2,tar,vmdk'
@achilleas-k achilleas-k dismissed stale reviews from thozza and bcl via 5eaffcb August 8, 2024 11:39
@achilleas-k achilleas-k force-pushed the images/gen-manifests branch from 4e7c179 to 5eaffcb Compare August 8, 2024 11:39
@achilleas-k
Copy link
Member Author

Rebased and resolved conflicts. I want to keep this open but I'm switching it to draft. We should get back to it when we have more time (after ITM 26) to sort out the connection with manifest-db as well.

@achilleas-k achilleas-k marked this pull request as draft August 8, 2024 11:39
Copy link

github-actions bot commented Sep 8, 2024

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 8, 2024
@achilleas-k achilleas-k removed the Stale label Sep 9, 2024
Copy link

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

Copy link

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 10, 2024
Copy link

This PR was closed because it has been stalled for 30+7 days with no activity.

@github-actions github-actions bot closed this Nov 17, 2024
@thozza thozza removed the Stale label Nov 18, 2024
@thozza thozza reopened this Nov 18, 2024
Copy link

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants