✨ Add Support for FailureDomains #793

Open: sf1tzp wants to merge 4 commits into main from failure-domains


@sf1tzp sf1tzp commented Nov 30, 2022

This change adds FailureDomain fields to M3Cluster and M3Machine types, and adds logic to the M3Machine manager code for selecting a host based on a specified FailureDomain.

Related-To: #402
Signed-off-by: Steven Fitzpatrick [email protected]
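
For readers unfamiliar with the idea, host selection by failure domain could look roughly like the sketch below. This is not the code from the PR: the `host` type, the `failureDomainLabel` key, and the `chooseHost` function are all hypothetical names chosen for illustration, and it assumes a failure domain is expressed as a label on each host. The one detail borrowed from Cluster API is that the requested failure domain is an optional string, modelled here as a pointer.

```go
package main

import (
	"errors"
	"fmt"
)

// failureDomainLabel is a hypothetical label key used only in this sketch;
// the PR may identify failure domains differently.
const failureDomainLabel = "example.com/failure-domain"

// host is a simplified, hypothetical stand-in for a bare-metal host object.
type host struct {
	Name     string
	Labels   map[string]string
	Consumed bool // true when another machine already owns the host
}

var errNoHost = errors.New("no available host in requested failure domain")

// chooseHost returns the first unconsumed host whose failure-domain label
// matches the requested domain. A nil domain means "no constraint".
func chooseHost(hosts []host, failureDomain *string) (*host, error) {
	for i := range hosts {
		h := &hosts[i]
		if h.Consumed {
			continue
		}
		if failureDomain != nil && h.Labels[failureDomainLabel] != *failureDomain {
			continue
		}
		return h, nil
	}
	return nil, errNoHost
}

func main() {
	hosts := []host{
		{Name: "node-0", Labels: map[string]string{failureDomainLabel: "rack-a"}, Consumed: true},
		{Name: "node-1", Labels: map[string]string{failureDomainLabel: "rack-b"}},
		{Name: "node-2", Labels: map[string]string{failureDomainLabel: "rack-a"}},
	}
	domain := "rack-a"
	h, err := chooseHost(hosts, &domain)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.Name) // node-0 is consumed and node-1 is in rack-b, so node-2 is chosen
}
```

Treating a nil failure domain as "no constraint" keeps machines that do not set the field on the existing selection path, which is one plausible way to keep such a change backward compatible.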

@metal3-io-bot metal3-io-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 30, 2022
@metal3-io-bot
Contributor

Hi @f1tzpatrick. Thanks for your PR.

I'm waiting for a metal3-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@metal3-io-bot metal3-io-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 30, 2022
@metal3-io-bot metal3-io-bot added the needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. label Mar 22, 2023
@sf1tzp sf1tzp force-pushed the failure-domains branch from 7459d15 to 2d8c571 Compare April 3, 2023 18:43
@metal3-io-bot metal3-io-bot removed the needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2023
@metal3-io-bot
Contributor

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 2, 2023
@sf1tzp sf1tzp force-pushed the failure-domains branch from 2d8c571 to aec3ed4 Compare July 12, 2023 04:44
@metal3-io-bot
Contributor

Stale issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle stale.

/close

@metal3-io-bot
Contributor

@metal3-io-bot: Closed this PR.


@metal3-io-bot
Contributor

Can one of the admins verify this patch?

@Rozzii
Member

Rozzii commented Sep 20, 2023

/remove-lifecycle stale

@metal3-io-bot metal3-io-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 20, 2023
@Rozzii Rozzii reopened this Sep 20, 2023
@Rozzii
Member

Rozzii commented Sep 20, 2023

I am very sorry, folks. This somehow fell through the cracks and I forgot to review it. I am not an expert on the topic, so @lentzi90 @kashifest please help us.
/ok-to-test

@metal3-io-bot metal3-io-bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 20, 2023
@sf1tzp sf1tzp marked this pull request as ready for review September 27, 2023 00:19
@sf1tzp
Author

sf1tzp commented Sep 27, 2023

/retest

@metal3-io-bot metal3-io-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 27, 2023
@sf1tzp sf1tzp force-pushed the failure-domains branch 2 times, most recently from 1760526 to 7c21ccd Compare September 27, 2023 00:25
@metal3-io-bot metal3-io-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 27, 2023
@sf1tzp
Author

sf1tzp commented Sep 27, 2023

Sorry, I'll get these passing locally before I push again ^^

@lentzi90 do you know if the unit test in the github action is the same as make unit from the repo? Locally, I keep seeing failures from the Metal3Data reconciler (even on the main branch)

Summarizing 9 Failures:
  [FAIL] Metal3DataTemplate manager Test Reconcile [It] Deletion, Cluster not found
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3datatemplate_controller.go:153
  [FAIL] Metal3DataTemplate manager Test Reconcile [It] Deletion, Cluster not found, error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3datatemplate_controller.go:153
  [FAIL] Metal3DataTemplate manager Test Reconcile [It] Reconcile normal error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3datatemplate_controller.go:143
  [FAIL] Metal3DataTemplate manager Test Reconcile [It] Reconcile normal no error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3datatemplate_controller.go:143
  [FAIL] Metal3Data manager Test Data Reconcile functions Test Reconcile [It] Deletion, Cluster not found
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3data_controller.go:137
  [FAIL] Metal3Data manager Test Data Reconcile functions Test Reconcile [It] Deletion, release requeue
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3data_controller.go:137
  [FAIL] Metal3Data manager Test Data Reconcile functions Test Reconcile [It] Deletion, release error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3data_controller.go:137
  [FAIL] Metal3Data manager Test Data Reconcile functions Test Reconcile [It] Reconcile normal error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3data_controller.go:127
  [FAIL] Metal3Data manager Test Data Reconcile functions Test Reconcile [It] Reconcile normal no error
  /Users/steven/oss/cluster-api-provider-metal3/controllers/metal3data_controller.go:127

Even with make unit-verbose I'm not seeing the fuzz testing on my machine.

Any thoughts?

@lentzi90
Member

@lentzi90 do you know if the unit test in the github action is the same as make unit from the repo? Locally, I keep seeing failures from the Metal3Data reconciler (even on the main branch)

Odd! The unit test is running hack/unit.sh which should be the same as make unit-cover-verbose. Could it be just that it stops before it gets to the fuzzing because of other errors?

@sf1tzp
Author

sf1tzp commented Oct 2, 2023

@lentzi90 sorry to keep you waiting, I want to try this again on my linux box at home. I'll try to get to it this evening!

@lentzi90
Member

lentzi90 commented Oct 3, 2023

@lentzi90 sorry to keep you waiting, I want to try this again on my linux box at home. I'll try to get to it this evening!

No worries! It should work pretty well also in GitHub Codespaces if you want to try it that way 🙂

@metal3-io-bot metal3-io-bot added the needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. label Oct 5, 2023
@metal3-io-bot
Contributor

PR needs rebase.


@metal3-io-bot metal3-io-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 29, 2024
@metal3-io-bot
Contributor

@metal3-io-bot: Closed this PR.

@Rozzii Rozzii reopened this Apr 5, 2024
@metal3-io-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mboukhalfa for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Rozzii
Member

Rozzii commented Apr 11, 2024

/remove-lifecycle stale
/lifecycle frozen
IMO we still need this improvement but this issue looks to be stuck at the moment

@metal3-io-bot metal3-io-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 11, 2024
@metal3-io-bot
Contributor

@Rozzii: The lifecycle/frozen label cannot be applied to Pull Requests.

@Rozzii Rozzii added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Apr 11, 2024
@metal3-io-bot
Contributor

@sf1tzp: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name      Commit    Details   Required   Rerun command
gofmt          cd0cb80   link      true       /test gofmt
markdownlint   cd0cb80   link      true       /test markdownlint
build          cd0cb80   link      true       /test build
manifestlint   cd0cb80   link      true       /test manifestlint
gomod          cd0cb80   link      true       /test gomod
unit           cd0cb80   link      true       /test unit
generate       cd0cb80   link      true       /test generate

Full PR test history. Your PR dashboard.


@Rozzii Rozzii modified the milestone: Other Jun 28, 2024
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-rebase Indicates that a PR cannot be merged because it has merge conflicts with HEAD. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: CAPM3 on hold / blocked
4 participants