SPLAT-1733: Update vSphere and include host/vm group #1677
base: master
Conversation
Skipping CI for Draft Pull Request.
@jcpowermac: This pull request references SPLAT-1733 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.18.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@jcpowermac: This pull request references SPLAT-1733 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed from d61a30f to 89c3343 (compare).
/assign @rvanderp3 @vr4manta
/cc @gnufied @JoelSpeed
// VSpherePlatformFailureDomainSpec holds the region and zone failure domain and the vCenter topology of that failure domain.
// +kubebuilder:validation:XValidation:rule="has(self.zoneType) && self.zoneType == 'HostGroup' ? self.topology.affinity.vmGroup != '' && self.topology.affinity.hostGroup != '' && self.topology.affinity.vmHostRule != '' : true",message="when zoneType is HostGroup, failuredomain topology affinity vmGroup, hostGroup and vmHostRule fields must be defined"
Topology does not appear in the spec here; I think it may have been meant to be there though?
This validation appears to want an optional struct with required fields, keyed off the zoneType. Is this just a discriminated union? It would then follow the standard that hostGroup is allowed if and only if type == HostGroup:
region:
  type: ABC
zone:
  type: HostGroup
  hostGroup:
    vmGroup: ...
    hostGroup: ...
    vmHostRule: ...
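If that is the intent, a minimal sketch of how the discriminated union could be expressed in the Go API might look like the following; the type and field names here are assumptions for illustration, not a proposal for the final API:

// VSphereFailureDomainZoneType is the discriminator for the zone union (name assumed for illustration).
type VSphereFailureDomainZoneType string

const (
	// ComputeClusterZoneType indicates the zone maps to an entire vSphere compute cluster.
	ComputeClusterZoneType VSphereFailureDomainZoneType = "ComputeCluster"
	// HostGroupZoneType indicates the zone maps to a vm-host group within a compute cluster.
	HostGroupZoneType VSphereFailureDomainZoneType = "HostGroup"
)

// VSphereFailureDomainZone is a discriminated union: hostGroup is allowed if and only if type is HostGroup.
// +kubebuilder:validation:XValidation:rule="has(self.hostGroup) == (self.type == 'HostGroup')",message="hostGroup is required when type is HostGroup, and forbidden otherwise"
// +union
type VSphereFailureDomainZone struct {
	// type determines the kind of zone; when it is HostGroup, the hostGroup member must be set.
	// +unionDiscriminator
	// +kubebuilder:validation:Required
	Type VSphereFailureDomainZoneType `json:"type"`

	// hostGroup holds the vm-host group configuration for HostGroup zones.
	// +unionMember
	// +optional
	HostGroup *VSphereFailureDomainHostGroup `json:"hostGroup,omitempty"`
}

// VSphereFailureDomainHostGroup groups the names used to build the vm-host affinity for this zone.
type VSphereFailureDomainHostGroup struct {
	// vmGroup is the name of the vm-host group of type virtual machine.
	// +kubebuilder:validation:Required
	VMGroup string `json:"vmGroup"`

	// hostGroup is the name of the vm-host group of type host.
	// +kubebuilder:validation:Required
	HostGroup string `json:"hostGroup"`

	// vmHostRule is the name of the vm-host affinity rule.
	// +kubebuilder:validation:Required
	VMHostRule string `json:"vmHostRule"`
}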
I broke out what is current (as of 9/9/24) and what is specifically changing for vm-host zonal. Rather than duplicating everything, these are snippets of what would change. Would you rather have that as a single section?
Personally, it makes sense to put new changes into the existing structs rather than breaking them out. That way we can still see what changed via the diff, but we can also see the full datatype when viewing the complete file.
@JoelSpeed does this look like the direction you are thinking? 677ce07
Or something like this: fdb3dde
Since there is a name collision with region and zone, I'm wondering if it should be moved to the topology struct.
The latter of the two commits is the direction I was thinking of going.
Whew, that's what I ended up with.
The enhancement has been updated for that.
// This configuration within vCenter creates the required association between a failure domain, virtual machines
// and ESXi hosts to create a vm-host based zone.
type VSphereFailureDomainAffinity struct {
	// VMGroupName is the name of the vm-host group of type virtual machine within vCenter for this failure domain.
Should be vmGroup at the beginning of this godoc.
// Host defines host VMs to generate as part of the installation.
type Host struct {
	// FailureDomain refers to the name of a FailureDomain as described in https://github.com/openshift/enhancements/blob/master/enhancements/installer/vsphere-ipi-zonal.md
Probably want to explain this in more depth in situ rather than referring out to a doc
@JoelSpeed first time updating an enhancement. This is what is currently in the installer's platform spec, so are you expecting this to be updated? I suppose that should really be a bug, right?
Didn't realise this was the current state; we should probably look to update these where they are non-breaking. Can you create a bug?
// NetworkDeviceSpec to be applied to the host
// +kubebuilder:validation:Required
NetworkDevice *NetworkDeviceSpec `json:"networkDevice"`
// Role defines the role of the node
Please expand the godoc here, what do each of these values mean and what effect does it have when I set them?
Same as above, this is what is current.
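For illustration, an expanded godoc could spell out the values and their effect. The enum values below are assumed to mirror the installer's existing bootstrap/control-plane/compute host roles and would need to be checked against the actual installer types:

// role defines the role of the host that this network configuration applies to.
// Assumed values for this sketch:
//   - "bootstrap": applied to the temporary bootstrap machine used during installation.
//   - "control-plane": applied to one of the control plane machines.
//   - "compute": applied to one of the compute (worker) machines.
// +kubebuilder:validation:Enum=bootstrap;control-plane;compute
// +kubebuilder:validation:Required
Role string `json:"role"`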
// +kubebuilder:validation:Format=ipv4
// +kubebuilder:validation:Format=ipv6
Suggested change: replace the two format markers with a single CEL validation:
// +kubebuilder:validation:XValidation:rule="isIP(self)",message="gateway must be a valid IPv4 or IPv6 address"
// +kubebuilder:example=`192.168.1.100/24`
// +kubebuilder:example=`2001:DB8:0000:0000:244:17FF:FEB6:D37D/64`
// +kubebuilder:validation:Required
IPAddrs []string `json:"ipAddrs"`
You'll have to define a maximum length for this.
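One possible shape for that; the MaxItems value of 32 here is an arbitrary placeholder, not a decided limit:

// ipAddrs is a list of one or more IPv4 and/or IPv6 addresses, in CIDR notation, to assign to this device.
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=32
// +kubebuilder:validation:Required
IPAddrs []string `json:"ipAddrs"`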
// ipAddrs is a list of one or more IPv4 and/or IPv6 addresses and CIDR to assign to
// this device, for example, 192.168.1.100/24. IP addresses provided via ipAddrs are
Which device?
// +kubebuilder:validation:Format=ipv4
// +kubebuilder:validation:Format=ipv6
Suggested change: replace the two format markers with a single CEL validation:
// +kubebuilder:validation:XValidation:rule="isIP(self)",message="gateway must be a valid IPv4 or IPv6 address"
// +kubebuilder:validation:Format=ipv4
// +kubebuilder:validation:Format=ipv6
// +kubebuilder:example=`8.8.8.8`
Nameservers []string `json:"nameservers,omitempty"`
Should add a maximum length to this list as well.
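For example (the limit of 3 is purely illustrative, chosen only because it matches the common resolv.conf nameserver maximum):

// nameservers is a list of IPv4 and/or IPv6 DNS nameserver addresses to configure on this device.
// +kubebuilder:validation:MaxItems=3
// +kubebuilder:example=`8.8.8.8`
// +optional
Nameservers []string `json:"nameservers,omitempty"`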
}
type Topology struct { |
Where is this struct included?
Similar to the infra spec changes, I'm just showing snippets of what would actually change. The current topology struct is above.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
type VSpherePlatformSpec struct {
	// vcenters holds the connection details for services to communicate with vCenter.
	// Currently, only a single vCenter is supported.
	// Currently, only a single vCenter is supported, but in tech preview 3 vCenters are supported.
I thought one of the epics in 4.18 is to remove techpreview status from multi-vcenter support?
I just copied and pasted what the current infra spec is as of 9/9. AFAIK it is supposed to be GA'ed in 4.18, but maybe those updates have not been added yet.
cc: @vr4manta
Makes sense to merge the logic into an overall doc, but while it's in flux I do see the complication of multi-vCenter not being here as well. Currently, in 4.17, it is feature gated. Once it's all finalized, it should be GA. The goal for GA is 4.18.
// must use URN-notation instead of display names. A maximum of 10 tag IDs may be specified.
// +kubebuilder:example=`urn:vmomi:InventoryServiceTag:5736bf56-49f5-4667-b38c-b97e09dc9578:GLOBAL`
// +optional
TagIDs []string `json:"tagIDs,omitempty"`
How do we intend to use this?
This was added in 4.16, I believe. It allows a compute machine to be tagged. I think customers wanted it specifically for backup processes.
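For context, attaching a tag by its URN to a virtual machine with govmomi looks roughly like the sketch below. This is illustrative only, assumes an already-authenticated rest.Client, and is not the actual machine controller code:

import (
	"context"

	"github.com/vmware/govmomi/vapi/rest"
	"github.com/vmware/govmomi/vapi/tags"
	"github.com/vmware/govmomi/vim25/types"
)

// attachMachineTags attaches each tag ID (URN notation) from the machine spec to the given VM.
func attachMachineTags(ctx context.Context, rc *rest.Client, vmRef types.ManagedObjectReference, tagIDs []string) error {
	m := tags.NewManager(rc)
	for _, id := range tagIDs {
		// Example ID: urn:vmomi:InventoryServiceTag:5736bf56-49f5-4667-b38c-b97e09dc9578:GLOBAL
		if err := m.AttachTag(ctx, id, vmRef); err != nil {
			return err
		}
	}
	return nil
}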
// +optional
LoadBalancer *configv1.VSpherePlatformLoadBalancer `json:"loadBalancer,omitempty"`
// Hosts defines network configurations to be applied by the installer. Hosts is available in TechPreview.
Hosts []*Host `json:"hosts,omitempty"`
Is this intended to support hosts as zones? I wonder what the use case is.
Pretty sure the answer is yes; AFAIK it's for static IP.
@vr4manta @rvanderp3 do you want to take that one?
But this is not reflected in the Infrastructure type above. So information will be lost after cluster installation?
Oh, I think I misunderstood: Hosts is for OCP hosts and static IP; that isn't ESXi hosts.
// +openshift:validation:featureGate=VSphereHostVMGroupZonal
// +kubebuilder:validation:MaxLength=80
// +optional
VMGroup string `json:"vmGroup,omitempty"`
Will this be created if it doesn't exist already?
The installer will create the vmGroup, one per failure domain.
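A rough sketch of what creating that per-failure-domain VM group could look like with govmomi; the function name and structure are assumptions for illustration, not the installer's actual implementation:

import (
	"context"

	"github.com/vmware/govmomi/object"
	"github.com/vmware/govmomi/vim25/types"
)

// createVMGroup adds a ClusterVmGroup to the compute cluster for one failure domain.
// vmRefs are the managed object references of the cluster machines that belong to that zone.
func createVMGroup(ctx context.Context, cluster *object.ClusterComputeResource, groupName string, vmRefs []types.ManagedObjectReference) error {
	spec := &types.ClusterConfigSpecEx{
		GroupSpec: []types.ClusterGroupSpec{{
			ArrayUpdateSpec: types.ArrayUpdateSpec{Operation: types.ArrayUpdateOperationAdd},
			Info: &types.ClusterVmGroup{
				ClusterGroupInfo: types.ClusterGroupInfo{Name: groupName},
				Vm:               vmRefs,
			},
		}},
	}

	task, err := cluster.Reconfigure(ctx, spec, true)
	if err != nil {
		return err
	}
	return task.Wait(ctx)
}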
// +openshift:validation:featureGate=VSphereHostVMGroupZonal
// +kubebuilder:validation:MaxLength=80
// +optional
HostGroup string `json:"hostGroup,omitempty"`
Same as above.
The installer will not create the hostGroup; that would be difficult since we would need to know each ESXi host per zone.
// and the vm-host affinity rule that together create an affinity configuration for vm-host based zonal.
// This configuration within vCenter creates the required association between a failure domain, virtual machines
// and ESXi hosts to create a vm-host based zone.
type VSphereFailureDomainAffinity struct {
So, the thing we will have to see is: if a customer specifies computeCluster as the region and then specifies two different hostGroups as zones, and assuming those host groups don't share the datastore, will the CSI driver work...
That scenario was just asked about in Slack the other day, so yeah, it should be a concern.
The enhancement has been updated to include current and future states for both the infrastructure and installer's platform spec. Hopefully that will make it clearer.
@@ -180,6 +219,57 @@ type VSpherePlatformTopology struct {
	// +kubebuilder:validation:Pattern=`^/.*?/vm/.*?`
	// +optional
	Folder string `json:"folder,omitempty"`

	// template is the full inventory path of the virtual machine or template
If hosts are being used as zones, how will that information be reflected in the Infrastructure type?
The infrastructure spec will have the name of the vm-host group of type host. We will not know the ESXi hosts that are in each zone directly. I don't think we should either, since stretched clusters presumably should already have this defined.
But then it sounds like the customer must create a vm-host group for each host? I must be misunderstanding something. The CSI driver doesn't support hostgroup as a failure domain, FWIW.
Nope, and maybe that illustrates something that I am missing.
- I expect the customer to create a vm-host group (host type) for each zone prior to installation. This can be 1 to N ESXi hosts.
- Just like the datacenter, cluster, region and zone topology, the openshift-region and openshift-zone based tags are required. I don't expect the CCM or CSI to know anything about a virtual-machine or host type vm-host group, and based on testing it doesn't; the region and zone labels are still populated on a control plane or compute node.
The vm-host groups and rules are required to keep RHCOS virtual machines on the correct ESXi hosts for the failure domain zone (a govmomi sketch of that rule follows below).
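To make the mechanics concrete, the vm-host affinity rule tying the two groups together could be created roughly as follows with govmomi. This reuses the imports from the VM group sketch above and is again only an illustrative sketch, not the installer's code; whether the rule is mandatory ("must run on") is a detail the enhancement would decide:

// createVMHostAffinityRule adds a rule that keeps the VMs in vmGroupName on the ESXi hosts in hostGroupName.
func createVMHostAffinityRule(ctx context.Context, cluster *object.ClusterComputeResource, ruleName, vmGroupName, hostGroupName string) error {
	enabled := true
	spec := &types.ClusterConfigSpecEx{
		RulesSpec: []types.ClusterRuleSpec{{
			ArrayUpdateSpec: types.ArrayUpdateSpec{Operation: types.ArrayUpdateOperationAdd},
			Info: &types.ClusterVmHostRuleInfo{
				ClusterRuleInfo: types.ClusterRuleInfo{
					Name:      ruleName,
					Enabled:   &enabled,
					Mandatory: &enabled, // treated as "must run on" rather than "should run on" for this sketch
				},
				VmGroupName:         vmGroupName,
				AffineHostGroupName: hostGroupName,
			},
		}},
	}

	task, err := cluster.Reconfigure(ctx, spec, true)
	if err != nil {
		return err
	}
	return task.Wait(ctx)
}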
Updated with 8c3f02c.
@jcpowermac At the moment, this PR updates the existing state of the EP to what was actually implemented, and then also adds an extension to the feature itself. Is there a way to separate the two changes? I feel like it would probably be beneficial to get the EP updated to the current state of the world first, and then look at the updates for the new features.
This looks good to me from the storage side. Storage features work: https://issues.redhat.com/browse/STOR-2035
/lgtm
New changes are detected. LGTM label has been removed.
@jcpowermac: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting /remove-lifecycle stale. If this proposal is safe to close now please do so.
/lifecycle stale
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting /remove-lifecycle rotten. If this proposal is safe to close now please do so.
/lifecycle rotten
Changes
Update the vSphere installer IPI zonal enhancement to include:
Additional PRs
SPLAT-1742: vSphere - enable host group based zonal installer#8873
SPLAT-1781: api bump for host zonal infra change client-go#294
SPLAT-1780: api bump for host zonal infra change library-go#1782
SPLAT-1799: Add support for vSphere host and vm group based zonal cluster-control-plane-machine-set-operator#325
SPLAT-1800: Add support for vSphere host and vm group based zonal machine-api-operator#1285
SPLAT-1743: vSphere - add host and vm based zonal api#1999
SPLAT-1733: Update vSphere and include host/vm group #1677