CFE-1134: Watch infrastructure and update AWS tags #1148
Conversation
@chiragkyal: This pull request references CFE-1134 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed from 2063ff8 to 7d38529.
/retest
Force-pushed from ea36409 to f2e5cf8.
/assign @Miciah
/assign
Could you add an E2E test? (I don't know whether an E2E test can update the ResourceTags
in the infrastructure config status.)
@@ -134,6 +136,12 @@ func New(mgr manager.Manager, config Config) (controller.Controller, error) {
	if err := c.Watch(source.Kind[client.Object](operatorCache, &configv1.Proxy{}, handler.EnqueueRequestsFromMapFunc(reconciler.ingressConfigToIngressController))); err != nil {
		return nil, err
	}
	// Watch for changes to infrastructure config to update user defined tags
	if err := c.Watch(source.Kind[client.Object](operatorCache, &configv1.Infrastructure{}, handler.EnqueueRequestsFromMapFunc(reconciler.ingressConfigToIngressController),
		predicate.NewPredicateFuncs(hasName(clusterInfrastructureName)),
The other watches technically should have this predicate too, and ingressConfigToIngressController
should be renamed. However, adding the predicate to the other watches and renaming the map function should be addressed in a follow-up.
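For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the kind of name-filtering predicate used in the diff above. The hasName helper and the "cluster" object name mirror the diff, but this is illustrative only; the operator's actual implementation may differ.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// hasName returns a filter that matches only objects with the given name,
// so a watch only enqueues events for that single cluster-scoped object.
func hasName(name string) func(client.Object) bool {
	return func(o client.Object) bool {
		return o.GetName() == name
	}
}

func main() {
	// Restrict a watch to the object named "cluster" (the Infrastructure
	// config's well-known name).
	p := predicate.NewPredicateFuncs(hasName("cluster"))

	matching := &corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Name: "cluster"}}
	other := &corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{Name: "something-else"}}
	fmt.Println(p.Create(event.CreateEvent{Object: matching})) // true
	fmt.Println(p.Create(event.CreateEvent{Object: other}))    // false
}
```

With such a predicate in place, events for other objects of the watched kind are filtered out before they reach the map function.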
ignoredAnnotations := managedLoadBalancerServiceAnnotations.Union(sets.NewString(awsLBAdditionalResourceTags))
ignoredAnnotations := managedLoadBalancerServiceAnnotations.Clone()
ignoredAnnotations.Delete(awsLBAdditionalResourceTags)
Can't we just use managedLoadBalancerServiceAnnotations
now?
ignoredAnnotations := managedLoadBalancerServiceAnnotations.Union(sets.NewString(awsLBAdditionalResourceTags))
ignoredAnnotations := managedLoadBalancerServiceAnnotations.Clone()
ignoredAnnotations.Delete(awsLBAdditionalResourceTags)
return loadBalancerServiceAnnotationsChanged(current, expected, managedLoadBalancerServiceAnnotations)
To elaborate on that question, there are two general rules at play here:
- First, the status logic sets Upgradeable=False if, and only if, it observes a discrepancy between the "managed" annotations' expected values and the actual values.
- Second, by the time the status logic runs, there will not be any discrepancy between the expected (desired) annotation values and the actual annotation values.

And these general rules have exceptions:
- As an exception to the first rule, before this PR, awsLBAdditionalResourceTags wasn't "managed", but even so, we set Upgradeable=False if it had been modified. (This is the logic that you are modifying here.)
- As an exception to the second rule, if shouldRecreateLoadBalancer indicates that changing an annotation value requires recreating the service, then the desired and actual values can differ when the status logic observes them.

So now that you are making the awsLBAdditionalResourceTags annotation a managed annotation, don't we still want to set Upgradeable=False if the annotation value doesn't match the expected value?
Thank you for the detailed explanation on how the status logic works and how it sets Upgradeable=False, as well as the exception that existed with awsLBAdditionalResourceTags before this PR. Earlier, I was under the impression that the status logic would still set Upgradeable=False even if awsLBAdditionalResourceTags was updated by the controller.

> So now that you are making the awsLBAdditionalResourceTags annotation a managed annotation, don't we still want to set Upgradeable=False if the annotation value doesn't match the expected value?

Since awsLBAdditionalResourceTags will now be managed by the controller, and we still want to set Upgradeable=False if it's updated by something other than the ingress controller, it does indeed make sense to use managedLoadBalancerServiceAnnotations directly in this logic. This way, the status logic will behave consistently for managed annotations when any discrepancy is observed.

I've removed the loadBalancerServiceTagsModified() function and used loadBalancerServiceAnnotationsChanged() directly inside loadBalancerServiceIsUpgradeable(), and also added some comments for clearer understanding of the flow.
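To make the agreed-upon behavior concrete, here is a rough, self-contained sketch (not the operator's actual code) of an Upgradeable check driven purely by the managed annotations. annotationsChanged and serviceIsUpgradeable are illustrative stand-ins for loadBalancerServiceAnnotationsChanged and loadBalancerServiceIsUpgradeable; the AWS tags annotation key is the standard cloud-provider one.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/sets"
)

// annotationsChanged reports which of the managed annotation keys differ
// between the current (observed) and expected (desired) services.
func annotationsChanged(current, expected *corev1.Service, managed sets.String) []string {
	var changed []string
	for _, key := range managed.List() {
		if current.Annotations[key] != expected.Annotations[key] {
			changed = append(changed, key)
		}
	}
	return changed
}

// serviceIsUpgradeable mirrors the rule discussed above: Upgradeable=False is
// warranted exactly when a managed annotation has drifted from its expected value.
func serviceIsUpgradeable(current, expected *corev1.Service, managed sets.String) error {
	if changed := annotationsChanged(current, expected, managed); len(changed) > 0 {
		return fmt.Errorf("managed annotations were modified: %v", changed)
	}
	return nil
}

func main() {
	const tagsAnnotation = "service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags"
	managed := sets.NewString(tagsAnnotation)
	expected := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Annotations: map[string]string{tagsAnnotation: "team=ingress"}}}
	current := &corev1.Service{ObjectMeta: metav1.ObjectMeta{Annotations: map[string]string{tagsAnnotation: "team=other"}}}
	// Something other than the operator changed the tags annotation, so the
	// check reports the drift and the operator would set Upgradeable=False.
	fmt.Println(serviceIsUpgradeable(current, expected, managed))
}
```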
I need to try it to see if updating the infrastructure config status is possible through E2E. Having said that, do you think we should get QE sign-off for this PR?
What I'm wondering is whether the test can update the infrastructure config status without some other controller stomping the changes, and whether there could be other reasons specific to the infrastructures resource.
As a general matter, we should have QE sign-off for this PR. QE might prefer to do pre-merge testing as well. Is day2 tags support being handled by a specific group of QA engineers, or are the QA engineers for each affected component responsible for testing the feature? Cc: @lihongan.
I just pushed a commit to add an E2E test. It's working fine locally; hopefully it will work on CI as well.
The QA engineers for each affected component are testing this feature.
/retest-required
Did a pre-merge test on a standalone OCP cluster: it can add new tag key/value pairs and update existing tag values, but it cannot delete the user-added tags.
@chiragkyal please confirm whether that's expected.
Also, the tags can be added to a newly created NLB custom ingresscontroller, but when updating tags in the infrastructure, the NLB is not updated accordingly.
It looks like a bug with NLB? Checking the NLB service, we can find the annotation with the updated tags.
It looks like kubernetes/kubernetes#96939 was intended to fix this for NLB, but it was closed.
It looks like the tags are getting merged with the existing AWS resource tags. I think we can only control the annotation on the service. @Miciah, is there a way we can control this behavior?
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Miciah. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
test/e2e/operator_test.go (Outdated)
	t.Log("Updating AWS ResourceTags in the cluster infrastructure config")
	retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
Why not use updateInfrastructureConfigSpecWithRetryOnConflict
here?
Because we want to update the status of the Infrastructure config, not the spec. I've moved the status update logic to a new function, updateInfrastructureConfigStatusWithRetryOnConflict, for clarity.
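For context, a minimal sketch of what such a status-only update helper could look like, assuming the openshift config clientset and the function signature quoted later in this review; the real helper in the test suite may differ in details.

```go
package e2e

import (
	"context"

	configv1 "github.com/openshift/api/config/v1"
	configclientset "github.com/openshift/client-go/config/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/util/retry"
)

// updateInfrastructureConfigStatusWithRetryOnConflict applies updateFunc to a
// freshly fetched "cluster" Infrastructure object and writes only its status
// subresource, retrying the whole get-and-update on conflicts.
func updateInfrastructureConfigStatusWithRetryOnConflict(configClient *configclientset.Clientset, updateFunc func(*configv1.Infrastructure) *configv1.Infrastructure) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		infra, err := configClient.ConfigV1().Infrastructures().Get(context.Background(), "cluster", metav1.GetOptions{})
		if err != nil {
			return err
		}
		// UpdateStatus touches only the status subresource; the spec is left alone.
		_, err = configClient.ConfigV1().Infrastructures().UpdateStatus(context.Background(), updateFunc(infra), metav1.UpdateOptions{})
		return err
	})
}
```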
@@ -1291,6 +1300,79 @@ func TestInternalLoadBalancerGlobalAccessGCP(t *testing.T) {
	}
}

// TestAWSResourceTagsChanged tests the functionality of updating AWS resource tags
One of the acceptance criteria is:
"any modifications to user-defined tags (platform.AWS.ResourceTags) trigger an update of the load balancer service" - shouldn't we have a couple more tests, like deleting a user-defined tag and adding a user-defined tag?
> like deleting a user-defined tag and adding a user-defined tag?

The test already covers adding a user-defined tag.
However, updating the infra status again to remove a certain tag is possible, which will update the annotation as well; the tag won't be removed from the AWS resource itself, and this is expected behaviour for cloud-provider-aws. See #1148 (comment) for more details.
I've extended the test to cover this scenario of tag removal and annotation update in the latest changes. Hope it covers the acceptance criteria.
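For illustration, here are the two kinds of status mutations such a test could exercise (adding and removing a user-defined tag). The addTag/removeTag names are hypothetical callbacks that could be passed to a status-update helper like the one sketched earlier; the field paths follow the openshift/api Infrastructure types.

```go
package e2e

import (
	configv1 "github.com/openshift/api/config/v1"
)

// addTag returns a mutation that appends a user-defined AWS resource tag to
// the Infrastructure status, which should propagate to the service annotation.
func addTag(key, value string) func(*configv1.Infrastructure) *configv1.Infrastructure {
	return func(infra *configv1.Infrastructure) *configv1.Infrastructure {
		if infra.Status.PlatformStatus != nil && infra.Status.PlatformStatus.AWS != nil {
			infra.Status.PlatformStatus.AWS.ResourceTags = append(
				infra.Status.PlatformStatus.AWS.ResourceTags,
				configv1.AWSResourceTag{Key: key, Value: value},
			)
		}
		return infra
	}
}

// removeTag returns a mutation that drops a user-defined tag; the annotation
// is updated, but (as noted above) the tag stays on the AWS resource itself.
func removeTag(key string) func(*configv1.Infrastructure) *configv1.Infrastructure {
	return func(infra *configv1.Infrastructure) *configv1.Infrastructure {
		if infra.Status.PlatformStatus == nil || infra.Status.PlatformStatus.AWS == nil {
			return infra
		}
		kept := infra.Status.PlatformStatus.AWS.ResourceTags[:0]
		for _, tag := range infra.Status.PlatformStatus.AWS.ResourceTags {
			if tag.Key != key {
				kept = append(kept, tag)
			}
		}
		infra.Status.PlatformStatus.AWS.ResourceTags = kept
		return infra
	}
}
```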
/retest
test/e2e/operator_test.go (Outdated)
	// Revert to original status
	originalInfraStatus := infraConfig.Status
// Revert to original status
originalInfraStatus := infraConfig.Status
// Save a copy of the original infraConfig.Status, to revert changes before exiting.
originalInfraStatus := infraConfig.Status.DeepCopy()
Thanks, addressed the suggestion.
test/e2e/operator_test.go (Outdated)
	// Revert to original status
	originalInfraStatus := infraConfig.Status
	t.Cleanup(func() {
		updateInfrastructureConfigStatusWithRetryOnConflict(configClient, func(infra *configv1.Infrastructure) *configv1.Infrastructure {
updateInfrastructureConfigStatusWithRetryOnConflict(configClient, func(infra *configv1.Infrastructure) *configv1.Infrastructure {
err := updateInfrastructureConfigStatusWithRetryOnConflict(configClient, func(infra *configv1.Infrastructure) *configv1.Infrastructure {
Thanks, updated
		infra.Status = originalInfraStatus
		return infra
	})
})
Something like this:

	})
	if err != nil {
		t.Logf("Unable to remove changes to the infraConfig, possible corruption of test environment: %v", err)
	}
Sure, updated.
test/e2e/operator_test.go (Outdated)
// assertLoadBalancerServiceAnnotationWithPollImmediate checks if the specified annotation on the
// LoadBalancer Service of the given IngressController matches the expected value.
func assertLoadBalancerServiceAnnotationWithPollImmediate(t *testing.T, kclient client.Client, ic *operatorv1.IngressController, annotationKey, expectedValue string) {
	err := wait.PollImmediate(5*time.Second, 5*time.Minute, func() (bool, error) {
Sorry I didn't notice this earlier, but since I requested other changes, I'm adding this too.
We have started replacing the use of the deprecated wait.PollImmediate with wait.PollUntilContextTimeout(context.Background(), ...), as used in https://github.com/openshift/cluster-ingress-operator/blob/master/test/e2e/operator_test.go#L3189. Please use the updated function here and in any new code that requires polled waiting.
No problem, and thanks for the suggestion. I have replaced wait.PollImmediate with wait.PollUntilContextTimeout to be consistent.
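As a minimal sketch of the requested pattern (assuming the controller-runtime client and an already-resolved service name; the real test derives the name from the IngressController), the deprecated wait.PollImmediate call can be rewritten like this:

```go
package e2e

import (
	"context"
	"testing"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/wait"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// waitForServiceAnnotation polls until the named service carries the expected
// annotation value, or the timeout expires. The final "true" argument makes
// the first check happen immediately, matching the old PollImmediate behavior.
func waitForServiceAnnotation(t *testing.T, kclient client.Client, name types.NamespacedName, annotationKey, expectedValue string) error {
	return wait.PollUntilContextTimeout(context.Background(), 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
		service := &corev1.Service{}
		if err := kclient.Get(ctx, name, service); err != nil {
			t.Logf("failed to get service %s: %v, retrying...", name, err)
			return false, nil
		}
		if actual := service.Annotations[annotationKey]; actual != expectedValue {
			t.Logf("annotation %q is %q, want %q, retrying...", annotationKey, actual, expectedValue)
			return false, nil
		}
		return true, nil
	})
}
```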
test/e2e/operator_test.go (Outdated)
	err := wait.PollImmediate(5*time.Second, 5*time.Minute, func() (bool, error) {
		service := &corev1.Service{}
		if err := kclient.Get(context.Background(), controller.LoadBalancerServiceName(ic), service); err != nil {
			t.Logf("failed to get service %s: %v", controller.LoadBalancerServiceName(ic), err)
t.Logf("failed to get service %s: %v", controller.LoadBalancerServiceName(ic), err) | |
t.Logf("failed to get service %s: %v, retrying...", controller.LoadBalancerServiceName(ic), err) |
Done
test/e2e/operator_test.go (Outdated)
			return false, nil
		}
		if actualValue, ok := service.Annotations[annotationKey]; !ok {
			t.Logf("load balancer has no %q annotation: %v", annotationKey, service.Annotations)
t.Logf("load balancer has no %q annotation: %v", annotationKey, service.Annotations) | |
t.Logf("load balancer has no %q annotation yet: %v, retrying...", annotationKey, service.Annotations) |
Updated
			return false, nil
		} else if actualValue != expectedValue {
			t.Logf("expected %s, found %s", expectedValue, actualValue)
			return false, nil
I expect that we don't want to keep trying after we've found an unexpected value. Or would we expect it to change after this?

return false, nil
return false, fmt.Errorf("expected %s, found %s", expectedValue, actualValue)
I think we need to keep trying here because the annotation value might not get updated immediately after the infra status is updated.
test/e2e/util_test.go (Outdated)
// updateInfrastructureStatus updates the Infrastructure status by applying
// the given update function to the current Infrastructure object.
func updateInfrastructureConfigStatusWithRetryOnConflict(configClient *configclientset.Clientset, updateFunc func(*configv1.Infrastructure) *configv1.Infrastructure) error {
	retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
The retry.DefaultRetry backoff is only 10ms (https://pkg.go.dev/k8s.io/client-go/util/retry#pkg-variables). Why not use wait.PollUntilContextTimeout(context.Background(), ...) and allow it to loop on conflict for a configured amount of time?
Sure, we can use wait.PollUntilContextTimeout(context.Background(), ...) here as well. I've updated the logic in the latest changes. Thanks!
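A possible shape for that change, sketched under the same assumptions as the earlier helper (openshift config clientset, "cluster" object); the interval and timeout values here are illustrative rather than taken from the PR:

```go
package e2e

import (
	"context"
	"time"

	configv1 "github.com/openshift/api/config/v1"
	configclientset "github.com/openshift/client-go/config/clientset/versioned"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
)

// updateInfrastructureConfigStatus retries the status update for a bounded
// amount of time, treating only update conflicts as retryable.
func updateInfrastructureConfigStatus(configClient *configclientset.Clientset, updateFunc func(*configv1.Infrastructure) *configv1.Infrastructure) error {
	return wait.PollUntilContextTimeout(context.Background(), 1*time.Second, 1*time.Minute, true, func(ctx context.Context) (bool, error) {
		infra, err := configClient.ConfigV1().Infrastructures().Get(ctx, "cluster", metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		if _, err := configClient.ConfigV1().Infrastructures().UpdateStatus(ctx, updateFunc(infra), metav1.UpdateOptions{}); err != nil {
			if apierrors.IsConflict(err) {
				// Conflict: re-fetch on the next poll iteration and try again.
				return false, nil
			}
			return false, err
		}
		return true, nil
	})
}
```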
- Ingress controller now monitors changes to the Infrastructure object, ensuring that modifications to user-defined AWS ResourceTags (platform.AWS.ResourceTags) trigger updates to the load balancer service.
- Consider the awsLBAdditionalResourceTags annotation as a managed annotation.

Signed-off-by: chiragkyal <[email protected]>
@chiragkyal: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/lgtm
/unhold
Merged commit 1ee3995 into openshift:master
[ART PR BUILD NOTIFIER] Distgit: ose-cluster-ingress-operator
The PR introduces the following changes:
- The ingress controller now watches for Infrastructure object changes. This ensures that any modifications to user-defined tags (platform.AWS.ResourceTags) trigger an update of the load balancer service.
- Consider the awsLBAdditionalResourceTags annotation as a managed annotation. Any changes to user-defined tags in the Infrastructure object will be reflected in this annotation, prompting an update to the load balancer service.

Implements: CFE-1134