RFC: Kubernetes Patch Support #2011
Comments
Appreciate the work that went into this!
If SSA is enabled, will users be able to create-or-patch via the normal resource constructors? An example might be if they are deploying into a cluster that may or may not contain a CRD, and they would like to create-or-update it. An example that comes to mind: AWS EKS clusters created before 1.11 were created without storage classes, and those created after will contain them. With SSA enabled, would this allow me to create-or-update the storage class?

```yaml
name: default-storage-classes
runtime: yaml
resources:
  gp2-storage-class:
    type: kubernetes:storage.k8s.io/v1:StorageClass
    properties:
      metadata:
        name: gp2
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
      provisioner: kubernetes.io/aws-ebs
      parameters:
        type: gp2
        fsType: ext4
```
Perhaps a question for @rawkode: under server-side apply, how will the API server handle keys that are removed on subsequent updates? If on the first run I run this:

```typescript
new k8s.core.v1.ConfigMapPatch("app-ns/app-config", {
  data: {
    foo: "bar",
    oops: "p@ssw0rd - this should have been in a Secret",
  }
});
```

And then I update the program like so:

```typescript
new k8s.core.v1.ConfigMapPatch("app-ns/app-config", {
  data: {
    foo: "bar"
  }
});
```

Does server-side apply cause the extra key to be removed?
@AaronFriel Yep. I ran a test to confirm also:

```console
❯ bat config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-ssa
data:
  key1: metal
  key2: lica

❯ kubectl apply --server-side -f config.yaml
configmap/test-ssa serverside-applied

❯ kubectl describe configmap test-ssa
Name:         test-ssa
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
key1:
----
metal
key2:
----
lica

BinaryData
====

Events:  <none>

# Remove key2 from config.yaml, then apply again with the same field manager:
❯ bat config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-ssa
data:
  key1: metal

❯ kubectl apply --server-side -f config.yaml
configmap/test-ssa serverside-applied

❯ kubectl describe configmap test-ssa
Name:         test-ssa
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
key1:
----
metal

BinaryData
====

Events:  <none>
```
Yes - essentially while your field manager controls the field, you can assert its state freely.
@lblackstone @viveklak what do you think about changing the last item about delete behavior then:
To:
This makes the distinction between the "patch" resource and the regular resources quite clear to me:
@AaronFriel yes I am mostly coming around to the same realization. On the question of
@viveklak what @rawkode wrote makes it sound like sending an empty patch as the manager relinquishes ownership of all fields. To allow another controller to take ownership of the fields, I think we'd recommend a process similar to the Kubernetes docs here: https://kubernetes.io/docs/reference/using-api/server-side-apply/#transferring-ownership

I'd imagine patch args would look like so:

```typescript
interface PatchArgs {
  force?: boolean,
  validate?: boolean, // implies force = true? I think?
  fieldManager?: string,
}
```

To follow the Kubernetes docs, one would:

At which point they can then update their
And given what @rawkode just demonstrated and the above, I think I'd consider changing this item:
To:
The manager ID should be the URN of the resource or another stable but uniquely generated ID per resource. Consider this program, in which two engineers working on the same "core infra" stack for a cluster end up writing two separate patches to the same resource, which should be managed independently. We may want to patch the same resource twice so long as those patches commute:

```typescript
// file1.ts
new k8s.core.v1.NamespacePatch("add-monitor-label", {
  metadata: {
    name: "kube-public",
    labels: {
      "my-app.example.com/monitor": "true",
    },
  },
});

// file2.ts
new k8s.core.v1.NamespacePatch("add-public-ns-label-for-network-policy", {
  metadata: {
    name: "kube-public",
    labels: {
      "my-app.example.com/is-kube-public": "true",
    },
  },
});
```

Or consider that a multi-language component means we may implicitly depend on a Patch resource not defined in our program, but using the same provider:

```typescript
// index.ts:
const k8sProvider = new k8s.Provider(...);
// With SSA enabled, perform a create-or-update
const storageClass = new k8s.storage.v1.StorageClass("default-gp2-class", {
"metadata": {
"name": "gp2",
"annotations": {
"storageclass.kubernetes.io/is-default-class": "true"
}
},
"provisioner": "kubernetes.io/aws-ebs",
"parameters": {
"type": "gp2",
"fsType": "ext4"
}
}, { provider: k8sProvider });
// Import a component provider which labels a storage class as the default
new CustomStorageProvisioner("cephfs", {
defaultStorageClassName: storageClass.metadata.name, // use the output to ensure ordering
}, { providers: [k8sProvider]});
// customStorageProvisioner.ts, which could be located in a multi-language component provider:
export class CustomStorageProvisioner extends pulumi.ComponentResource {
  constructor(name: string, args: { defaultStorageClassName: pulumi.Input<string> }, opts?: pulumi.ComponentResourceOptions) {
    super("example:index:CustomStorageProvisioner", name, {}, opts);
    new k8s.storage.v1.StorageClassPatch("label-default-class", {
      metadata: {
        name: args.defaultStorageClassName,
        labels: {
          "customStorageProvisioner.example.com/sc-target": "true",
        },
      },
    }, { parent: this });
  }
}
```
Great point. I'll revise the proposal with that suggestion. I think it might make sense to require
The problem with this is that any fields which no longer have a manager revert to default values. This subtlety seems like it has the potential to cause unpleasant surprises. As a concrete example, consider a k8s Deployment where we patch the number of replicas from a previous value of 3 to a new value of 5. Relinquishing control of the replicas field on deletion would reset the value to the default of 1 rather than putting it back to 3. We would need to accurately track the previous state and then do some update gymnastics on deletion to undo our changes. This would be different for each resource kind, and I don't think the previous state would always be desirable, since it could have changed since we checkpointed it. Ultimately, I think the proposed delete/retain choice is a better option since it's clear to explain and to implement.
I would like to support the "upsert" case, but didn't explicitly include it here. You'll notice in the flowchart that an error is returned if the resource doesn't exist. The main complication is that a create requires a complete resource spec rather than a subset. Perhaps we could specify that with a patchOption flag, and document that it requires a valid spec even though the SDK types don't enforce required fields?
I think that's OK or expected behavior for server-side apply. Leaving managed fields around that aren't truly managed seems to go against the Kubernetes model. If a user cares to ensure a field maintains a value after a stack is destroyed, we should recommend transferring ownership to another stack by setting […].

Our experienced Kubernetes practitioners who use Pulumi will be thankful that they can transfer knowledge of how server-side apply works. For folks who are surprised, we can say that our Patch resource follows the semantics of Kubernetes so closely that we can refer them to those docs as well as our own. The parallel I'd apply here, in terms of "least surprise", is like that of how ConfigMap replaces & immutability work. If we deviate from the Kubernetes API server's model, we surprise Kubernetes practitioners. (In our defense though, it's really quite annoying to make sure ConfigMaps propagate updates correctly to deployments in other tools.)
I think the upsert case would be handled by the "normal" resource when SSA is enabled, whose type signature requires that a minimally valid object is specified. The distinction then is:
Updated the verbiage about manager IDs to be per-resource, and clarified that the ID could be set explicitly. This should support the two cases you highlighted.
You've convinced me on the delete behavior. I've updated the proposal with those changes.
How would we resolve conflicts in this model? Normal resource classes don't currently include an option for that. Perhaps by using a provider-specific resource option to add an equivalent […]. This could be a way to address pulumi/pulumi#3388 for the k8s provider, at least.
Hmm, I didn't think about conflicts with the normal resource. Patch options or annotation side channel?
Yeah, I like the idea of using annotations for now to enable upsert without a breaking change to the SDKs. I expect that we'd make this a first-class resource option as part of the next major release. We've got prior art for several k8s-specific options that later made it into the SDK proper.
I realized that we don't have a way of adding a structured […]:

```typescript
new k8s.apps.v1.DeploymentPatch("nginx-replicas", {
  metadata: {
    name: "nginx",
  },
  spec: {
    replicas: 3,
  }
}, {
  force: true,
  manager: "example",
});
```

will change to

```typescript
new k8s.apps.v1.DeploymentPatch("nginx-replicas", {
  metadata: {
    annotations: {
      "pulumi.com/patchForce": "true",
      "pulumi.com/patchManager": "example",
    },
    name: "nginx",
  },
  spec: {
    replicas: 3,
  }
});
```
Maybe I missed it, but it seems there isn't a way to remove a field via a Patch resource. For example, removing an annotation or a key from a ConfigMap. I suppose this could be done with two applies (take control of the field and then remove it)?
@julienvincent Good question, thanks for asking. Does Server-Side Apply enable taking ownership of a field and deleting it in a single apply? If not, I think following this spec and @lblackstone's example, a single program might be able to implement the two patches and order them via `dependsOn`:

```typescript
const removalManagerId = 'e861d8c8-62e5-48ff-b331-809dee820c50'; // arbitrary UUID

// take ownership of the field, set value to empty string
const step1 = new k8s.core.v1.ConfigMapPatch("take-ownership-of-field", {
  metadata: {
    name: "my-config-map",
    annotations: {
      "pulumi.com/patchForce": "true", // force the field into our control
      "pulumi.com/patchManager": removalManagerId,
    },
  },
  data: {
    "foobar": ""
  }
});

// Use the same manager and make a patch that omits the field, restoring it to the default value (absent)
new k8s.core.v1.ConfigMapPatch("remove-field", {
  metadata: {
    name: "my-config-map",
    annotations: {
      "pulumi.com/patchManager": removalManagerId,
    },
  },
  data: {
  }
}, { dependsOn: step1 }); // Ensure that this runs after we've taken control of the resource
```
@lblackstone : Would this proposal also resolve the issue in #1118 which exposes secrets in the last applied state annotation? |
It resolves it for Providers that opt into SSA management since the
Add a mode to the provider that uses Server-Side Apply (SSA) for resource management. See #2011 for additional details on the background and design of this feature.
It's worth thinking through how these types will be expressed in each language. I am thinking it could be tricky in Go, at least -- optional fields are pointer-valued so that "not supplied" can be distinguished from "supplied but the zero value", which becomes quite awkward. For reference, client-go had a similar problem: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2155-clientgo-apply ("every field a pointer" is discussed there under alternatives). (I am not that familiar with how Pulumi generates types for its SDKs at present, but I realise it doesn't reuse the types directly from the Kubernetes API packages. It may be that this problem is side-stepped in the generated types. In which case it's probably worth saying that, for the sake of onlookers :-))
@squaremo Thanks for bringing that up! I am planning to use Pulumi's existing schema-based code generation, which currently uses the "optional as pointer" method for Go. I previously looked into the builders that you referenced, and think it would be tricky to integrate with our existing stack for the following reasons:
With that said, I agree that the optional pointer idiom can be unwieldy and potentially ambiguous, so we may need to investigate workarounds as we get some more mileage on the current approach. We're initially releasing patch support as an opt-in "developer preview" feature, so there will be an opportunity to refine the approach before the API is locked in.
Add Patch resources for all Kubernetes resource types that can be used to patch existing cluster resources rather than creating a new resource under Pulumi management. See #2011 for additional details on the background and design of this feature.
I wrote this program:
Naively, I'd expect this to create the ConfigMap then patch it with another data field; but it results in
Where did my thinking go wrong?
@squaremo I wasn't able to reproduce that error, and it worked as expected for me. Perhaps you have an older version of the resource plugin on your path?
Hrm, well good guess -- If I (There is demonstrably a happy path, so I'm happy to accept this is WFM -- but it would also be good to confirm that released artifacts all work together correctly)
With
Now I understand what is happening: I used

OK, I'm satisfied this is not something an end user (or anyone not messing around with builds from the git repo) is going to encounter. Sorry everyone; as you were.
With the v3.20.1 release out now, I'm going to mark this as complete! 🎉 Check out the how-to guide for some examples to get started. Note that this feature is currently opt-in with a provider flag, but is expected to become the new default in the future. Please open new issues if you run into any problems, and thanks again for everyone's input on this feature!
This document proposes a solution for managing shared Kubernetes (k8s) resources, also known as “patching.” Users are
welcome to respond with any comments or questions directly on this issue.
Summary
Kubernetes resources commonly have more than one controller making changes to them. These controllers can include `kubectl`, the k8s control plane, custom operators, or infrastructure as code (IaC) tools like Pulumi. This presents particular challenges for tools that manage state independently of k8s, and need to compute diffs based on this state.
Our k8s provider currently manages resources using Client-Side Apply (CSA), which is supported by all versions of k8s. CSA works by including an annotation on k8s resources that encodes the last applied configuration of the resource. This approach has some critical limitations:

- The `last-applied-configuration` annotation is set by whichever controller made the latest update
- […] the `last-applied-configuration` annotation, which can lead to other controllers inadvertently reverting changes
A newer management method called Server-Side Apply (SSA) is available starting in k8s v1.18 (March 2020). SSA adds a new section called `managedFields` to all k8s resources with information about which controller has set each resource field. This allows multiple controllers to independently manage the same resource without accidentally overwriting the same fields. This functionality can be used to patch and manage shared resources safely. Using SSA introduces some additional complexity to the resource lifecycle that needs to be understood and configurable by the user to avoid unexpected changes to shared resources.

Motivation
Some cloud providers provision k8s resources as part of their managed k8s offerings. Platform teams often want to update these default configurations, but attempting to update an existing resource will return an error in the current Pulumi model. Resolving such a conflict currently requires a separate import step prior to making changes, and cannot be accomplished in a single `pulumi up` operation. The following are some examples where patch behavior would be preferable:
Proposal
The following changes will be made to configure the lifecycle of shared resources using Patch support.
- SSA support will be opt-in, and will be disabled by default until the next major release of `pulumi-kubernetes`.
- Patch resources will be named as `<Resource>Patch`, and will live in the same namespaces as the corresponding resource. For example, `apps.v1.Deployment` will correspond to `apps.v1.DeploymentPatch`.
- The resource name will be the name of the k8s resource to patch, and will be in the form `[namespace/]name`. For example, a ConfigMap named `app-config` in the `app-ns` namespace will be referenced as `app-ns/app-config`.
- All fields will be optional except for `.metadata.name`. This allows users to specify only the parts of the configuration that they want to patch.
- The following patchOptions will be supported:
  - `force` - boolean option to indicate that the Pulumi configuration will override any conflicting configuration for shared resources; defaults to false
  - `manager` - string option to set the name of the manager for the SSA operation; will be automatically set to a unique value per resource if not provided
- When a Patch resource is destroyed, it will relinquish ownership of any fields that it manages. Any field that becomes unmanaged will reset to its default value. ~~The `retainOnDelete` resource option will be true by default, but can be overridden by explicitly setting it to false. If `retainOnDelete` is false, then the shared resource will be deleted from the cluster when the stack is destroyed.~~
- […] the `.metadata.name` field is required.
- Field ownership can be transferred by changing the `manager` patchOption and running an update. These changes can be persisted across a `pulumi destroy` operation by setting the `retainOnDelete` option to true.
- A new annotation, `pulumi.com/patchForce`, will be supported on existing resource classes. This annotation indicates that the provided resource definition will override existing resource configuration in case of conflict.

This pseudocode example shows how a Patch resource will be structured in each SDK.
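The pseudocode block itself isn't reproduced above. As a rough TypeScript sketch of the shape this proposal describes (only `.metadata.name` plus the fields being patched are supplied; the exact placement of the `force`/`manager` patchOptions is still an open question in this RFC, so they appear only as comments):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Patch a label onto the existing kube-public Namespace. Only .metadata.name
// and the fields we want to own are specified; everything else keeps its
// current field manager.
new k8s.core.v1.NamespacePatch("kube-public-monitoring", {
    metadata: {
        name: "kube-public",
        labels: {
            "example.com/monitoring": "true",
        },
    },
});
// Proposed patchOptions (placement TBD in this RFC):
//   force   - override conflicting field ownership
//   manager - explicit SSA field manager name (defaults to a unique per-resource value)
```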
Flowcharts
The following flowcharts show the expected behavior with the SSA Provider option enabled. The normal resource classes can be used for "upsert" workflows, which will create the resource if it does not exist, or update it if it does. The Patch resource classes can be used to manage individual fields of an existing resource.
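For illustration, an upsert through a normal resource class might look like the following sketch. The `pulumi.com/patchForce` annotation used here is the conflict-override mechanism proposed in this document, and the StorageClass values mirror the EKS example raised earlier in the thread:

```typescript
import * as k8s from "@pulumi/kubernetes";

// With SSA enabled on the provider, this creates the StorageClass if it does
// not exist, or takes ownership of the specified fields if it already does.
// The pulumi.com/patchForce annotation resolves ownership conflicts in favor
// of this program.
new k8s.storage.v1.StorageClass("gp2-default", {
    metadata: {
        name: "gp2",
        annotations: {
            "pulumi.com/patchForce": "true",
            "storageclass.kubernetes.io/is-default-class": "true",
        },
    },
    provisioner: "kubernetes.io/aws-ebs",
    parameters: {
        type: "gp2",
        fsType: "ext4",
    },
});
```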
Upsert behavior
The `pulumi.com/patchForce` annotation can be used to automatically resolve conflicts if there is an existing resource with the same name.

Patch behavior
SDK Examples
Change metadata of an existing Namespace
Update data in a shared ConfigMap
Change configuration of a CustomResource provisioned by an external controller
Change the number of Pod replicas in a shared Deployment
Ensure that a ConfigMap exists with a specified configuration
Ensure that a StorageClass exists with a specified configuration
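The bodies of these examples aren't reproduced here. As one hedged sketch, the shared-ConfigMap case might look roughly like this, following the `ConfigMapPatch` shape used earlier in the thread (the data key and values are purely illustrative):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Patch only the data keys we care about in an existing ConfigMap named
// "app-config" in the "app-ns" namespace; unspecified fields keep their
// current field managers.
new k8s.core.v1.ConfigMapPatch("app-ns/app-config", {
    metadata: {
        name: "app-config",
        namespace: "app-ns",
    },
    data: {
        "feature-flag": "enabled", // illustrative key/value
    },
});
```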
Prior art
Terraform’s k8s provider has limited support for patching k8s resources, which is exposed with purpose-built resources in their SDK. They currently support patching labels or annotations on any resource, or patching ConfigMap resources. These operations all require that the resource already exists, and was created by another controller. They support a “force” flag that works similarly to the proposed Force patch option. For deletes, they relinquish management of the specified fields, but don’t delete the resource.
By comparison, this proposal supports all resource kinds and fields. This proposal also supports an "upsert" workflow that does not require the resource to exist prior to running `pulumi up`. A combination of the upsert and patch operations gives the user granular control over the intended update semantics.

Alternatives considered
We have worked on this problem off and on since 2018, but had not reached a satisfactory answer. Previous attempts were
based around CSA, which presents additional challenges for getting the current state of a resource, making atomic
updates to the resource, and handling conflicts with other controllers.
The leading candidate solution used a combination of resource methods, `get` and `patch`, to specify the desired state. This solution had several problems that stalled progress, which were documented in this update from January 2022. Additionally, this approach relies on resource methods, which are more complicated to implement cross-language, and are not currently supported in our YAML SDK.

Another alternative that was considered was doing the equivalent of `kubectl apply`, and not attempting to integrate this tightly with the Pulumi model. This approach would have made it difficult to preview changes and understand the state of the resources after the patch was applied. Like the previous solution, this was underpinned by CSA, which significantly complicates the implementation. It is now possible to use `kubectl apply` in the SSA mode, which would make this approach more viable. We previously suggested this as a workaround using the `pulumi-command` provider to execute the apply commands, but at the cost of poor previews and unclear delete semantics.
We initially wanted to expose Patch behavior through the existing resource classes rather than creating new resource
classes specific to Patch. However, we discovered that this approach would not be possible to implement without a
breaking change to our existing SDKs.
Compatibility
For the initial rollout of patch support, we will allow users to opt in with a Provider feature flag. The existing
`enableDryRun` option will be deprecated in favor of a combined option that enables both Server-Side Diff and Server-Side Apply. Client-Side Apply will continue to be supported until the next major release of the k8s provider. That release will drop support for CSA and k8s versions older than v1.18.
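As a sketch of what the opt-in might look like in a program (the flag name `enableServerSideApply` is the one the released provider ended up using; the RFC text above does not fix a name):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Opt a provider instance into Server-Side Apply mode; resources created with
// this provider are managed via SSA instead of Client-Side Apply.
const ssaProvider = new k8s.Provider("ssa-provider", {
    enableServerSideApply: true,
});

new k8s.core.v1.ConfigMap("app-config", {
    metadata: { name: "app-config" },
    data: { foo: "bar" },
}, { provider: ssaProvider });
```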
Related issues
`kubectl apply -f` #264