PV stuck in status "Terminating" after deleting corresponding PVC #721

Closed
matschen opened this issue Jul 24, 2024 · 2 comments · Fixed by #725

Comments

@matschen

matschen commented Jul 24, 2024

What happened:
PV stuck in status "Terminating" after deleting the corresponding PVC (the StorageClass reclaimPolicy is set to Delete).

What you expected to happen:
The automatically created PV should also be deleted after the corresponding PVC is deleted.

How to reproduce it:
Create a StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  server: nas.example.com
  share: /nds/testgrid/maps/storage

Run an app that creates PVCs, for example:

helm install gitea gitea-charts/gitea --values values_gitea.yaml

Then the following PVCs and PVs are created:

jiangzhenbing@prd37:~$ kubectl get pvc
NAME                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-gitea-postgresql-0           Bound    pvc-faf4c83f-2b6e-4bfe-8a9d-cc2ac2ddecab   10Gi       RWO            nfs-csi        2m4s
gitea-shared-storage              Bound    pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a   10Gi       RWO            nfs-csi        2m5s
redis-data-gitea-redis-master-0   Bound    pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a   8Gi        RWO            nfs-csi        2m4s
jiangzhenbing@prd37:~$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                     STORAGECLASS   REASON   AGE
pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a   10Gi       RWO            Delete           Bound    default/gitea-shared-storage              nfs-csi                 2m1s
pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a   8Gi        RWO            Delete           Bound    default/redis-data-gitea-redis-master-0   nfs-csi                 119s
pvc-faf4c83f-2b6e-4bfe-8a9d-cc2ac2ddecab   10Gi       RWO            Delete           Bound    default/data-gitea-postgresql-0           nfs-csi                 119s

Then uninstall the app and delete the PVCs manually:

jiangzhenbing@prd37:~$ kubectl delete pvc data-gitea-postgresql-0 gitea-shared-storage redis-data-gitea-redis-master-0
persistentvolumeclaim "data-gitea-postgresql-0" deleted
persistentvolumeclaim "gitea-shared-storage" deleted
persistentvolumeclaim "redis-data-gitea-redis-master-0" deleted

The PVs get stuck in the Terminating status:

jiangzhenbing@prd37:~$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS        CLAIM                                     STORAGECLASS   REASON   AGE
pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a   10Gi       RWO            Delete           Terminating   default/gitea-shared-storage              nfs-csi                 4m46s
pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a   8Gi        RWO            Delete           Terminating   default/redis-data-gitea-redis-master-0   nfs-csi                 4m44s
pvc-faf4c83f-2b6e-4bfe-8a9d-cc2ac2ddecab   10Gi       RWO            Delete           Terminating   default/data-gitea-postgresql-0           nfs-csi                 4m44s

Check the csi-nfs-controller logs:

E0724 06:58:35.690387       1 controller.go:1025] error syncing volume "pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a": persistentvolumes "pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a" is forbidden: User "system:serviceaccount:kube-system:csi-nfs-controller-sa" cannot patch resource "persistentvolumes" in API group "" at the cluster scope
I0724 06:58:35.967933       1 controller.go:1599] "Failed to remove finalizer for persistentvolume" PV="pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" err="persistentvolumes \"pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a\" is forbidden: User \"system:serviceaccount:kube-system:csi-nfs-controller-sa\" cannot patch resource \"persistentvolumes\" in API group \"\" at the cluster scope"
I0724 06:58:35.967978       1 controller.go:1007] "Retrying syncing volume" key="pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" failures=10
E0724 06:58:35.968014       1 controller.go:1025] error syncing volume "pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a": persistentvolumes "pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" is forbidden: User "system:serviceaccount:kube-system:csi-nfs-controller-sa" cannot patch resource "persistentvolumes" in API group "" at the cluster scope

The ClusterRole nfs-external-provisioner-role indeed has no patch permission on persistentvolumes:
https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/deploy/rbac-csi-nfs.yaml
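
For reference, the persistentvolumes rule in that file looks roughly like this (paraphrased from the linked manifest, not copied verbatim), and a sketch of the fix would be to add the patch verb, matching what the upstream external-provisioner RBAC grants:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: nfs-external-provisioner-role
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    # "patch" is missing from the shipped manifest; adding it lets the
    # external-provisioner strip the PV finalizer after the volume is deleted
    verbs: ["get", "list", "watch", "create", "delete", "patch"]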
Anything else we need to know?:

Environment:

  • CSI Driver version:
jiangzhenbing@prd37:~$ kubectl get po -n kube-system -o yaml | grep gcr | grep nfs
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      imageID: gcr.io/k8s-staging-sig-storage/nfsplugin@sha256:47d6a505dd9358ffcb865a4bb9e562b10cdd3645fbcdca7bbe5cce50af034c6a
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      imageID: gcr.io/k8s-staging-sig-storage/nfsplugin@sha256:47d6a505dd9358ffcb865a4bb9e562b10cdd3645fbcdca7bbe5cce50af034c6a
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      image: gcr.io/k8s-staging-sig-storage/nfsplugin:canary
      imageID: gcr.io/k8s-staging-sig-storage/nfsplugin@sha256:47d6a505dd9358ffcb865a4bb9e562b10cdd3645fbcdca7bbe5cce50af034c6a
  • Kubernetes version (use kubectl version): v1.24.3
  • OS (e.g. from /etc/os-release): Ubuntu 22.04
  • Kernel (e.g. uname -a): Linux prd37 5.19.0-32-generic #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Jan 30 17:03:34 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: kubectl apply
  • Others:
@andyzhangx
Member

@matschen
It's related to kubernetes-csi/external-provisioner#1235 (comment); could you try the workaround mentioned in the linked GitHub issue? Thanks.

Disabling the HonorPVReclaimPolicy feature gate in csi-provisioner should fix the issue.
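
As an illustration (not from the original comment), the gate can be switched off on the csi-provisioner sidecar in the csi-nfs-controller Deployment; the image tag and the other args below are placeholders:

      containers:
        - name: csi-provisioner
          image: registry.k8s.io/sig-storage/csi-provisioner:v4.0.0  # placeholder tag
          args:
            - "--csi-address=$(ADDRESS)"
            - "--leader-election"
            # workaround: turn off the finalizer-based reclaim-policy handling
            - "--feature-gates=HonorPVReclaimPolicy=false"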

@matschen
Author

matschen commented Jul 29, 2024

@andyzhangx
The command kubectl patch pv <pv-name> -p '{"metadata":{"finalizers":null}}' sets the finalizers to null, so the PV can be deleted successfully. But this is not the same issue as kubernetes-csi/external-provisioner#1235.
The key point of this issue is that the ServiceAccount csi-nfs-controller-sa, which is bound to the ClusterRole nfs-external-provisioner-role, has no patch verb, so the external-provisioner cannot remove the finalizer after the volume is deleted, leaving the PV stuck in the Terminating state.

E0724 06:58:35.690387       1 controller.go:1025] error syncing volume "pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a": persistentvolumes "pvc-a8f4852f-70b2-4f48-ae2f-6c81862dda2a" is forbidden: User "system:serviceaccount:kube-system:csi-nfs-controller-sa" cannot patch resource "persistentvolumes" in API group "" at the cluster scope
I0724 06:58:35.967933       1 controller.go:1599] "Failed to remove finalizer for persistentvolume" PV="pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" err="persistentvolumes \"pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a\" is forbidden: User \"system:serviceaccount:kube-system:csi-nfs-controller-sa\" cannot patch resource \"persistentvolumes\" in API group \"\" at the cluster scope"
I0724 06:58:35.967978       1 controller.go:1007] "Retrying syncing volume" key="pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" failures=10
E0724 06:58:35.968014       1 controller.go:1025] error syncing volume "pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a": persistentvolumes "pvc-a4b2a0da-2664-4336-b5e7-6e2271e21f8a" is forbidden: User "system:serviceaccount:kube-system:csi-nfs-controller-sa" cannot patch resource "persistentvolumes" in API group "" at the cluster scope
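
As a quick check (a hypothetical command, not part of the original report), the missing permission can be confirmed by impersonating the controller service account:

# returns "no" with the shipped RBAC because the ClusterRole lacks the patch verb
kubectl auth can-i patch persistentvolumes \
  --as=system:serviceaccount:kube-system:csi-nfs-controller-sa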

I noticed that patch was added to the ClusterRole external-provisioner-runner in the external-provisioner repo by commit kubernetes-csi/external-provisioner@c597852.
