
cannot update snapshot metadata #147

Closed
gman0 opened this issue Jul 18, 2019 · 22 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@gman0

gman0 commented Jul 18, 2019

My attempt at creating a new VolumeSnapshot from a PVC source resulted in the following error message in external-snapshotter:

I0717 18:09:11.377904       1 connection.go:180] GRPC call: /csi.v1.Controller/CreateSnapshot
I0717 18:09:11.377925       1 connection.go:181] GRPC request: {"name":"snapshot-bd0540a0-1912-44f8-a04c-2e69ab1c21c6","secrets":"***stripped***","source_volume_id":"b44ffa38-5d65-4346-9265-807d9c966d6f"}
I0717 18:09:11.439795       1 reflector.go:235] github.com/kubernetes-csi/external-snapshotter/pkg/client/informers/externalversions/factory.go:117: forcing resync
I0717 18:09:11.577934       1 connection.go:183] GRPC response: {"snapshot":{"creation_time":{"seconds":1560416046},"ready_to_use":true,"size_bytes":1073741824,"snapshot_id":"bfe22f76-c3af-48a8-a326-6fdc3e5a747d","source_volume_id":"b44ffa38-5d65-4346-9265-807d9c966d6f"}}
I0717 18:09:11.579576       1 connection.go:184] GRPC error: <nil>
I0717 18:09:11.584450       1 snapshotter.go:81] CSI CreateSnapshot: snapshot-bd0540a0-1912-44f8-a04c-2e69ab1c21c6 driver name [nfs.manila.csi.openstack.org] snapshot ID [bfe22f76-c3af-48a8-a326-6fdc3e5a747d] time stamp [&{1560416046 0 {} [] 0}] size [1073741824] readyToUse [true]
I0717 18:09:11.584530       1 snapshot_controller.go:640] Created snapshot: driver nfs.manila.csi.openstack.org, snapshotId bfe22f76-c3af-48a8-a326-6fdc3e5a747d, creationTime 2019-06-13 08:54:06 +0000 UTC, size 1073741824, readyToUse true
I0717 18:09:11.584563       1 snapshot_controller.go:645] createSnapshot [default/new-nfs-share-snap]: trying to update snapshot creation timestamp
I0717 18:09:11.584604       1 snapshot_controller.go:825] updating VolumeSnapshot[]default/new-nfs-share-snap, readyToUse true, timestamp 2019-06-13 08:54:06 +0000 UTC
I0717 18:09:11.588727       1 snapshot_controller.go:650] failed to update snapshot default/new-nfs-share-snap creation timestamp: snapshot controller failed to update default/new-nfs-share-snap on API server: the server could not find the requested resource (put volumesnapshots.snapshot.storage.k8s.io new-nfs-share-snap)

The snapshot is successfully created by the driver, but external-snapshotter is having trouble updating the snapshot object metadata.

I'm using external-snapshotter v1.2.0-0-gb3f591d8 on k8s 1.15.0 running with the VolumeSnapshotDataSource=true feature gate. The previous version of external-snapshotter, 1.1.0, works just fine though. Is this a regression, or rather a misconfiguration on my part? Always happy to debug more or provide more logs! Thanks!
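
A quick way to see which snapshot CRDs and API group versions the cluster is actually serving (a rough diagnostic sketch, assuming kubectl access to the affected cluster):

# List the installed snapshot CRDs; creation timestamps older than the v1.2.0 rollout
# suggest they were installed by an earlier snapshotter and never upgraded.
kubectl get crd | grep snapshot.storage.k8s.io
# Show which snapshot.storage.k8s.io versions the API server serves.
kubectl api-versions | grep snapshot.storage.k8s.io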

@gman0
Author

gman0 commented Jul 18, 2019

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jul 18, 2019
@xing-yang
Collaborator

@gman0 Thanks for reporting the issue! Do you see this "failed to update snapshot default/new-nfs-share-snap creation timestamp" message appear multiple times in the log? By default it should retry 5 times.

@xing-yang
Collaborator

/assign @zhucan

@zhucan
Member

zhucan commented Jul 19, 2019

@gman0 Can you paste your RBAC YAML file for the snapshotter?

@gman0
Author

gman0 commented Jul 19, 2019

@xing-yang Here's the whole log: https://pastebin.com/8TqMZqwt
@zhucan I've just copy-pasted the RBAC rules from this repo:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: openstack-manila-csi-controllerplugin
  labels:
    app: openstack-manila-csi
    component: controllerplugin
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openstack-manila-csi-controllerplugin
  labels:
    app: openstack-manila-csi
    component: controllerplugin
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.manila.csi.openstack.org/aggregate-to-openstack-manila-csi-controllerplugin: "true"
rules: []
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openstack-manila-csi-controllerplugin-rules
  labels:
    app: openstack-manila-csi
    component: controllerplugin
    rbac.manila.csi.openstack.org/aggregate-to-openstack-manila-csi-controllerplugin: "true"
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "watch", "create", "update", "patch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["csinodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshotcontents"]
    verbs: ["create", "get", "list", "watch", "update", "delete"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots/status"]
    verbs: ["update"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["create", "list", "watch", "delete", "get", "update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openstack-manila-csi-controllerplugin
  labels:
    app: openstack-manila-csi
    component: controllerplugin
subjects:
  - kind: ServiceAccount
    name: openstack-manila-csi-controllerplugin
    namespace: default
roleRef:
  kind: ClusterRole
  name: openstack-manila-csi-controllerplugin
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openstack-manila-csi-controllerplugin
  labels:
    app: openstack-manila-csi
    component: controllerplugin
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "watch", "list", "delete", "update", "create"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openstack-manila-csi-controllerplugin
  labels:
    app: openstack-manila-csi
    component: controllerplugin
subjects:
  - kind: ServiceAccount
    name: openstack-manila-csi-controllerplugin
    namespace: default
roleRef:
  kind: Role
  name: openstack-manila-csi-controllerplugin
  apiGroup: rbac.authorization.k8s.io
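
The rules above do include the volumesnapshots/status update that v1.2.0 needs, so RBAC looks fine. If in doubt, one way to confirm the binding actually took effect (service account name and namespace are taken from the ClusterRoleBinding above; needs a reasonably recent kubectl for --subresource):

# Should print "yes" if the snapshotter's service account may update the status subresource.
kubectl auth can-i update volumesnapshots.snapshot.storage.k8s.io --subresource=status \
  --as=system:serviceaccount:default:openstack-manila-csi-controllerplugin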

@xing-yang
Collaborator

@gman0 In v1.2.0, we added a new feature to support the status subresource. In order for that to work, you'll have to deploy the newer version of the snapshot CRDs. Since you had already installed the v1.1.0 snapshot CRDs in your k8s cluster, they were not installed again when you deployed external-snapshotter v1.2.0, so the snapshot controller and the CRDs are out of sync.

Is it possible for you to restart your k8s cluster and deploy external-snapshotter 1.2.0+ and test again?

We'll fix this so it is backward compatible.
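
Put differently, v1.2.0 updates the VolumeSnapshot status through the status subresource, and the older CRD definitions never declared it, which matches the "the server could not find the requested resource" error in the log above. A rough way to confirm (the group version for the v1.x CRDs should be v1alpha1; adjust if yours differs):

# If this prints nothing, the installed CRD does not declare the status subresource.
kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o yaml | grep -A 2 subresources
# With the subresource missing, the status endpoint itself returns
# "the server could not find the requested resource".
kubectl get --raw /apis/snapshot.storage.k8s.io/v1alpha1/namespaces/default/volumesnapshots/new-nfs-share-snap/status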

@zhucan
Member

zhucan commented Jul 22, 2019

@gman0 @xing-yang I have tested this on v1.15.0. I had already installed the v1.0.1 snapshot CRDs (image version v1.0.1) in my k8s cluster and snapshot creation succeeded, but when I upgraded the image to v1.2.0 the CRDs were not upgraded; the old CRDs were still used. So I deleted the snapshotter pod and the old CRDs and recreated the pod, and it works.

There is no need to restart the k8s cluster, only to delete the CRDs.

@xing-yang Maybe we should delete the old CRDs when creating the snapshotter pod if the CRD version differs from the image's version?
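
For reference, the workaround described above boils down to roughly the following (namespace and label selector are assumptions based on the manifests earlier in this thread; note that deleting the CRDs also deletes any existing VolumeSnapshot objects):

# Remove the stale CRDs that were installed by the older snapshotter.
kubectl delete crd volumesnapshots.snapshot.storage.k8s.io \
  volumesnapshotcontents.snapshot.storage.k8s.io \
  volumesnapshotclasses.snapshot.storage.k8s.io
# Restart the pod running the csi-snapshotter sidecar so it re-creates the CRDs,
# this time with the status subresource declared.
kubectl delete pod -l app=openstack-manila-csi,component=controllerplugin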

@gman0
Author

gman0 commented Jul 22, 2019

@xing-yang @zhucan ah I see, I'll try it out today, thanks!

@xing-yang
Collaborator

@zhucan Sure. Please work on a fix. Thanks.

@gman0
Author

gman0 commented Jul 22, 2019

@xing-yang @zhucan Removed the VolumeSnapshot CRDs and all is working now! Thank you both! :) A fix would be nice though :p Closing.

@gman0 gman0 closed this as completed Jul 22, 2019
@xing-yang
Collaborator

@gman0 I'm going to re-open this issue to keep track of the bug fix. Once it is fixed, we can close this again.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 21, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 20, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Sathishkunisai

/reopen
ceph/ceph:v14.2.11
quay.io/cephcsi/cephcsi:v2.1.2
quay.io/k8scsi/csi-attacher:v2.1.0
quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
quay.io/k8scsi/csi-provisioner:v1.6.0
quay.io/k8scsi/csi-resizer:v0.4.0
quay.io/k8scsi/csi-snapshotter:v2.1.0
rook/ceph:v1.3.9

kubectl get volumesnapshots shows "READYTOUSE false". Please suggest whether the version of the snapshotter I am using is correct.

@k8s-ci-robot
Contributor

@Sathishkunisai: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen
ceph/ceph:v14.2.11
quay.io/cephcsi/cephcsi:v2.1.2
quay.io/k8scsi/csi-attacher:v2.1.0
quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
quay.io/k8scsi/csi-provisioner:v1.6.0
quay.io/k8scsi/csi-resizer:v0.4.0
quay.io/k8scsi/csi-snapshotter:v2.1.0
rook/ceph:v1.3.9

kubectl get volumesnapshots shows "READYTOUSE false". Please suggest whether the version of the snapshotter I am using is correct.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xing-yang
Collaborator

I don't think what you're hitting is the same problem reported in this very old issue.
Can you try v2.1.1? We are actually going to cut a 2.2.0 release very soon.
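
A couple of checks that may help narrow it down in the meantime (only a sketch; csi-snapshotter v2.x works against the v1beta1 snapshot API, and since v2.0 the snapshot-controller runs as a separate deployment):

# The v1beta1 snapshot API has to be served for the v2.x sidecar to work.
kubectl api-versions | grep snapshot.storage.k8s.io
# The common snapshot-controller must be running somewhere in the cluster
# (deployment name and namespace vary by install).
kubectl get pods -A | grep snapshot-controller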

@Sathishkunisai

Sathishkunisai commented Sep 4, 2020

Thanks @xing-yang. To give more insight: my cluster is running on Azure Kubernetes Service (AKS), using Azure Disk to back the Rook volumes (OSDs), with the volumesnapshot=true feature gate enabled in the API server. I also followed the procedure in this link:
https://github.com/kubernetes-sigs/azuredisk-csi-driver/blob/master/docs/install-csi-driver-master.md

Apart from that, CSI_ENABLE_SNAPSHOTTER: "true" is enabled in operator.yaml.

What am I missing? Please advise.

@Sathishkunisai

Sathishkunisai commented Sep 4, 2020

@xing-yang I upgraded to 2.1.1 and the result is the same:

NAME                            READYTOUSE   SOURCEPVC                                     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS             SNAPSHOTCONTENT   CREATIONTIME   AGE
datadir-mongo-pvc-snapshot-15   false        datadir-mongo-replica-mongodb-replicaset-0                                          csi-rbdplugin-snapclass                                      3h7m
datadir-mongo-pvc-snapshot-17   false        datadir-mongo-replica-mongodb-replicaset-1                                          csi-rbdplugin-snapclass                                      28s
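
When READYTOUSE stays false, the events on the snapshot and the snapshotter sidecar logs usually say why. Roughly (pod and container names are assumptions for a typical Rook install):

# Events on the VolumeSnapshot and its VolumeSnapshotContent often carry the real error.
kubectl describe volumesnapshot datadir-mongo-pvc-snapshot-15
kubectl get volumesnapshotcontent
# Logs from the csi-snapshotter sidecar in the Rook RBD provisioner pod.
kubectl -n rook-ceph logs deploy/csi-rbdplugin-provisioner -c csi-snapshotter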

@Sathishkunisai

apiVersion: snapshot.storage.k8s.io/v1beta1
deletionPolicy: Delete
driver: rook-ceph.rbd.csi.ceph.com
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/snapshotter-secret-name: rook-ceph-csi
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
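
One thing worth double-checking against this class: the secret named in the parameters has to exist, otherwise CreateSnapshot calls fail and the snapshot never becomes ready. A quick check, assuming the names above:

# The snapshotter passes this secret to CreateSnapshot calls against the Ceph cluster.
kubectl -n rook-ceph get secret rook-ceph-csi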

@Madhu-1
Contributor

Madhu-1 commented Sep 4, 2020

@Sathishkunisai As this is more specific to Ceph, I would suggest you open an issue in the Rook repo and try to get help on the Rook Slack.
