Silences are not propagated in a ha/mesh configuration (v0.15.0-rc1) #1312
Comments
What are you setting as your There's also a lookup error, so it could be related to #1307 |
Humm, might be related indeed, I don't have that error with 0.14.0. Looking at the generated args from the prometheus-operator, they are indeed different: v0.14.0
v0.15.0-rc.1
|
@gmauleon your status page shows that the cluster is up and running so it is weird that the silences aren't propagated. I've tested in my local env (without the Prometheus operator but very similar setup with Statefulsets) and I can't reproduce it. Maybe you could share the statefulset definition which is generated by the operator? |
@brancz @fabxc have either of you encountered anything like this using the Prometheus operator? Maybe one of you has a chance to take a look; I don't have access to a k8s cluster using this. @gmauleon based on the status page, it does look like it's connected to a peer ... can you specify |
I ran into this same issue last night when running v0.18.0 of prometheus-operator/kube-prometheus (on a K8s cluster in AWS) - with my own modified copy of manifests/alertmanager.yaml to change the version of alertmanager being used to v0.15.0-rc.0. But then I switched to using v0.15.0-rc.1 and everything worked. So perhaps a change was made in (Reference info: the Support Alertmanager v0.15.0 PR was merged to master on 3/22, thus it was included when Cut 0.18.0 happened on 4/4.) |
Sorry guys, I couldn't find the time to test further today (Stuart's suggestion). I will look into it by Monday evening at the latest. And in my case I was indeed testing with rc1 |
I am able to reproduce this issue with:
I will look further into this. Eventually we should add a test AddingSilenceCheckIfPropagated to the Prometheus operator e2e test suite. @gmauleon Thanks a lot for reporting this. |
Another thing that was interesting for me is I didn't encounter this issue at all with minikube - not even when I was using rc0. |
@gmauleon prometheus-operator/prometheus-operator#1193 should fix the issue.
@jolson490 That is very surprising to me. This should have never worked, even on minikube. |
Thanks! |
@mxinden all for this, let's make it happen. |
What did you do?
Create a 2-replica Alertmanager setup.
Create a silence in alertmanager from the exposed UI (silenced the default "DeadManSwitch")
What did you expect to see?
All alertmanagers in the mesh should have the silence set
What did you see instead? Under which circumstances?
Only one of the 2 replicas seems to have the silence set
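For anyone trying to confirm the same symptom without clicking through each replica's UI, here is a minimal sketch that compares the silence IDs reported by every replica. It assumes the replicas expose the v1 HTTP API at `/api/v1/silences` and that the response has the shape `{"status": "success", "data": [{"id": ..., "status": {"state": ...}}]}`. The function names are illustrative and not part of Alertmanager or the operator.

```python
import json
from urllib.request import urlopen


def active_silence_ids(payload):
    """Extract the IDs of active silences from a decoded API response."""
    ids = set()
    for silence in payload.get("data", []):
        # Assumption: each silence carries status.state; if the field is
        # absent, the silence is treated as active.
        state = silence.get("status", {}).get("state", "active")
        if state == "active":
            ids.add(silence["id"])
    return ids


def fetch_silence_ids(base_url, timeout=5.0):
    """Query one replica (assumed v1 API path) and return its active silence IDs."""
    with urlopen(f"{base_url}/api/v1/silences", timeout=timeout) as resp:
        return active_silence_ids(json.load(resp))


def missing_silences(ids_by_replica):
    """Map each replica to the silence IDs it lacks but some other replica has."""
    all_ids = set().union(*ids_by_replica.values())
    return {
        replica: all_ids - ids
        for replica, ids in ids_by_replica.items()
        if all_ids - ids
    }
```

To use it, call `fetch_silence_ids` once per replica URL (for example through a per-pod port-forward), collect the results in a dict keyed by replica name, and pass that dict to `missing_silences`. An empty result means every replica reports the same set of active silences. Propagation between peers is not instantaneous, so it is worth waiting a few seconds after creating the silence before checking.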
Environment
prometheus-operator: v0.18.0
alertmanager: v0.15.0-rc.1
prometheus: v2.2.1
Notes
Reverted back to alertmanager v0.14.0 and it was working properly.
Sorry in advance if this is already on the radar.
Thanks guys.