[metrics 2/x] Configure Prometheus Operator #687
Thanks for your PR,
To skip the vendors CIs use one of:
|
06e880c
to
75e8305
Compare
75e8305
to
06c86e1
Compare
Pull Request Test Coverage Report for Build 9903758109 (Coveralls)
06c86e1
to
37fbdf4
Compare
37fbdf4
to
7999f7e
Compare
7999f7e
to
58b9fb3
Compare
@@ -18,6 +18,7 @@ spec:
    name: sriov-network-metrics
    port: {{ .MetricsExporterPort }}
    targetPort: {{ .MetricsExporterPort }}
{{ if .IsOpenshift }}
I think we can also support k8s here.
Let's check whether the ServiceMonitor CRD exists in the cluster and deploy it in that case, instead of checking only for OpenShift. WDYT?
good idea, working on that
Done, I used the unstructured client to check if the ServiceMonitor resource definition is available in the cluster. Does it sound good?
58b9fb3
to
9b5e4cf
Compare
9b5e4cf
to
2bddf42
Compare
2bddf42
to
cec70bd
Compare
cec70bd
to
6da499a
Compare
6da499a
to
e87a82d
Compare
deploy/role.yaml
Outdated
@@ -32,6 +32,7 @@ rules:
  verbs:
  - get
  - create
  - list
I think we need this one also under the config folder so it will be generated for the bundle
controllers/helper.go
Outdated
func isPrometheusOperatorInstalled(ctx context.Context, client k8sclient.Reader) bool {
	u := &uns.UnstructuredList{}
	u.SetGroupVersionKind(schema.GroupVersionKind{
I was thinking (maybe that is not the right way) of doing the equivalent of `kubectl get crd servicemonitor`, rather than searching for any ServiceMonitor object in the cluster :)
I think we can try getting the CRD (e.g. see this).
The drawback is that we have to add the permissions (ClusterRole, ClusterRoleBinding, ...) to let the operator read that CustomResourceDefinition resource, but it might end up cleaner.
I think it should be OK to add `get` for CRDs; that should not expose the operator to any security issues :)
I had to add the permission to the ClusterRole (instead of the Role), as CustomResourceDefinition is not namespaced.
For the same reason, I had to add a non-namespaced client to the SriovOperatorConfigReconciler.
Please take a look.
}

if r.PlatformHelper.IsOpenshiftCluster() {
	err = utils.AddLabelToNamespace(ctx, vars.Namespace, "openshift.io/cluster-monitoring", "true", r.Client)
I think we need to put this in the namespace creation template and not give the operator permission to update the namespace; that sounds like a security risk.
Sounds good. Maybe we can leverage the `operatorframework.io/cluster-monitoring` annotation in the CSV, in the openshift fork.
@zeeke can you rebase this one?
e87a82d
to
17dabd0
Compare
7aacaae
to
2faab1d
Compare
2faab1d
to
19e5be7
Compare
19e5be7
to
41106a5
Compare
@adrianchiris, @SchSeba please take another look
just some small comments :)
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: openshift-monitoring
This is good for OpenShift. For vanilla k8s, please add a variable to the Helm chart, something like
https://github.com/metallb/metallb/blob/21dd75560f3b8614c14b1bb55a79dbcc231e36a7/charts/metallb/templates/servicemonitor.yaml#L192
Added environment variables and Helm stuff to make the subject configurable.
pkg/utils/cluster.go
Outdated
@@ -161,3 +161,28 @@ func AnnotateNode(ctx context.Context, nodeName string, key, value string, c cli
	return AnnotateObject(ctx, node, key, value, c)
}

func AddLabelToNamespace(ctx context.Context, namespaceName, key, value string, c client.Client) error {
	ns := &corev1.Namespace{}
I was not able to find where we use this one.
In general, I think we should document the need to add a monitoring label at namespace creation, rather than add RBAC that allows the operator to update the namespace object.
I removed the function. I will re-add it in the e2e test PR.
About permissions, the operator already had the RBAC to write on namespaces (see deploy/clusterrole.yaml and openshift CSV). No permission has been added in this PR for namespaces.
I'll take care of documenting the namespace configuration in OpenShift
a39a440
to
108ef15
Compare
@adrianchiris, @SchSeba please take another look
hack/env.sh
Outdated
@@ -41,3 +41,5 @@ export DEV_MODE=${DEV_MODE:-"FALSE"}
export OPERATOR_LEADER_ELECTION_ENABLE=${OPERATOR_LEADER_ELECTION_ENABLE:-"false"}
export METRICS_EXPORTER_SECRET_NAME=${METRICS_EXPORTER_SECRET_NAME:-"metrics-exporter-cert"}
export METRICS_EXPORTER_PORT=${METRICS_EXPORTER_PORT:-"9110"}
export METRICS_EXPORTER_PROMETHEUS_OPERATOR_SERVICE_ACCOUNT=${METRICS_EXPORTER_PROMETHEUS_OPERATOR_SERVICE_ACCOUNT:-"prometheus-k8s"}
I think we need to override this one in the openshift CI file, no?
ok, moving there
19cc04e
to
83801a5
Compare
I got rid of the CustomResourceDefinition clusterrole access and now the installation of the Prometheus operator is inferred by the presence of the @adrianchiris @SchSeba please take another look |
Great work! much cleaner now thanks for working on this!
83801a5
to
4af0a6a
Compare
Package `github.com/prometheus-operator/prometheus-operator/pkg/client` can be used for testing purpose. Signed-off-by: Andrea Panattoni <[email protected]>
Deploy the configuration needed for the Prometheus operator to find and scrape the sriov-network-metrics-exporter endpoints, including the ServiceMonitor, Role and RoleBinding. Resources are installed only if the Prometheus operator is installed. When using `ServiceMonitors`, Prometheus Operator needs permissions to read Services, Endpoints and Pods in the monitored namespace (i.e. the SRIOV operator ns). Make the ServiceAccount subject configurable via environment variables. Signed-off-by: Andrea Panattoni <[email protected]>
4af0a6a
to
3dff029
Compare
just a small nit
@@ -137,6 +138,10 @@ var _ = BeforeSuite(func() {
	Expect(err).NotTo(HaveOccurred())
	err = os.Setenv("METRICS_EXPORTER_KUBE_RBAC_PROXY_IMAGE", "mock-image")
	Expect(err).NotTo(HaveOccurred())
	err = os.Setenv("METRICS_EXPORTER_PROMETHEUS_OPERATOR_SERVICE_ACCOUNT", "k8s-prometheus")
do you need the new variable here?
no, I set it on the fly here:
https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/687/files#diff-35c949584634dae6fb47d82556f92d648986892e843a64d1eb9bd294375a61dcR370
So I can test both values
LGTM!
Deploy the configuration needed for the Prometheus operator to find and scrape the sriov-network-metrics-exporter endpoints, including the ServiceMonitor, Role and RoleBinding.
depends on:
sriov-network-metrics-exporter #655