
Too many logs produced by container #564

Closed

deyanp opened this issue Jun 30, 2021 · 23 comments
Labels
bug Something isn't working

Comments

@deyanp

deyanp commented Jun 30, 2021


What steps did you take and what happened:
Looked at the Log Analytics -> Logs with the following query:

ContainerLog
| summarize count() by replace('xxx.azurecr.io/', '', Image)
| order by count_ desc

and saw on the top 2 places:

[screenshot of the query results]

This seems to be because of the secret rotation interval of 2m (which I have enabled), and probably because I have more than 50 pods in my cluster using the CSI driver / AAD Pod Identity.

What did you expect to happen:
Wondering how to decrease the number of logs ...

Anything else you would like to add:
No

Which access mode did you use to access the Azure Key Vault instance:
Pod Identity

Environment:

  • Secrets Store CSI Driver version: (use the image tag): 2.2.0
  • Azure Key Vault provider version: (use the image tag): 0.0.15
  • Kubernetes version: (use kubectl version and kubectl get nodes -o wide): 1.20.5
  • Cluster type: (e.g. AKS, aks-engine, etc): AKS
@deyanp deyanp added the bug Something isn't working label Jun 30, 2021
@nilekhc
Contributor

nilekhc commented Jun 30, 2021

Hi @deyanp, thanks for reporting this issue. Could you tell us which log level you are currently using? The default is 0. You can set the log verbosity when installing via the Helm chart with the --set logVerbosity=<level> flag.

@deyanp
Author

deyanp commented Jun 30, 2021

@nilekhc I am installing using YAML manifests, and I don't see this arg specified anywhere inside them ..

@nilekhc
Contributor

nilekhc commented Jun 30, 2021

You can use the -v arg in the YAML. For example, -v=10.
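For reference, this is roughly where such an arg would sit in a manifest; the container name and surrounding structure here are illustrative assumptions, not copied from the actual deployment YAML:

```yaml
# Illustrative fragment only: setting klog verbosity via the -v arg
# discussed above. Container name and structure are assumptions.
spec:
  containers:
    - name: provider-azure-installer
      args:
        - -v=0
```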

@deyanp
Author

deyanp commented Jul 1, 2021

Thanks, @nilekhc, I will set an arg -v=X, but how do I choose the right value, so that:

  1. I do not get the standard info messages that are currently filling the logs
  2. I still get warnings, errors and important/critical messages?

How did you come up with the -v=10? Why not 9 or 11?

@deyanp
Author

deyanp commented Jul 1, 2021

@nilekhc, note that I already read #387, but it did not give me any hope that I can properly configure the log level with -v ... there seems to be no way to show only warnings/errors but not info/debug?

@deyanp
Author

deyanp commented Jul 1, 2021

also this one provides no solution: #358

@nilekhc
Contributor

nilekhc commented Jul 1, 2021

@deyanp -v=10 was just an example, to try out different logging levels that might be applicable to your scenario. Also, it looks like there is an open issue about this in klog.

@deyanp
Author

deyanp commented Jul 1, 2021

@nilekhc, you know that -v=10 will make it worse, right? The higher the value, the more logs are included; this is the opposite of what I am used to from other logging libraries ...
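That is indeed how klog's -v works: it is an inclusion threshold, so a message logged at V(n) is emitted whenever the configured verbosity is >= n, and raising -v always produces more output. A minimal stdlib mimic of that semantic (this is not the real klog API, just an illustration):

```go
package main

import "fmt"

// Minimal mimic of klog's -v semantics (not the real klog API):
// a message logged at level n is emitted only when the configured
// verbosity is >= n, so raising -v always produces MORE output.
type logger struct{ verbosity int }

func (l logger) V(level int) bool { return l.verbosity >= level }

// emitted counts how many of the given call-site levels would
// actually produce a log line at the configured verbosity.
func emitted(verbosity int, levels []int) int {
	l := logger{verbosity: verbosity}
	count := 0
	for _, lvl := range levels {
		if l.V(lvl) {
			count++
		}
	}
	return count
}

func main() {
	levels := []int{0, 2, 5, 10} // hypothetical call sites: V(0), V(2), V(5), V(10)
	fmt.Println(emitted(0, levels))  // 1: only the V(0) message
	fmt.Println(emitted(10, levels)) // 4: everything up to V(10)
}
```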

@deyanp
Author

deyanp commented Jul 4, 2021

This is really not nice, and leaves customers with 2 options:

  1. Higher costs for Azure Log Analytics (we are talking about hundreds of EUR here, not a few EUR!)

  2. Increasing the --rotation-poll-interval; however, this exposes us more to the other issue, "Pod Env Vars from SecretProviderClass (envFrom+secretRef) not updated upon deployment" #368, see #368 (comment) ...

Not nice ...

@aramase
Member

aramase commented Jul 12, 2021

@deyanp Thank you for the feedback. The current logs generated for the provider are per secret/key/cert object defined in the SecretProviderClass. It generates 3 logs per object, which provide info on auth and the success/error response, making debugging easier in case of issues. There is no difference in the gRPC calls from driver to provider between the initial mount and rotation calls, which is why the logs are consistent across all calls. We can certainly explore increasing the log verbosity for 2 of the 3 logs per secret/key/cert object, but that will make debugging a little harder: in case of errors, you would need to explicitly increase the provider log verbosity and re-enable it. The correct long-term solution will be the ability to set a log threshold and only get the error/warning logs, which is tracked here.
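To see why this adds up, a back-of-the-envelope calculation based on the numbers in this thread (3 log lines per object per rotation, a 2-minute poll interval, and deyanp's ~50 pods; one object per pod is an assumption for illustration):

```go
package main

import "fmt"

// Back-of-the-envelope log volume from the description above:
// 3 log lines per secret/key/cert object per rotation, polled
// every 2 minutes. Pod count and objects-per-pod are assumptions.
func linesPerDay(pods, objectsPerPod, linesPerObject, pollsPerDay int) int {
	return pods * objectsPerPod * linesPerObject * pollsPerDay
}

func main() {
	pollsPerDay := 24 * 60 / 2 // a poll every 2 minutes = 720/day
	fmt.Println(linesPerDay(50, 1, 3, pollsPerDay)) // 108000 lines/day
}
```

Even with a single object per pod, that is over a hundred thousand ingested log lines per day, which is consistent with the Log Analytics costs deyanp describes.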

@deyanp
Author

deyanp commented Jul 19, 2021

@aramase If I set the level to error, I expect errors to appear in the logs but informational messages not to ... I'm not really sure why I would need an informational log every 2 minutes saying that secret x has been refreshed ... In fact, currently the informational logs are written regardless of whether the secret changed or not; I would expect a log message saying a secret was refreshed only if the underlying secret actually changed. Why would I need an info log every 2 minutes saying the same (unchanged) secret has been refreshed?
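The change-only logging deyanp asks for could be sketched like this; this is a hypothetical illustration, not the provider's actual code (the function and map names are assumptions):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hypothetical sketch: report a rotation only when the fetched
// secret value actually differs from the last observed value,
// instead of logging on every poll.
var lastSeen = map[string][32]byte{}

func rotated(name string, value []byte) bool {
	sum := sha256.Sum256(value)
	if prev, ok := lastSeen[name]; ok && prev == sum {
		return false // unchanged: stay quiet
	}
	lastSeen[name] = sum
	return true // first observation or a real change: worth a log line
}

func main() {
	fmt.Println(rotated("db-conn", []byte("v1"))) // true: first observation
	fmt.Println(rotated("db-conn", []byte("v1"))) // false: unchanged
	fmt.Println(rotated("db-conn", []byte("v2"))) // true: value changed
}
```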

@github-actions

github-actions bot commented Aug 3, 2021

This issue is stale because it has been open 14 days with no activity. Please comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Aug 3, 2021
@aramase aramase removed the stale label Aug 3, 2021
@pierluigilenoci
Contributor

@deyanp the problem is that the bug is inside the Klog library that many of the Kubernetes components use. To solve the problem, Klog needs to be fixed.

Ref: kubernetes/klog#212

@deyanp
Author

deyanp commented Aug 4, 2021

@pierluigilenoci, check the issue you referenced yourself - dims there implies the issue is not with klog but with the client applications (secrets-store-csi-driver-provider-azure) ..

@pierluigilenoci
Contributor

@deyanp I have read the Klog code and I have also proposed a PR.
I assure you that the error is on the Klog side. 😉

@github-actions

This issue is stale because it has been open 14 days with no activity. Please comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Aug 26, 2021
@pierluigilenoci
Contributor

🚀

@github-actions github-actions bot removed the stale label Aug 27, 2021
@github-actions

This issue is stale because it has been open 14 days with no activity. Please comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale label Sep 10, 2021
@deyanp
Author

deyanp commented Sep 10, 2021

@pierluigilenoci could you then please answer the question of / convince dims in kubernetes/klog#212 that this is an issue in klog, and that your PR makes sense?

@pierluigilenoci
Contributor

@deyanp I've already tried.

They basically said it is a known bug, but there is no intention to fix it, because a fix would change the logger's behavior for practically all Kubernetes components, and any fix must maintain backward compatibility. Obviously that is impossible: the logger works badly, fixing it necessarily means changing its behavior, and changing the behavior is not allowed. So we're stuck.

@pierluigilenoci
Contributor

As soon as I have a few hours of time I will try to find a solution that is backward compatible but I don't promise anything. 😞

@deyanp
Author

deyanp commented Sep 10, 2021

As soon as I have a few hours of time I will try to find a solution that is backward compatible but I don't promise anything.

@pierluigilenoci , appreciated, I understand that you may not have time for this!

@aramase as a workaround, if I change --rotation-poll-interval from 2 to 30 minutes, for example (to work around the too-many-logs problem this way), will I hit the other issue again, where your last comment was "We'll also explore what it would mean to enable the Kubernetes secret update during new deployments with the rotation feature also enabled.", or has that been solved already?
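For reference, the flag change under discussion might look like this in the driver's container spec; the container name and placement are assumptions for illustration, not copied from the actual manifest:

```yaml
# Illustrative fragment only: raising the rotation poll interval
# from the default discussed in this thread. Names are assumptions.
spec:
  containers:
    - name: secrets-store
      args:
        - --rotation-poll-interval=30m
```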

@github-actions github-actions bot removed the stale label Sep 11, 2021
@deyanp
Author

deyanp commented Sep 21, 2021

Ok, I guess this is going nowhere, so I found an alternative solution (in case another poor soul needs it) - see below.

  1. Upon pod start, I used the trick Environment.SetEnvironmentVariable(key, Environment.GetEnvironmentVariable(key) |> get from key vault). So basically I retrieve the connection string from Key Vault and then set it back into the same env var, before the env var is read, for example by the WebJobs SDK, which needs a clear-text connection string in an env var

  2. Deleted all secret provider classes and secrets, and reduced the rotation frequency from every 2m to every 60m, as only the Traefik certs now need to be rotated

Basically we stopped using the CSI driver in 99% of the cases, as it was used mainly for mapping secrets from Key Vault to env vars.

@deyanp deyanp closed this as completed Sep 21, 2021

4 participants