Bunch of "failed to retrieve adjtimex stats: operation not permitted" logs #1934

Closed
bygui86 opened this issue Jan 19, 2021 · 8 comments · Fixed by #1938

Comments

@bygui86

bygui86 commented Jan 19, 2021

Host operating system: output of uname -a

I'm not able to SSH into the container. I suppose the user is nobody, as set in the YAML file:

# ...
    spec:
      serviceAccountName: node-exporter
      hostPID: true
      hostNetwork: true
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
# ...

node_exporter version: output of node_exporter --version

v1.0.1

node_exporter command line flags

--web.listen-address=127.0.0.1:9100
--path.procfs=/host/proc
--path.sysfs=/host/sys
--path.rootfs=/host/root
--no-collector.wifi
--no-collector.hwmon
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)

Are you running node_exporter in Docker?

Running in Kubernetes v1.17.14 using Containerd engine

What did you do that produced an error?

Nothing

What did you expect to see?

No errors like the ones shown below

What did you see instead?

A bunch of errors like:

level=error ts=2021-01-19T18:13:58.692Z caller=collector.go:161 msg="collector failed" name=timex duration_seconds=2.0179e-05 err="failed to retrieve adjtimex stats: operation not permitted"
level=error ts=2021-01-19T18:16:58.692Z caller=collector.go:161 msg="collector failed" name=timex duration_seconds=0.000932371 err="failed to retrieve adjtimex stats: operation not permitted"
level=error ts=2021-01-19T18:17:13.692Z caller=collector.go:161 msg="collector failed" name=timex duration_seconds=0.000178464 err="failed to retrieve adjtimex stats: operation not permitted"

Thanks in advance for any help or tip :)

@SuperQ
Member

SuperQ commented Jan 19, 2021

Sounds like an environment-specific security limitation. unix.Adjtimex() is not typically a privileged syscall for reads. I would check the security settings of the kernel / host os.

The best workaround is to disable the timex collector (--no-collector.timex).

We should probably make this a debug-level error if we get an "operation not permitted" error.

@bygui86
Author

bygui86 commented Jan 20, 2021

@SuperQ sorry, I forgot to mention that the Kubernetes cluster is provided by GCP, so I'm not sure whether I can check the security settings of the kernel.

Can you please share the command to check those security settings?
Can you also please share the configuration to disable the timex collector?

Thanks a lot for your help!

We should probably make this a debug-level error if we get an "operation not permitted" error.

Yeah, I agree that would be a great idea, most likely for everyone running node-exporter on cloud-provided k8s.

@discordianfish
Member

@bygui86 It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.

@bygui86
Author

bygui86 commented Jan 21, 2021

Sorry @discordianfish, but this is not a question. It is a bug report, or at least a feature improvement.

SuperQ added a commit that referenced this issue Jan 23, 2021
Handle case where Adjtimex syscall gets a permission denied error.

Fixes: #1934

Signed-off-by: Ben Kochie <[email protected]>
@bygui86
Author

bygui86 commented Jan 23, 2021

@SuperQ thanks for the MR! Amazing!
When are you going to release this improvement?

@SuperQ
Member

SuperQ commented Jan 24, 2021

I've been putting off a release for a while. But I'm going to work on it today so we can have 1.1.0 in Debian Stable.

@discordianfish
Member

@bygui86 That was meant as a response to your other questions. Thanks for the report though!

@bygui86
Author

bygui86 commented Jan 25, 2021

@discordianfish sorry I misunderstood!

oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this issue Apr 9, 2024
Handle case where Adjtimex syscall gets a permission denied error.

Fixes: prometheus#1934

Signed-off-by: Ben Kochie <[email protected]>