Fix error spam on AKS #33697
Conversation
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
This pull request is now in conflicts. Could you fix it? 🙏
looks good (subject to the lint errors)
@rdner I think you can repro this in any Kubernetes setup (not just AKS) that uses CRI-O. If you run `kubectl get nodes -o wide` and the last column (the container runtime) reports CRI-O rather than Docker, you should see the same error spam.
This was happening due to error-level logging when the log path matcher detected a `log.file.path` that does not start with the standard Docker container log folder `/var/lib/docker/containers`, because AKS dropped support for Docker in September 2022 and switched to containerd. It looks like this message was not supposed to be on the error level in the first place, since it just means that the matcher didn't match, which is not an error. It was mistakenly promoted from the debug level in #16866, most likely because the message started with `Error` and looked confusing. This is a partial fix to unblock our customers; full AKS/containerd support still needs to come in a follow-up change.
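For illustration, here is a minimal, self-contained Go sketch of the behaviour described above. The names `defaultLogsPath` and `containerIDFromLogPath` are hypothetical, not Filebeat's actual API in the `add_kubernetes_metadata` processor: the matcher only extracts a container ID when the path starts with the Docker log folder, and a non-matching path is reported at debug level instead of error.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// Illustrative constant, not the actual Filebeat configuration value.
const defaultLogsPath = "/var/lib/docker/containers/"

// containerIDFromLogPath returns the container ID encoded in a Docker log
// path, or "" when the path does not come from the Docker log folder.
func containerIDFromLogPath(logPath string) string {
	if !strings.HasPrefix(logPath, defaultLogsPath) {
		// Before this fix the message was logged at error level, which flooded
		// the logs on containerd/CRI-O clusters such as AKS. A non-matching
		// path only means "no match", so debug level is enough.
		log.Printf("DEBUG: path %q does not start with %q, skipping", logPath, defaultLogsPath)
		return ""
	}
	// Docker log paths look like
	// /var/lib/docker/containers/<container-id>/<container-id>-json.log,
	// so the container ID is the first path element after the prefix.
	rest := strings.TrimPrefix(logPath, defaultLogsPath)
	return strings.SplitN(rest, "/", 2)[0]
}

func main() {
	// containerd/CRI-O style path, as seen on AKS after the Docker deprecation:
	fmt.Printf("%q\n", containerIDFromLogPath("/var/log/pods/ns_pod_uid/app/0.log"))
	// Classic Docker path:
	fmt.Printf("%q\n", containerIDFromLogPath("/var/lib/docker/containers/abc123/abc123-json.log"))
}
```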
Why is it important?
There is a lot of feedback from customers that Elastic Agent becomes unhealthy because of the overwhelming volume of error messages from Filebeat caused by this issue.
Checklist
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding changes to the default configuration files
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added an entry in `CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`

Related issues