Running Uptime-Kuma on Kubernetes #4530

Closed
2 tasks done
wachtell opened this issue Feb 26, 2024 · 2 comments

Labels
area:core issues describing changes to the core of uptime kuma help

Comments

@wachtell

wachtell commented Feb 26, 2024

⚠️ Please verify that this question has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

📝 Describe your problem

I am sorry if this has been reported before. I am running Uptime Kuma on a Kubernetes cluster with 3 Servers and 8 agent nodes on Ubuntu 22.04, with storage on Longhorn persistent volumes. I am getting a lot of "timeout of 48000ms exceeded", "getaddrinfo ENOTFOUND", and "Request failed with status code 520" errors even though the monitored site is up. I have changed the storage to a bind mount on the node, which helps somewhat and also indicates that this is a Kubernetes issue.

Can you help me figure out what I am doing wrong?

📝 Error Message(s) or Log

timeout of 48000ms exceeded
getaddrinfo ENOTFOUND
Request failed with status code 520

🐻 Uptime-Kuma Version

1.23.11

💻 Operating System and Arch

Ubuntu 22.04

🌐 Browser

Firefox 123.0

🖥️ Deployment Environment

Kubernetes Version: v1.27.10 +rke2r1

@wachtell wachtell added the help label Feb 26, 2024
@CommanderStorm CommanderStorm added the area:core issues describing changes to the core of uptime kuma label Feb 26, 2024
@CommanderStorm
Collaborator

Can you help me figure out what I am doing wrong?

Longhorn uses iSCSI or NFS under the hood, as I understand it.
=> uptime-kuma contains a database
=> you are running a database on a network share
=> possibly the added latency of reads/writes is killing the database performance and not #3515 or

Note that running SQLite databases on an NFS-style file system is unsound because of faulty file locking, which may lead to a corrupted database.
Please run uptime-kuma on a local volume instead.
See https://github.com/louislam/uptime-kuma/wiki/%F0%9F%94%A7-How-to-Install#-docker and https://www.sqlite.org/howtocorrupt.html#_filesystems_with_broken_or_missing_lock_implementations
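
As a rough way to check whether the data directory sits on an NFS-style mount, the sketch below parses text in the /proc/mounts format and reports the filesystem type behind a path. The function names, the list of network filesystem types, and the /app/data path in the usage note are assumptions for illustration, not part of uptime-kuma. The check only looks at the filesystem type, so a network-backed block device formatted as ext4 will still look local to it.

```python
import posixpath

# Assumed, non-exhaustive list of filesystem types treated as "network shares".
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "ceph", "glusterfs", "fuse.glusterfs", "9p"}


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Fields are: device, mount point, filesystem type, options, ...
            entries.append((fields[1], fields[2]))
    return entries


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path."""
    path = posixpath.normpath(path)
    best = None
    for mount_point, fs_type in parse_mounts(mounts_text):
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same mount point win.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best[1] if best else None


def is_network_filesystem(path, mounts_text):
    """True if path is backed by one of the assumed network filesystem types."""
    return filesystem_type(path, mounts_text) in NETWORK_FS_TYPES
```

Inside the container, one could pass the contents of /proc/mounts as mounts_text and the data directory (assumed here to be /app/data) as path. Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.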

I am running Uptime Kuma on a Kubernetes cluster with 3 Servers

HA will not work with uptime-kuma. Please don't run multiple instances of the same docker container as this may corrupt the database.
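
To illustrate why several instances sharing one SQLite file are risky: SQLite serializes writers through file locks, so on a filesystem with working locks a second writer is refused while the first holds a write transaction. The sketch below shows that behavior with Python's sqlite3 module. The function name and the heartbeat table are hypothetical and are not uptime-kuma's schema. On a filesystem with broken locking this protection may be missing, which is how concurrent writers can corrupt the file.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path):
    """Return True if a second connection cannot start a write transaction
    while the first connection holds one."""
    # isolation_level=None means transactions are controlled explicitly.
    first = sqlite3.connect(db_path, isolation_level=None)
    second = sqlite3.connect(db_path, isolation_level=None, timeout=0.1)
    try:
        # Hypothetical table, for illustration only.
        first.execute(
            "CREATE TABLE IF NOT EXISTS heartbeat (id INTEGER PRIMARY KEY, status INTEGER)"
        )
        first.execute("BEGIN IMMEDIATE")  # take the write lock
        first.execute("INSERT INTO heartbeat (status) VALUES (1)")
        try:
            second.execute("BEGIN IMMEDIATE")  # should fail: lock is held
        except sqlite3.OperationalError:
            return True
        second.execute("ROLLBACK")
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()
```

Running it against a scratch file on a local disk, for example os.path.join(tempfile.mkdtemp(), "kuma.db"), should report that the second writer is blocked.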

V2 includes an option to connect to an external database (or to continue with the embedded MariaDB/SQLite).
See #4500

In the meantime, choose a lower retention to mask this issue.

@wachtell
Author

@CommanderStorm Thank you for your fast and insightful comments! They are really helpful.
