Uptime calculation improvement and 1-year uptime #2750
Conversation
Just a wild suggestion: in theory, maybe we can cache the window sums so that each update for a particular window only does one read and one write.
I haven't tried this and don't know if there are edge cases to handle, though. Edit: Oops, I realized that if the beat leaving the window has a longer duration than the current rate, it would lead to the same beat being subtracted multiple times... I guess if we can somehow handle these two cases, it would work?
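The cached sliding-window idea could be sketched roughly like this (hypothetical names and structure, not the actual uptime-kuma code). The key point is that each evicted beat is subtracted exactly once, which sidesteps the double-subtraction edge case described above:

```javascript
// Sketch of a cached sliding-window uptime counter (hypothetical API).
class SlidingUptime {
    constructor(windowMs) {
        this.windowMs = windowMs;
        this.beats = [];      // beats currently inside the window: { time, up, duration }
        this.upTime = 0;      // cached sum of "up" durations
        this.totalTime = 0;   // cached sum of all durations
    }

    // Amortized O(1): add the new beat, then evict beats that left the window.
    push(beat, now = beat.time) {
        this.beats.push(beat);
        if (beat.up) {
            this.upTime += beat.duration;
        }
        this.totalTime += beat.duration;

        const cutoff = now - this.windowMs;
        while (this.beats.length > 0 && this.beats[0].time < cutoff) {
            const old = this.beats.shift();
            // Each beat is removed from the array when it is subtracted,
            // so it can never be subtracted twice.
            if (old.up) {
                this.upTime -= old.duration;
            }
            this.totalTime -= old.duration;
        }
    }

    uptime() {
        return this.totalTime > 0 ? this.upTime / this.totalTime : 1;
    }
}
```

This still stores every beat inside the window, but each uptime query is a single cached division instead of a full re-sum.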
Good point. I also rechecked the current logic, and it seems the edge cases are not handled correctly there either. But I am not quite sure. I think for this part it may be good to start by writing test cases first.

uptime-kuma/server/model/monitor.js Lines 992 to 1020 in e241728
Force-pushed from b404af6 to 7231763
The uptime calculation actually gives me a bit of a headache, because when I tried to look into it, there were a lot of weird cases to handle. I am rethinking the time-series database option. QuestDB seems quite promising, because it can sum up a large set of data with a simple SQL query. If it really gains a lot of performance, it may be an ultimate solution for #1740 too. But I don't know the RAM usage, so I will try to import 1,000,000 heartbeat records into QuestDB and test it.
After some tests, I think QuestDB is really the way to go. For example, I tried to sum up 30-day uptime:

QuestDB result (Oracle Cloud free instance, 2 cores + 1 GB RAM): execution time around 2 ms - 7 ms.

SQLite result (my notebook, 11th-gen i7, 8 cores + 16 GB RAM): execution time around 69 ms - 75 ms. I don't know how to make the sqlite command display the execution time, so I ran it on my PC.

So even though the Oracle Cloud instance is weak, it is still faster than SQLite on my PC.
Hello, I find it strange that QuestDB is so resource intensive, because their presentation (Docker Hub introduction) says: "QuestDB is an open-source database designed to make time-series lightning fast and easy." Is it possible to test with QuestDB in an external Docker container? What is the final strategy: a TSDB plus a relational database (external MariaDB or MySQL/SQLite)? Otherwise, on my side, I find Redis TSDB / InfluxDB / QuestDB very good, in addition to improving the graphing part.
I will likely stick with SQLite/MariaDB, as the setup is easier and it won't use a lot of RAM. I may look into a sliding-window or rolling-window algorithm later.
By using PostgreSQL we could use TimescaleDB as an extension and activate it for some tables only; also, I love Postgres more than MySQL 🙂
Do you already have a timeframe for when this feature will be released?
See #2720 (comment)
I think most time-series databases use too much memory, which is not ideal for Uptime Kuma (I hope Uptime Kuma can stay relatively lightweight), so I went back to my original plan: aggregate tables. It is going well; getting a 1-year uptime takes less than 1 ms.
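A rough sketch of the aggregate-table idea (hypothetical schema and function names; the real PR's tables may differ): store one row of up/down counts per monitor per day, so a 1-year uptime is a sum over at most ~365 small rows instead of every raw heartbeat.

```javascript
// Sketch: compute uptime from a daily aggregate table (hypothetical structure).
// Each row holds the heartbeat counts for one monitor on one day,
// e.g. { day: "2023-01-01", up: 4300, down: 20 }.
function uptimeFromDailyRows(rows) {
    let up = 0;
    let total = 0;
    for (const row of rows) {
        up += row.up;
        total += row.up + row.down;
    }
    // Convention assumed here: no data means 100% uptime.
    return total > 0 ? up / total : 1;
}

// ~365 rows instead of ~1.5 million raw heartbeats for a 1-year window.
const rows = [
    { day: "2023-01-01", up: 4300, down: 20 },
    { day: "2023-01-02", up: 4320, down: 0 },
];
console.log(uptimeFromDailyRows(rows)); // a ratio in [0, 1]
```

Because the rows are just small integer counters, the whole year's data fits comfortably in a single indexed scan.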
Also, I added support for the native Node.js test runner along with this PR.
The size of the database will get bigger and bigger soon, I think 🤔 But query performance is great...
I tried to make these tables as small as possible; the columns are all int and float. Also, I will try to save some space by eliminating duplicate strings in #3595.
I think only one record per monitor in the aggregated tables is enough. Can you tell me if we need more than that?
For that we could use a custom …
Table size is a decent downside if performance is great.
Big tables may make select queries slower...
Isn't the whole point of a database to have big tables?
Is this WIP or already a feature? I can't find a setting to change the status page uptime to monthly etc. in the latest version.
There is no such feature available.
Try to improve the uptime calculation performance with an aggregate table.
However the definition of uptime will be a little bit different:
Assume that the heartbeat interval is 20 seconds.
Before:
The current best/worst case is 3 * 60 * 24 * 30 = 129,600 heartbeats for 30-day uptime, which means it sums up 129,600 numbers. This process is triggered 3 times per minute for each monitor.
After:
The worst case of summation is the 1-year window: 3 * 60 * 24 + 364 = 4,684 numbers. The best case (at 00:00) is 29 numbers for 1-month / 364 numbers for 1-year. It should be a lot faster.
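The before/after worst cases above can be double-checked with a quick calculation:

```javascript
// Worst-case number of values summed per uptime query, assuming a 20-second
// heartbeat interval (3 heartbeats per minute).
const before30d = 3 * 60 * 24 * 30; // every raw heartbeat in the 30-day window
const after1y = 3 * 60 * 24 + 364;  // today's raw heartbeats + 364 daily aggregate rows

console.log(before30d); // 129600
console.log(after1y);   // 4684
```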
Using an aggregate table was actually suggested by ChatGPT; it also suggested a time-series database, which I did not consider first.