liveness: allow testing main heartbeat loop deterministically #107452
Labels
A-testing
Testing tools and infrastructure
C-cleanup
Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Liveness starts a goroutine that periodically heartbeats the node's own liveness record. As a result, many tests that interact with liveness need to exert control over this goroutine. It would be easier to test `NodeLiveness` if the concurrency were externalized. In other words, rather than `NodeLiveness.Start` spawning a goroutine that loops and runs code that can only be reached by that goroutine, the loop body should be a method on `NodeLiveness` that can be invoked manually, and `Start` should take a suitably defined abstraction over a "periodic runner". In prod, the periodic runner would be the familiar async task with a for loop. In testing, it may just be a no-op, and the harness can call the method directly whenever it wants to pretend the ping interval elapsed.
It's a bit tricky to get this right, but there is probably an abstraction here that applies similarly to many other subsystems in CRL that start auxiliary goroutines (which are then difficult to test against). We should architect to give tests as much control over concurrency as possible.
Jira issue: CRDB-30050