This repository has been archived by the owner on Nov 17, 2020. It is now read-only.

Expose key individual health checks via HTTP API #844

Closed
michaelklishin opened this issue Sep 9, 2020 · 0 comments · Fixed by #854

michaelklishin commented Sep 9, 2020

Team RabbitMQ's approach to health checks has changed in the last year. There is now a group of simple, focussed, composable health checks provided by CLI tools. The HTTP API does not expose them yet: it currently offers only the original One True Health Check™, which has a number of well-known downsides:

  • It is complex, as it checks for N things at once
  • Users do not really understand what it does
  • It is very intrusive: it forces every channel and queue primary replica to emit some stats
  • As a result, it is very likely to introduce false positives under heavy load

Given these changes in CLI tools, it makes sense to:

  • Deprecate GET /api/health/checks/node and GET /api/health/checks/node/{node} endpoints (their respective CLI commands are already deprecated)
  • Introduce a number of focussed health check endpoints

The endpoint tentatively looks like this:

GET /api/health/checks/{check}

so, for example

  • GET /api/health/checks/alarms
  • GET /api/health/checks/local-alarms
  • GET /api/health/checks/certificate-expiration
  • GET /api/health/checks/port-listener
  • GET /api/health/checks/protocol-listener
  • GET /api/health/checks/virtual-hosts
  • GET /api/health/checks/node-is-mirror-sync-critical
  • GET /api/health/checks/node-is-quorum-critical
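
As a rough client-side illustration of the tentative endpoint shape above, here is a minimal Python sketch. It is not part of the proposal. The helper names (`health_check_url`, `run_check`), the use of HTTP basic authentication, and the rule that a 2xx status means "healthy" while any other HTTP status means "unhealthy" are all assumptions made for the example; the issue does not specify response codes or bodies.

```python
import base64
import urllib.error
import urllib.request

# Check names copied from the proposal above; the final set may differ.
PROPOSED_CHECKS = (
    "alarms",
    "local-alarms",
    "certificate-expiration",
    "port-listener",
    "protocol-listener",
    "virtual-hosts",
    "node-is-mirror-sync-critical",
    "node-is-quorum-critical",
)


def health_check_url(base_url: str, check: str) -> str:
    """Build the tentative GET /api/health/checks/{check} URL (hypothetical helper)."""
    if check not in PROPOSED_CHECKS:
        raise ValueError(f"unknown health check: {check!r}")
    return f"{base_url.rstrip('/')}/api/health/checks/{check}"


def run_check(base_url: str, check: str, user: str, password: str,
              timeout: float = 5.0) -> bool:
    """Return True if the contacted node reports the check as passing.

    Assumption: a 2xx response means the check passed and any other HTTP
    status means it failed. Connection errors are not caught here and
    propagate to the caller as urllib.error.URLError.
    """
    request = urllib.request.Request(health_check_url(base_url, check))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError:
        # The server answered, but with a non-2xx status.
        return False
```

For example, `run_check("http://localhost:15672", "alarms", "guest", "guest")` would query the node behind that management listener; the host, port, and credentials here are placeholders. Because each endpoint covers a single concern, a monitoring tool can poll only the checks it cares about and combine the results itself.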

Note that the checks will be executed on the local node. There is no option to run checks on a remote node as this feature makes little sense. CLI checks also work on the contacted node only.

The list above excludes the port_connectivity check since it makes no sense to run it on the node itself: CLI tools run it on the host where they are used, and HTTP API clients cannot do the same.

References #840.

@michaelklishin michaelklishin self-assigned this Sep 9, 2020
@michaelklishin michaelklishin added this to the 3.8.10 milestone Sep 9, 2020
michaelklishin added a commit to rabbitmq/rabbitmq-server that referenced this issue Oct 5, 2020
michaelklishin added a commit to rabbitmq/rabbitmq-ct-helpers that referenced this issue Oct 6, 2020
michaelklishin added a commit to rabbitmq/rabbitmq-ct-helpers that referenced this issue Oct 6, 2020