Monitoring stack

This proof of concept analyzes the options and requirements for monitoring distributed applications and their interfaces. It implements a layered container stack built on the open-source observability stack of Grafana Labs and on tools from other vendors.

Warning

This is a proof of concept and far from a production-ready setup! Please consult the documentation of each component for a secure and reliable deployment.

The setup

Prerequisites

Docker
Docker Compose v2
Grafana Loki's Docker plugin, if needed
this repository

Components

Loki - a log aggregation system
Prometheus - a systems monitoring and alerting toolkit
Grafana - a visualization and observability platform
MinIO - S3-compatible object storage
cAdvisor - resource usage and performance characteristics of running containers

Planned/started extensions

Mimir - long term storage for Prometheus
- successfully integrated: compose-mimir.yml
OnCall - easy-to-use on-call management tool
- successfully integrated: compose-oncall.yml
- use case pending

Architecture

The applications are separated into three layers:

infrastructure
- Storage
- Log aggregation
metrics
- Metric/log collection
observability
- Visualization
- TimeSeries Database (for custom metrics)

Every layer has its own compose files, but all of them share the same bridge network, monitoring-network. To pass configuration files, some components have their own directory containing a config file, e.g.:

.
├── loki
│   └── loki-config.yml
├── prometheus
│   └── prometheus-config.yml
├── compose-infra.yml
├── compose-observe.yml
.
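The layout above could be wired into a compose file roughly as follows. This is a minimal sketch, not the repository's actual compose-infra.yml: it assumes the network monitoring-network is created once outside the compose files and referenced as external, and the image tag and the mount path inside the container are illustrative choices.

```yaml
# Hypothetical excerpt; only the network name and the loki/loki-config.yml
# path are taken from the text above.
services:
  loki:
    image: grafana/loki:latest          # assumed tag; pin a version in practice
    # Tell Loki to read the mounted configuration file.
    command: -config.file=/etc/loki/loki-config.yml
    volumes:
      # Mount the component's config file read-only into the container.
      - ./loki/loki-config.yml:/etc/loki/loki-config.yml:ro
    networks:
      - monitoring-network

networks:
  monitoring-network:
    # Assumes the bridge network already exists, e.g. created with
    # `docker network create monitoring-network`.
    external: true
```

Declaring the network as external in every compose file lets the layers be started and stopped independently while still resolving each other by service name.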

The Logging Stack

To aggregate the logs of our monitoring stack we use Grafana's Loki. The initial configuration was inspired by.

For a single-node deployment, the referenced configuration should be suitable. When aggregating logs from multiple applications, nodes, and systems, however, a deployment like loki/getting-started or loki/production should be considered.

Therefore, this setup uses Grafana's production template, which contains separate read and write instances, an nginx gateway, and a MinIO storage instance.

Sending logs with Promtail

Using the recommended client to send logs avoids configuration differences and generalizes the use cases. Furthermore, it allows a comparison with a custom implementation.

See this gist and the associated article for information on scraping docker logs with Promtail.

Changing the logging driver for a container

There are three methods to pass the logs of containerized applications safely to Loki:

Note

To avoid unexpected behavior or losing logs, we do not modify Docker's default logging behavior. Instead, we integrate the logging settings into our compose files.
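Setting the logging driver per service in a compose file might look like the sketch below. It assumes Grafana Loki's Docker plugin is installed under the alias loki and that Loki's push endpoint is reachable from the Docker host at localhost:3100; the service name and image are placeholders, not taken from this repository.

```yaml
# Hypothetical service definition; only the logging section is the point here.
services:
  example-app:                          # placeholder service name
    image: example/app:latest           # placeholder image
    logging:
      driver: loki                      # requires the Loki Docker plugin
      options:
        # The plugin runs on the Docker host, so the URL must be reachable
        # from the host, not from inside the compose network (assumed address).
        loki-url: "http://localhost:3100/loki/api/v1/push"
```

Services without a logging section keep Docker's default driver, so the change stays limited to the containers that opt in.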

Collecting metrics

Prometheus scrapes metrics from predefined targets and stores them in its time-series database. Some applications, like Grafana, MinIO, or cAdvisor, implement their own metrics endpoint; for others, a custom endpoint may be developed (see client libraries for additional information).

To monitor a Docker host and its running containers, the following steps are necessary:

expose a metrics endpoint on the docker host to be scraped by Prometheus
add cAdvisor alongside our setup, which provides scrapable metrics per container

Docker host

As described in the Docker docs, we modify the current .../.docker/daemon.json to expose a metrics endpoint:

{
 "builder": { "gc": { "defaultKeepStorage": "20GB", "enabled": true } },
 "experimental": false,
 "features": { "buildkit": true }
}

becomes:

{
 "builder": { "gc": { "defaultKeepStorage": "20GB", "enabled": true } },
 "features": { "buildkit": true },
 "metrics-addr" : "127.0.0.1:9323",
 "experimental" : true
}
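Once the daemon exposes the endpoint, Prometheus needs a matching scrape job. The following fragment is a sketch of what such an entry in prometheus/prometheus-config.yml could look like. The job name is arbitrary, and the target host.docker.internal assumes Docker Desktop, where that name resolves to the host from inside a container; on other platforms the address would differ.

```yaml
# Hypothetical scrape job for the Docker daemon metrics endpoint.
scrape_configs:
  - job_name: docker                    # arbitrary job name
    static_configs:
      # Port 9323 matches "metrics-addr" in daemon.json; the host name is an
      # assumption that holds on Docker Desktop.
      - targets: ["host.docker.internal:9323"]
```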

Container

See the official Prometheus documentation to monitor containers.
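For orientation, a cAdvisor service modeled on the example in the Prometheus documentation is sketched below. Whether this repository's compose-metrics.yml uses the same image tag, mounts, and network is an assumption.

```yaml
# Sketch of a cAdvisor service, following the Prometheus cAdvisor guide.
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest   # assumed tag; pin a version in practice
    ports:
      - "8080:8080"                          # cAdvisor web UI and /metrics
    volumes:
      # Read-only views of the host that cAdvisor inspects.
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    networks:
      - monitoring-network                   # assumed shared network

networks:
  monitoring-network:
    external: true                           # assumes the network already exists
```

With both containers on the same network, Prometheus could then scrape the target cadvisor:8080 through a static scrape job.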

Visualizing and Observability

Building a custom Grafana image

For a setup behind a company proxy, Grafana requires additional certificates. Updating the certificates allows downloading plugins and dashboards via the GUI.

Requirements:

a certificate (.crt/.pem) or a certificate bundle (.pem)
build the custom image by providing the certificate as build argument:

Warning

Uploading a custom image to a public registry exposes your private certificates!

docker build . -t <image-name>:<tag> --build-arg CERTIFICATE_FILE=<path-to-certificate>

e.g.

docker build . -t grafana-custom --build-arg CERTIFICATE_FILE=certificate-bundle.pem

Start a root terminal session with a running container

docker compose exec -it -u 0 <service-name> bash

e.g.

docker compose exec -it -u 0 grafana bash

Putting it all together

Note

Make sure you have all Prerequisites!

Pull and initialize the application stack:

.\up.ps1

Visit the user interfaces:

check the created buckets in MinIO
monitor your running containers with cAdvisor
verify that all scraping targets are up and running in the Prometheus UI
visit Grafana and log in with the default credentials: username admin, password admin

Shut down the stack:

.\shutdown.ps1

Restart the setup:

.\startup.ps1

Bring all containers down, make sure they are stopped, and remove them along with the created bridge network:

.\down.ps1

When setting up the data sources in Grafana, pay attention to the authentication of Loki: it requires a custom header with the key "X-Scope-OrgID" and the value 1. See: https://github.com/grafana/loki/blob/main/production/docker/config/datasources.yaml
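Instead of configuring the Loki data source by hand, it could be provisioned from a file. The fragment below is a sketch in the style of the referenced datasources.yaml. The data source name and the URL http://gateway:3100 are assumptions and depend on how the Loki gateway service is named and exposed in this setup.

```yaml
# Hypothetical Grafana data source provisioning file.
apiVersion: 1

datasources:
  - name: Loki                          # assumed display name
    type: loki
    access: proxy
    url: http://gateway:3100            # assumed address of the Loki gateway
    jsonData:
      # Name of the custom HTTP header sent with every request.
      httpHeaderName1: "X-Scope-OrgID"
    secureJsonData:
      # Header value, i.e. the tenant ID mentioned above.
      httpHeaderValue1: "1"
```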

mj0nez/monitoring-stack