Prometheus: Infrastructure monitoring 📊

An early-warning system that flags problems before they turn into outages.

What it monitors:

  • 🖥️ Server metrics: CPU, RAM, disk, and network via node_exporter
  • 🌐 Service availability: HTTP/TCP/ICMP probes via blackbox_exporter
  • 📈 Application metrics: collected from Nextcloud, Home Assistant, and others
  • 🚨 Alertmanager: notifications in Telegram/Discord when thresholds are exceeded
  • 🔍 Powerful PromQL queries for in-depth analysis
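
As an illustration of the PromQL item above, here is a minimal sketch that builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and reads values out of a response. The address prometheus.example:9090 and both helper names are hypothetical, since this page does not state where Prometheus itself is reachable. The query assumes the standard node_exporter metric node_cpu_seconds_total.

```python
import json
from urllib.parse import urlencode

# Hypothetical address; the real Prometheus endpoint is not given on this page.
PROMETHEUS_URL = "http://prometheus.example:9090"

# CPU utilisation per instance over the last 5 minutes, from node_exporter data.
CPU_QUERY = (
    '100 * (1 - avg by (instance) '
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])))'
)


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict[str, float]:
    """Map each instance label to its value from an instant-vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the sample value arrives as a string
        values[series["metric"].get("instance", "")] = float(value)
    return values
```

Fetching build_query_url(PROMETHEUS_URL, CPU_QUERY) with any HTTP client and passing the response body to parse_instant_vector would give one CPU percentage per monitored host.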

How it works:

  1. Prometheus collects (scrapes) metrics on a schedule
  2. You view the graphs in Grafana or build your own dashboards
  3. If an anomaly is detected, an alert fires and the administrator receives a notification
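
The steps above can be sketched in a few lines. This is a deliberately simplified illustration, not how Prometheus is implemented: it parses a small subset of the text exposition format (no timestamps) and compares values with fixed thresholds, whereas real alert rules are PromQL expressions. The sample metrics, threshold values, and function names are hypothetical.

```python
# A scrape target answers with plain text like this (made-up sample values).
SAMPLE_SCRAPE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 4.2
node_filesystem_avail_bytes{mountpoint="/"} 1.5e+09
"""


def parse_exposition(text: str) -> dict[str, float]:
    """Parse a simplified subset of the Prometheus text format (no timestamps)."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")  # the value is the last field
        samples[series] = float(value)
    return samples


def firing_alerts(samples: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """Return the names of series whose current value exceeds its threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in samples and samples[name] > limit
    )
```

Calling firing_alerts(parse_exposition(SAMPLE_SCRAPE), {"node_load1": 4.0}) reports node_load1 as over its limit. In the real setup that decision is made by Prometheus rule evaluation and the notification is routed by Alertmanager.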

For administrators: flexible alerting rules, recording rules, metric federation, and long-term storage via Thanos.
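
One aspect of alerting rules worth illustrating is the "for" duration: a condition has to hold continuously for that long before the alert moves from pending to firing, which filters out short spikes. The sketch below models only that state transition under simplified assumptions. The AlertState class is hypothetical and is not part of any Prometheus library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Tracks one alert through the inactive -> pending -> firing states."""

    for_seconds: float                    # how long the condition must hold
    active_since: Optional[float] = None  # when the condition first became true

    def update(self, condition: bool, now: float) -> str:
        """Record one rule evaluation at time `now` and return the new state."""
        if not condition:
            self.active_since = None      # any false evaluation resets the timer
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"
```

With for_seconds=300, an alert whose condition becomes true at t=0 stays pending until t=300 and fires from then on. A single false evaluation in between sends it back to inactive.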

Access: via Grafana (grafana.potatoenergy.ru) • with Potato Energy credentials (management is restricted to members of the admin group)