| id | title | status | assignee | created_date | labels | dependencies | priority |
|---|---|---|---|---|---|---|---|
| task-014 | Health monitoring coverage gaps | In Progress | | 2026-03-15 09:00 | | | high |
Description
179 Uptime Kuma monitors exist but critical infrastructure (traefik, cloudflared) has zero monitoring. 209 of 345 containers lack Docker healthchecks. 46 containers have no Kuma monitor at all.
Acceptance Criteria
- #1 Add Docker healthchecks to traefik and cloudflared (SPOF — requires host access)
- #2 Add Docker healthchecks to databases — done: p2pwiki-db (MariaDB), docmost-db, listmonk-db, mattermost-db (Postgres). Remaining need host access: gitea-db, mailcow-mysql, grid-trading-db, p2pwikifr-db
- #3 Add Kuma monitors for SMTP (existing KT Mail SMTP) and IMAP (port 993) — Mailcow IMAP monitor added (ID 216)
- #4 Add Kuma monitors for payment-* stack — added: Treasury, Curve, Flow/rfunds, API/mycofi (IDs 212-215)
- #5 Enable Docker socket monitoring in Kuma for container restart/exit detection
- #6 Add Kuma monitors for litellm (ID 208), listmonk (ID 209), seafile (ID 211). infisical, headscale, n8n already monitored.
- #7 Removed 8 inactive monitors (Games Platform, Cart, Conviction Demo, Xhiva Booking, Treasury, FungiFlows, Discourse cadCAD, Tino Ardez)
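For criterion #5, Uptime Kuma needs access to the Docker socket before a Docker host can be registered in its settings. A minimal compose sketch of that mount follows; the service name, image tag, and volume name are assumptions and are not taken from the actual deployment.

```yaml
# Sketch only: service name, image tag, and volume name are hypothetical.
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    volumes:
      - uptime-kuma-data:/app/data
      # Socket mount so Kuma can query container state
      - /var/run/docker.sock:/var/run/docker.sock:ro

volumes:
  uptime-kuma-data:
```

With the socket mounted, a Docker host of type "Socket" can be added in Kuma and referenced by monitors of type "Docker Container". Socket access exposes the Docker API to the Kuma container, which is worth weighing before enabling it.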
Notes
Quick wins (Docker healthchecks in compose files)
# Postgres healthcheck
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
  interval: 30s
  timeout: 10s
  retries: 3
# MariaDB healthcheck
healthcheck:
  test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"]
  interval: 30s
  timeout: 10s
  retries: 3
# Traefik healthcheck (needs the ping endpoint enabled, which is off by default)
healthcheck:
  test: ["CMD", "traefik", "healthcheck"]
  interval: 30s
  timeout: 10s
  retries: 3
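The `traefik healthcheck` command calls Traefik's `/ping` endpoint, so the check only passes once ping is enabled in the static configuration. A sketch of how that might look when Traefik is configured through CLI flags is shown here; the image tag is an assumption, and if the setup in /root/traefik/ uses a static config file instead, the equivalent is a `ping: {}` entry in that file.

```yaml
# Sketch only: image tag is hypothetical; adapt to the existing service definition.
services:
  traefik:
    image: traefik:v3
    command:
      # Enable the /ping endpoint (served on the internal "traefik" entrypoint by default)
      - --ping=true
    healthcheck:
      # Pass --ping so the healthcheck command sees the same setting as the server
      test: ["CMD", "traefik", "healthcheck", "--ping"]
      interval: 30s
      timeout: 10s
      retries: 3
```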
Traefik and cloudflared healthchecks require host access
The Traefik compose files are in /root/traefik/; the location of the cloudflared compose file still needs to be confirmed.
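For cloudflared, one possible approach is to use the binary's own readiness check, since the official image ships no shell or curl. The sketch below assumes a cloudflared release that includes the `tunnel ready` subcommand and a token-based tunnel; the metrics port 2000, the image tag, and the `TUNNEL_TOKEN` variable are assumptions to verify against the real configuration.

```yaml
# Sketch only: port, image tag, and token variable are hypothetical.
services:
  cloudflared:
    image: cloudflare/cloudflared:latest
    # --metrics starts the local metrics server, which serves /ready
    command: tunnel --metrics 0.0.0.0:2000 run
    environment:
      - TUNNEL_TOKEN=${TUNNEL_TOKEN}
    healthcheck:
      # Assumes `cloudflared tunnel ready` is available in the installed version
      test: ["CMD", "cloudflared", "tunnel", "--metrics", "localhost:2000", "ready"]
      interval: 30s
      timeout: 10s
      retries: 3
```

If the installed version lacks that subcommand, a Kuma HTTP monitor against the `/ready` endpoint on the metrics port would be an alternative.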