Metrics & health
Kestrel exposes three operational endpoints. None require an admin cookie; one requires a bearer token.
Liveness — /healthz
Returns 200 ok whenever the process is running, regardless of whether the DB is reachable. Use this for "is the binary alive" checks (Docker HEALTHCHECK directive, Kubernetes livenessProbe).
curl http://localhost:8080/healthz
# ok
Readiness — /healthz/ready
Returns 200 ready when the process is alive and the SQLite connection answers a SELECT 1 within 2 seconds. Returns 503 not ready otherwise. Use this for load-balancer health checks and Kubernetes readinessProbe.
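A deploy script can wait on this endpoint before sending traffic. The following is a minimal sketch, not part of Kestrel: `wait_until_ready`, its parameters, and the 3-second per-request timeout are assumptions chosen to sit just above the 2-second database check.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll <base_url>/healthz/ready until it answers 200 or timeout_s elapses.

    Hypothetical helper for illustration; only the endpoint path and the
    200/503 behaviour come from the documentation.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Per-request timeout slightly above the 2-second database check.
            with urllib.request.urlopen(f"{base_url}/healthz/ready", timeout=3) as resp:
                if resp.status == 200:
                    return True
        except urllib.error.HTTPError:
            pass  # 503 not ready (or another error status): try again
        except OSError:
            pass  # connection refused or timed out: process not up yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until_ready("http://localhost:8080")` returns `True` as soon as the endpoint answers 200, and `False` if it never does within the timeout. For a one-off manual check, curl is enough: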
curl http://localhost:8080/healthz/ready
# ready
Metrics — /metrics
Serves metrics in the Prometheus text format. Disabled by default; to enable it, set a bearer token via an environment variable or config:
KESTREL_METRICS_TOKEN=$(openssl rand -hex 32)
Or in kestrel.toml:
[metrics]
token = "..."
When the token is unset, /metrics returns 404. When set, scrapers must present the token:
curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/metrics
Exposed series
| Metric | Type | Description |
|---|---|---|
| kestrel_info | gauge | Always 1; labels version, commit for build ID. |
| kestrel_uptime_seconds | gauge | Seconds since process start. |
| kestrel_projects_total | gauge | Number of projects. |
| kestrel_issues_total | gauge | Issues, partitioned by status label. |
| kestrel_events_total | gauge | Stored event payloads (subject to retention). |
| kestrel_events_24h | gauge | Events received in the last 24 hours. |
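
To read these series without running Prometheus, a short script can fetch /metrics with the bearer token and parse the text format. This is a simplified sketch: `parse_metrics` and `fetch_metrics` are hypothetical names, and the parser assumes label values contain no spaces and ignores optional timestamps.

```python
import urllib.request


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text-format lines into {"name{labels}": value}.

    Simplified on purpose: skips comment lines, ignores optional
    timestamps, and assumes label values contain no spaces.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # blank, HELP, or TYPE line
            continue
        series, value = line.split()[:2]
        samples[series] = float(value)
    return samples


def fetch_metrics(base_url: str, token: str) -> dict[str, float]:
    """GET <base_url>/metrics with the bearer token and parse the body."""
    req = urllib.request.Request(
        f"{base_url}/metrics",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Labelled series keep their label set in the key, so each value of the status label on kestrel_issues_total appears as its own entry. If the token is unset on the server, `urlopen` raises `urllib.error.HTTPError` for the 404.
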
Prometheus scrape config
scrape_configs:
  - job_name: kestrel
    metrics_path: /metrics
    authorization:
      type: Bearer
      credentials: "<your-token>"
    static_configs:
      - targets: ["kestrel.example.com:8080"]
Why is metrics opt-in?
The simpler alternative — anonymous /metrics — would let any internet visitor enumerate "this Kestrel install has 47k events and 12 projects". Kestrel's threat model assumes the host is on the public internet, so the default has to be safe-by-omission.
If your scraper lives on the same private network as Kestrel, the bearer token is just operational hygiene. If it lives elsewhere, the token is doing real work. Either way: easier to flip on than to tear out a leak after the fact.
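The safe-by-omission behaviour can be pictured as a small decision function. This is an illustrative sketch, not Kestrel's actual code: `metrics_status` is a hypothetical name, and the 401 for a missing or wrong token is an assumption, since only the 404-when-unset case is documented here.

```python
import hmac
from typing import Optional


def metrics_status(configured_token: Optional[str], auth_header: Optional[str]) -> int:
    """Return the HTTP status a /metrics request would receive.

    Hypothetical helper for illustration only.
    """
    if not configured_token:
        return 404  # token unset: the endpoint behaves as if it does not exist
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return 401  # assumption: status for a missing token is not documented
    presented = auth_header[len(prefix):]
    # compare_digest runs in constant time, so response timing does not
    # reveal how many leading characters of the token matched.
    if hmac.compare_digest(presented.encode(), configured_token.encode()):
        return 200
    return 401  # assumption: status for a wrong token is not documented
```

Checking the unset case first means an install with no token reveals nothing, not even that a metrics endpoint exists.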
What's not a metric
Alerting rules, anomaly detection, error budgets, SLO burn rates — none of that is in scope. The /metrics endpoint is a heart-rate monitor, not a pager. Kestrel's stance is that the AI agent attached via MCP is the alerting channel — it sees new errors as they land and can decide whether to wake you up. Pages from a Prometheus rule firing on kestrel_events_24h > 1000 would be redundant with that.