How to Set Up OpenClaw Prometheus Metrics
Enable diagnostics-prometheus, scrape the authenticated endpoint safely, and avoid the common mistake of publishing a naked /metrics page.
If you already think in scrape jobs, Grafana panels, and SLOs, the OpenClaw Prometheus exporter is refreshingly direct. The bundled diagnostics-prometheus plugin turns internal diagnostics into a protected Prometheus text endpoint, so you can watch queue pressure, model cost, tool latency, memory pressure, and more without inventing a custom sidecar.
When this is the right move
Set this up when you want operational dashboards or alerts for a gateway that already matters. If you only need an occasional manual health check, the health CLI is lighter. If you want ongoing quantitative visibility, Prometheus starts earning its keep quickly.
The practical workflow
The docs suggest a very normal path: enable the plugin, keep diagnostics on, restart, scrape the route with normal gateway auth, then let real traffic populate the metrics.
- Enable the diagnostics-prometheus plugin and keep diagnostics enabled so there is actual event flow to export.
- Restart the gateway because the route is registered during plugin startup, not magically hot-added to a running process.
- Scrape the protected endpoint through the same auth path you trust for other operator APIs instead of opening a public metrics page.
- Wire Prometheus with a standard scrape job and store the gateway secret where your monitoring stack already keeps sensitive credentials.
- Watch the dropped-series metric and queue-wait metrics early. Those are fast indicators that your labels or traffic shape deserve attention.
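The "wire Prometheus" step above can be sketched as a standard scrape job. This is a sketch under assumptions: the job name, token file path, and scrape interval are illustrative, not from the docs; only the `/api/diagnostics/prometheus` route, the default `127.0.0.1:18789` address, and bearer-token auth appear in the guide itself.

```yaml
# prometheus.yml fragment (sketch; job name and credentials path are illustrative)
scrape_configs:
  - job_name: "openclaw-gateway"              # hypothetical job name
    metrics_path: /api/diagnostics/prometheus # the documented route
    scheme: http
    authorization:
      type: Bearer
      # Keep the gateway token wherever your monitoring stack stores secrets.
      credentials_file: /etc/prometheus/secrets/openclaw-gateway-token
    static_configs:
      - targets: ["127.0.0.1:18789"]
```

Using `credentials_file` rather than an inline `credentials` value keeps the token out of the Prometheus config file itself, which matches the "store the secret where your stack already keeps credentials" advice above.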
Grounded command or config pattern
The docs already give a good first-pass enablement and scrape example. I would use that shape before adding anything fancier.
```json
{
  "plugins": {
    "allow": ["diagnostics-prometheus"],
    "entries": {
      "diagnostics-prometheus": { "enabled": true }
    }
  },
  "diagnostics": {
    "enabled": true
  }
}
```
```bash
openclaw plugins enable diagnostics-prometheus
curl -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  http://127.0.0.1:18789/api/diagnostics/prometheus
```

The route is /api/diagnostics/prometheus, not a special public metrics port. That is deliberate: the docs want the monitoring path to inherit gateway operator auth.
Operator notes
The metrics catalog is broader than simple up/down checks. The docs list run counts and durations, token and cost counters, queue gauges, tool execution histograms, session state totals, memory pressure, and exporter health. That is enough to spot both quality problems and cost drift before users start telling you something feels off.
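To make that catalog concrete, the endpoint serves standard Prometheus text exposition format. The metric names below are invented for illustration; check the exporter's actual catalog for the real names.

```text
# Illustrative output only; actual OpenClaw metric names may differ.
# TYPE openclaw_runs_total counter
openclaw_runs_total{status="ok"} 128
# TYPE openclaw_queue_depth gauge
openclaw_queue_depth 3
# TYPE openclaw_tool_duration_seconds histogram
openclaw_tool_duration_seconds_bucket{tool="search",le="0.5"} 91
```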
Rollout approach
For setting up OpenClaw Prometheus metrics, start with one owner, one environment, and one reversible test. Prove the docs-grounded path works before you widen the blast radius.
Common mistake
The common mistake is publishing a naked metrics endpoint because Prometheus “usually works that way.” Here the docs explicitly tell you not to do that. Another common mistake is staring at an empty endpoint before any traffic has exercised the exporter.
Maintenance rhythm
Record the command, config path, auth assumption, and verification step in your runbook. Keep a note of how Prometheus authenticates and where the token lives. Monitoring breaks in surprisingly boring ways when that credential path gets forgotten.
Safety checks
Respect the auth boundary and the cardinality boundary. The exporter intentionally drops new series after the documented cap instead of silently letting labels explode. If the dropped-series counter climbs, fix the label source rather than fighting the exporter.
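If you want the dropped-series signal to page you rather than wait to be noticed, a small alerting rule works. A sketch under one loud assumption: the metric name `openclaw_exporter_dropped_series_total` is hypothetical, so substitute whatever dropped-series counter the exporter actually emits.

```yaml
# Alerting rule sketch. openclaw_exporter_dropped_series_total is a
# placeholder name; use the exporter's real dropped-series counter.
groups:
  - name: openclaw-exporter
    rules:
      - alert: OpenClawSeriesDropped
        expr: increase(openclaw_exporter_dropped_series_total[15m]) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "OpenClaw exporter is dropping new series; fix label cardinality at the source."
```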
How to verify it worked
Generate a little real gateway traffic, scrape the endpoint, and confirm you see non-empty output. Then make sure Prometheus reads the target as healthy and that at least one dashboard or query shows data moving the way you expect after a few minutes of normal use.
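One way to make "non-empty output" concrete is a tiny check that counts sample lines in the scraped text. A minimal sketch: the helper below is mine, not part of OpenClaw or Prometheus, and the sample metric names are invented. In practice you would feed it the body of the authenticated curl request shown earlier and fail verification when the count is zero.

```python
def count_samples(exposition_text: str) -> int:
    """Count non-comment sample lines in Prometheus text-format output."""
    count = 0
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        count += 1
    return count

# Hypothetical scrape body with invented metric names.
sample = """\
# TYPE demo_runs_total counter
demo_runs_total{status="ok"} 128
demo_queue_depth 3
"""

print(count_samples(sample))  # → 2
```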
If verification feels ambiguous, stop there and tighten the setup before you automate more. A small clean proof beats a large confusing rollout.
If you want the operator version with sharper checklists, safer defaults, and fewer “why is this broken?” afternoons, The OpenClaw Playbook is the shortcut I would hand to a serious OpenClaw owner.
Frequently Asked Questions
What plugin do I enable?
The docs show diagnostics-prometheus as the bundled plugin name.
Do I need to restart the gateway?
Yes. The docs say the route is registered at plugin startup, so reload after enabling it.
Should I use Prometheus or OpenTelemetry export?
Use Prometheus when you want a pull-based metrics surface and already live in Prometheus plus Grafana. Use OTLP when you need push-based traces, logs, or collector workflows.