Docs

Install, configure, send telemetry, and operate a self-hosted Fanout.

Fanout is one binary you run yourself. It accepts OpenTelemetry over gRPC, stores the data on disks you control, and serves a UI, a chat investigator, an alert engine, and an MCP server — all from the same process.

This page is everything you need to get from zero to a working install, sending telemetry, and operating it day-to-day.

Install

Pick whichever path matches how you already deploy. Fanout is a single self-contained executable — about 30 MB, no runtime dependencies beyond a recent libc.

Docker

docker run -d --name fanout \
  -p 7520:7520 -p 4317:4317 \
  -v $PWD/data:/var/lib/fanout/data \
  ghcr.io/labstack/fanout:latest
Port / path   Purpose
7520          HTTP: web UI, API, and the MCP endpoint.
4317          OTLP gRPC ingest.
./data        Persistent storage: telemetry, application state, and saved reports.

The container listens on all interfaces by default. For a host-only install, add -e OTLP_GRPC_ADDR=127.0.0.1:4317.

Pre-built binary

Download the artifact for your platform from the releases page and run it:

./fanout

Defaults: HTTP on :7520, OTLP gRPC on 127.0.0.1:4317, data under ./data.

Sizing

Guidelines, not hard limits. The binary is small; the data is what consumes resources.

Resource   Recommended starting point
CPU        2 vCPU
Memory     1 GB (raise via DUCKDB_MEMORY for larger workloads)
Disk       20 GB on fast local storage; budget ~1 GB / day per million spans at default retention
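As a rough planning aid, the disk guideline can be turned into arithmetic. The sketch below assumes one reading of the rule of thumb: each million spans ingested per day adds about 1 GB per day, and that data is kept for RETENTION_DAYS (30 by default). The function name and parameters are illustrative, not part of Fanout.

```python
def estimate_disk_gb(million_spans_per_day: float,
                     retention_days: int = 30,
                     gb_per_million_spans: float = 1.0) -> float:
    """Rough steady-state disk usage in GB under the assumptions above."""
    if million_spans_per_day < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    # Daily growth multiplied by how many days of data are kept.
    return million_spans_per_day * gb_per_million_spans * retention_days


# 2 million spans per day kept for the default 30 days.
print(estimate_disk_gb(2))
```

Treat the result as a starting budget and measure real usage after a few days of ingest.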

First boot

Fanout refuses to start without JWT secrets, SMTP credentials (for email login codes), and an LLM API key (for the chat investigator). Everything else has a default.

Minimum viable command

docker run -d --name fanout \
  -p 7520:7520 -p 4317:4317 \
  -v $PWD/data:/var/lib/fanout/data \
  -e JWT_SECRET=$(openssl rand -hex 32) \
  -e JWT_REFRESH_SECRET=$(openssl rand -hex 32) \
  -e SMTP_HOST=smtp.example.com \
  -e SMTP_USER=<smtp-username> \
  -e SMTP_PASS=<smtp-password> \
  -e SMTP_FROM='"Fanout" <fanout@example.com>' \
  -e AI_API_KEY=<anthropic-or-openai-key> \
  ghcr.io/labstack/fanout:latest

The JWT_* secrets must differ and each must be at least 32 characters. Generate fresh ones with openssl rand -hex 32.
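If you generate or template secrets in a deploy script, a pre-flight check can catch a bad pair before Fanout refuses to start. This is a minimal sketch of the two stated rules (at least 32 characters each, and different from one another); the function names are hypothetical and this is not Fanout's own validation code.

```python
import secrets

MIN_SECRET_LENGTH = 32  # minimum length stated in the docs


def generate_secret() -> str:
    """Return 64 hex characters, the same shape as `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def check_jwt_secrets(access: str, refresh: str) -> list[str]:
    """Return a list of problems; an empty list means the pair looks valid."""
    problems = []
    if len(access) < MIN_SECRET_LENGTH:
        problems.append("JWT_SECRET is shorter than 32 characters")
    if len(refresh) < MIN_SECRET_LENGTH:
        problems.append("JWT_REFRESH_SECRET is shorter than 32 characters")
    if access == refresh:
        problems.append("JWT_SECRET and JWT_REFRESH_SECRET must differ")
    return problems


print(check_jwt_secrets(generate_secret(), generate_secret()))
```

Store the generated values somewhere durable (a secrets manager or an env file) so the same pair can be reused on restart.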

Create the admin

On first boot Fanout logs a one-time setup token that authorises the admin-creation flow:

docker logs fanout 2>&1 | grep "setup token"

Open http://localhost:7520 and fill in the setup form with your name, email, and the token. Fanout creates the admin, signs you in, and prints the ingest token once — copy it now, it isn’t shown again. You can rotate it later from Settings → Ingest in the UI.

After this, the setup form is closed for the lifetime of the data directory. New users join via email invites; logins use one-time codes delivered via SMTP. No passwords are ever stored.

Send telemetry

Fanout speaks OTLP over gRPC on port 4317. Anything that can export OTLP — an SDK, a collector, a sidecar — will work without modification.

HTTP/protobuf and HTTP/JSON OTLP are not yet supported. If you need them, run an OpenTelemetry Collector in front and point its otlp exporter at Fanout.

Authentication

Every request must carry a valid ingest token. Either of two equivalent header forms is accepted:

x-fanout-ingest-token: fo_<token>
Authorization: Bearer fo_<token>

A missing or invalid token returns Unauthenticated. The same token works for every signal type.

Direct from an SDK

export OTEL_EXPORTER_OTLP_ENDPOINT=https://fanout.example.com:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_HEADERS=x-fanout-ingest-token=fo_<token>
export OTEL_SERVICE_NAME=checkout

Use http:// if your endpoint isn’t TLS-terminated. The headers env var takes a comma-separated list of key=value pairs.
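When the headers value is built by a script, the comma-separated key=value format is easy to get wrong once a value contains a space, such as the Bearer form. The sketch below builds the string from a dict and percent-encodes each value, on the assumption that your SDK decodes values that way (several OpenTelemetry SDKs do; check yours). The helper name is hypothetical.

```python
from urllib.parse import quote


def otlp_headers_env(headers: dict[str, str]) -> str:
    """Build a value for OTEL_EXPORTER_OTLP_HEADERS from header pairs."""
    # Percent-encode values so spaces and commas cannot break the list.
    return ",".join(f"{key}={quote(value, safe='')}" for key, value in headers.items())


# "fo_example" is a placeholder, not a real token.
print(otlp_headers_env({"x-fanout-ingest-token": "fo_example"}))
print(otlp_headers_env({"Authorization": "Bearer fo_example"}))
```

The x-fanout-ingest-token form needs no encoding at all, which makes it the simpler choice for environment-variable configuration.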

Through an OpenTelemetry Collector

If you already run a collector (recommended for production — buffering, batching, sampling, per-tenant routing), add Fanout as another otlp exporter:

exporters:
  otlp/fanout:
    endpoint: fanout.example.com:4317
    headers:
      x-fanout-ingest-token: fo_<token>

service:
  pipelines:
    traces:  { exporters: [otlp/fanout] }
    logs:    { exporters: [otlp/fanout] }
    metrics: { exporters: [otlp/fanout] }

You can fan out to Fanout and an existing backend during a migration — exporters are list-typed.

Multi-product / multi-tenant namespaces

If a single Fanout serves more than one product or environment, set service.namespace in your OpenTelemetry resource attributes:

export OTEL_RESOURCE_ATTRIBUTES=service.namespace=product-a,service.name=checkout

The UI’s namespace picker (top-right of the header) filters every view; MCP tools accept namespace as an explicit argument. Payloads without a service.namespace land in the namespace named by DEFAULT_NAMESPACE, which is default unless overridden.
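To predict which namespace a given OTEL_RESOURCE_ATTRIBUTES string will land in, the fallback rule can be modelled in a few lines. This is an illustration of the behaviour described above, not Fanout's implementation; the function names are hypothetical and the parser handles only plain key=value pairs.

```python
def parse_resource_attributes(raw: str) -> dict[str, str]:
    """Parse a comma-separated key=value string into a dict."""
    attrs = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attrs[key.strip()] = value.strip()
    return attrs


def effective_namespace(raw_attributes: str, default_namespace: str = "default") -> str:
    """Return service.namespace if present, else the configured default."""
    attrs = parse_resource_attributes(raw_attributes)
    return attrs.get("service.namespace") or default_namespace


print(effective_namespace("service.namespace=product-a,service.name=checkout"))
print(effective_namespace("service.name=checkout"))
```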

TLS

Two options — pick one:

  • Behind a reverse proxy (recommended). Caddy, nginx, or Traefik terminates TLS for fanout.example.com:4317 and proxies plaintext gRPC to Fanout on 127.0.0.1:4317. Your collector or SDK only ever sees the proxy.
  • Direct termination. Set TLS_CERT_FILE and TLS_KEY_FILE. Both the HTTP and gRPC listeners use the same certificate. TLS 1.3 minimum.

Setting only one of the two TLS variables is a startup error — a guardrail to catch half-configured deployments.
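The both-or-neither rule is simple to check in a deploy pipeline before the container starts. A minimal sketch, assuming the variables are read from a dict such as os.environ; the function name and the returned labels are hypothetical.

```python
def tls_mode(env: dict[str, str]) -> str:
    """Return "direct-tls" or "plaintext"; raise if only one variable is set."""
    cert = env.get("TLS_CERT_FILE", "")
    key = env.get("TLS_KEY_FILE", "")
    if bool(cert) != bool(key):
        raise ValueError("set both TLS_CERT_FILE and TLS_KEY_FILE, or neither")
    return "direct-tls" if cert else "plaintext"


print(tls_mode({}))
print(tls_mode({"TLS_CERT_FILE": "/etc/fanout/cert.pem",
                "TLS_KEY_FILE": "/etc/fanout/key.pem"}))
```

In the reverse-proxy setup, leave both variables unset and expect "plaintext" here.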

Use the UI

Open http://<fanout-host>:7520 and sign in.

  • Home — health grid for every service in the current namespace. Incidents (unhealthy or degraded) surface at the top with inline context and an Investigate button that launches Chat with the service pre-scoped. Healthy services show traffic, p95, and error-rate numbers.
  • Service detail — latency and error-rate timeseries, top endpoints, example failing traces, dependencies. Every chart and row has its own Investigate shortcut.
  • Chat — full-page investigator. Ask in plain English; the assistant calls the MCP tools behind the scenes and renders charts, tables, and traces inline. Suggested prompts appear on the empty state.
  • Alerts — firing / pending / resolved list plus an inline editor for rules.
  • Settings (admin only) — rotate the ingest token.

The namespace picker in the top-right header filters every page. The New chat button (only on /chat) resets the conversation.

Alerts

Rules are written in expr-lang, evaluated every ALERT_EVAL_INTERVAL seconds (default 30), and delivered by webhook.

Anatomy of a rule

Field               Description
name                Shown on the Alerts page and in webhook payloads.
expression          An expr-lang boolean evaluated per service per interval.
for_seconds         How long the expression must hold before the rule fires. 0 = fire immediately.
webhook_url         Where to POST the alert payload.
webhook_headers     Extra HTTP headers, typically auth.
webhook_template    Override the default JSON payload.
notify_on_resolve   Send a follow-up POST when the condition clears.

Available fields

Every rule has these fields in scope.

Field                                             Type     Description
service                                           string   Service being evaluated. Useful for service == "checkout".
error_rate                                        float    Error rate in this window, 0.0 to 1.0.
p50 / p95                                         float    Latency percentiles, milliseconds.
throughput                                        float    Requests per second over the window.
log_count                                         float    Log entries seen in the window.
z_score                                           float    Anomaly score against the historical baseline.
health_score                                      float    Composite score, lower is worse.
error_rate_delta / p95_delta / throughput_delta   float    Percentage change vs. baseline (e.g. 50 = +50%, -50 = halved).
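The *_delta fields are percentage changes, so thresholds are written as plain numbers such as -50. The sketch below shows the arithmetic that matches the examples in the table (50 = +50%, -50 = halved). How Fanout derives the baseline, and what it reports when the baseline is zero, is not documented here; the zero case below is an assumption.

```python
def percent_delta(current: float, baseline: float) -> float:
    """Percentage change of current relative to baseline."""
    if baseline == 0:
        return 0.0  # assumption: no baseline means no measurable change
    return (current - baseline) * 100.0 / baseline


print(percent_delta(150, 100))  # traffic up by half
print(percent_delta(50, 100))   # traffic halved

# A rule like `throughput_delta < -50 && throughput > 10`, evaluated by hand:
throughput, baseline = 40.0, 100.0
print(percent_delta(throughput, baseline) < -50 and throughput > 10)
```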

Example rules

# Sustained error rate — ignore spikes.
name:        "error rate > 5% for 5 min"
expression:  error_rate > 0.05
for_seconds: 300

# Latency regression — sustained only.
name:        "p95 latency > 2s for 10 min"
expression:  p95 > 2000
for_seconds: 600

# Throughput collapse — ignores naturally low-traffic services.
name:        "sudden traffic drop"
expression:  throughput_delta < -50 && throughput > 10
for_seconds: 120

# Anomaly score — "something looks off".
name:        "anomaly: z-score > 3"
expression:  z_score > 3
for_seconds: 180

Lifecycle

A rule moves through three states:

  • Pending — the expression just became true. The engine waits out for_seconds.
  • Firing — the condition has held long enough. Webhooks deliver and a badge appears in the UI nav.
  • Resolved — the expression returned false. If notify_on_resolve is set, a final webhook fires.

Resolved alerts stay queryable for ALERT_HISTORY_DAYS (default 7) — visible in the UI and via the alerts MCP tool.
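The interaction between for_seconds and the three states is easier to see as a small state machine. This is a sketch of the lifecycle as described above, not the engine's code. Two details are assumptions: an "inactive" idle state for rules that are not currently true, and a pending rule that clears early returning to idle without being recorded as resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleState:
    for_seconds: float
    state: str = "inactive"            # hypothetical idle state
    true_since: Optional[float] = None

    def step(self, now: float, condition: bool) -> str:
        """Feed one evaluation result and return the new state."""
        if condition:
            if self.true_since is None:
                self.true_since = now  # the expression just became true
            held = now - self.true_since
            self.state = "firing" if held >= self.for_seconds else "pending"
        else:
            # Only a rule that actually fired becomes "resolved".
            self.state = "resolved" if self.state == "firing" else "inactive"
            self.true_since = None
        return self.state


rule = RuleState(for_seconds=300)
for now, condition in [(0, True), (150, True), (300, True), (330, False)]:
    print(now, rule.step(now, condition))
```

With for_seconds set to 0, the first true evaluation goes straight to firing, which matches the "fire immediately" note in the rule anatomy.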

Webhook payload

A firing rule POSTs JSON to webhook_url:

{
  "rule": "error rate > 5% for 5 min",
  "service": "checkout",
  "namespace": "default",
  "fired_at": "2026-04-20T14:22:08Z",
  "expression": "error_rate > 0.05",
  "values": {
    "error_rate": 0.082,
    "p50": 94,
    "p95": 412,
    "throughput": 1180
  }
}

Override the shape with webhook_template if your downstream expects a different schema (PagerDuty, Slack, OpsGenie, etc.).
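If you would rather adapt the payload on the receiving side than write a webhook_template, the default JSON is straightforward to consume. The sketch below turns the payload shown above into a one-line summary; the function name and the output format are hypothetical, and only the fields from the sample payload are assumed to exist.

```python
import json

SAMPLE = """
{
  "rule": "error rate > 5% for 5 min",
  "service": "checkout",
  "namespace": "default",
  "fired_at": "2026-04-20T14:22:08Z",
  "expression": "error_rate > 0.05",
  "values": {"error_rate": 0.082, "p50": 94, "p95": 412, "throughput": 1180}
}
"""


def summarize_alert(body: str) -> str:
    """Render a webhook body as a single human-readable line."""
    alert = json.loads(body)
    values = alert.get("values", {})
    details = ", ".join(f"{name}={value}" for name, value in sorted(values.items()))
    return (f"[{alert['namespace']}/{alert['service']}] {alert['rule']} "
            f"(fired {alert['fired_at']}; {details})")


print(summarize_alert(SAMPLE))
```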

MCP server

Fanout ships an MCP (Model Context Protocol) server at /mcp. Connect Claude Code — or any MCP-capable assistant — and these tools become available for investigation. The same server backs the chat investigator inside the Fanout UI.

Connect Claude Code

# Production
claude mcp add fanout --transport http https://fanout.example.com/mcp

# Local
claude mcp add fanout --transport http http://localhost:7520/mcp

The MCP endpoint accepts an ingest token the same way as OTLP — pass Authorization: Bearer fo_<token> if your transport supports custom headers, or rely on session-based auth through a logged-in browser.

Tools

Tool          What it does
overview      System health, scores, top issues.
topology      Service dependency map with blast radius.
diagnose      Deep-dive on one service: latency, errors, saturation vs. baseline.
spans         Search and aggregate trace spans.
trace         Single distributed trace with root-cause analysis.
logs          Search and aggregate log entries.
metrics       Discover and query OTLP metric timeseries.
compare       Side-by-side: two services, two time windows, or two operations.
attributes    Discover filterable attribute keys for spans, logs, or metrics.
alerts        List firing, pending, or resolved alerts, filterable by service or rule.
alert_rules   Manage alert rules: list, create, update, delete.
query         Raw SQL against the underlying data.

Claude (or whichever model you use) decides which tools to call. A typical incident loop looks like overview → topology → diagnose → trace → logs, but you don’t have to memorise the order.

Tokens that can ingest can also query — there’s no separate read/write split today. If you need stricter isolation, gate the endpoint at your reverse proxy.

Operate

Data layout

Everything Fanout persists lives under DATA_DIR (default ./data). That’s the only directory you need to back up, and the only one you need to move when relocating a host.

Backups

  1. Stop the process (docker stop fanout or systemctl stop fanout).
  2. Copy the whole DATA_DIR to your backup target.
  3. Start it back up.

Snapshotting a live directory can capture mid-flush state — safer to stop first. Flushes happen every FLUSH_SECONDS (default 15), so downtime for a backup is under a minute for most installs.

To restore on a new host: put the backup at the same DATA_DIR path and start Fanout. Ingest tokens, users, saved reports, and all telemetry come with it.
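The copy step can be scripted once the process is stopped. This is a minimal sketch that archives DATA_DIR into a timestamped .tar.gz; the function name, archive naming scheme, and paths are hypothetical, and stopping and restarting Fanout is left to the caller.

```python
import tarfile
import time
from pathlib import Path


def archive_data_dir(data_dir: str, backup_dir: str) -> Path:
    """Write data_dir into a timestamped .tar.gz and return the archive path.

    Stop Fanout before calling this so no flush is in progress.
    Keep backup_dir outside data_dir so the archive does not include itself.
    """
    source = Path(data_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    archive = target_dir / f"fanout-data-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store entries under the directory's own name, e.g. data/...
        tar.add(str(source), arcname=source.name)
    return archive
```

A typical run would be: stop the container, call archive_data_dir("./data", "/backups/fanout"), then start the container again.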

Upgrades

Pull the new image (or binary) and restart. Schema migrations apply automatically at boot.

docker pull ghcr.io/labstack/fanout:latest
docker stop fanout && docker rm fanout
# re-run your original `docker run` command

Downgrading across a migration is not supported — back up before upgrading if you need an escape hatch.

Troubleshooting

A few common failure modes and what to check first.

  • No services appear after sending telemetry. Confirm the token header reaches Fanout (some proxies strip custom headers), that the endpoint scheme is explicit (http:// or https://), and that port 4317 is reachable: nc -vz fanout.example.com 4317. Data takes up to FLUSH_SECONDS to appear — wait ~15 s before debugging.
  • Startup fails immediately. Check docker logs fanout. The most common cause is missing JWT_*, SMTP_*, or AI_API_KEY. Setting only one of the two TLS files is also fatal by design.
  • Login codes not arriving. Verify SMTP credentials and sender domain. Fanout uses STARTTLS on ports 587 and 25, and implicit TLS on 465.
  • Queries slow. First check the freshness — rollups update every ROLLUP_EVERY seconds (default 60). Raising DUCKDB_MEMORY can help larger working sets. For very long time ranges, expect raw scans to take longer than rollup-backed queries.
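For the login-code case above, it helps to test the SMTP credentials outside Fanout using the same port-to-TLS mapping. The sketch below uses Python's standard smtplib; the function names are hypothetical. It requires STARTTLS on non-465 ports, which is stricter than "when offered", and treating every port other than 465 as STARTTLS is an assumption.

```python
import smtplib
import ssl


def smtp_tls_mode(port: int) -> str:
    """465 means implicit TLS; other ports are assumed to use STARTTLS."""
    return "implicit" if port == 465 else "starttls"


def check_smtp_login(host: str, port: int, user: str, password: str) -> None:
    """Connect and authenticate; raises an smtplib or OS error on failure."""
    context = ssl.create_default_context()
    if smtp_tls_mode(port) == "implicit":
        with smtplib.SMTP_SSL(host, port, context=context, timeout=10) as server:
            server.login(user, password)
    else:
        with smtplib.SMTP(host, port, timeout=10) as server:
            server.starttls(context=context)
            server.login(user, password)
```

Call check_smtp_login with the same SMTP_HOST, SMTP_PORT, SMTP_USER, and SMTP_PASS values you gave Fanout; an authentication error here points at the credentials rather than at Fanout.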

Environment reference

Fanout is configured entirely through environment variables. A .env file next to the binary is loaded first; .env.${ENV} overrides it (ENV defaults to development).
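The file precedence can be modelled to check which value wins before you deploy. This is a sketch of the stated order only (.env first, then .env.${ENV} on top); the function names are hypothetical, quoting rules are ignored, and how real process environment variables rank against the files is not covered.

```python
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Read simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_config(directory: str, env: str = "development") -> dict[str, str]:
    """Merge .env with .env.<env>; the environment-specific file wins."""
    base = Path(directory)
    merged = parse_env_file(base / ".env")
    merged.update(parse_env_file(base / f".env.{env}"))
    return merged
```

For example, with DUCKDB_MEMORY=512MB in .env and DUCKDB_MEMORY=2GB in .env.production, load_config(".", "production") reports 2GB.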

Network

Variable            Default          Description
HTTP_ADDR           :7520            Web UI, API, and MCP endpoint listen address.
OTLP_GRPC_ADDR      127.0.0.1:4317   OTLP gRPC ingest address. The official Docker image overrides this to :4317 so off-host traffic is accepted.
DEFAULT_NAMESPACE   default          Namespace assigned to OTLP payloads without service.namespace.

Storage

Variable         Default   Description
DATA_DIR         ./data    Storage root for telemetry, query cache, and application state.
DUCKDB_MEMORY    512MB     In-memory budget for the embedded query engine.
RETENTION_DAYS   30        Drop telemetry files older than N days. 0 keeps everything forever.

Ingest tuning

Variable           Default   Description
FLUSH_SECONDS      15        How often pending rows are flushed to disk. Lower = fresher UI; higher = less I/O.
FLUSH_BATCH_SIZE   50000     Cap on rows per flush, regardless of interval.
ROLLUP_EVERY       60        How often per-minute rollups are recomputed.

Authentication (required)

Variable             Description
JWT_SECRET           Required. HS256 signing key for short-lived access tokens.
JWT_REFRESH_SECRET   Required. HS256 signing key for refresh tokens. Must differ from JWT_SECRET.

Each must be at least 32 characters. Generate with openssl rand -hex 32.

Email (required)

Variable    Default   Description
SMTP_HOST   (none)    Required. SMTP server hostname.
SMTP_PORT   587       465 uses implicit TLS; 587 and 25 use STARTTLS when offered.
SMTP_USER   (none)    Required. SMTP username.
SMTP_PASS   (none)    Required. SMTP password or API key.
SMTP_FROM   (none)    Required. From header, e.g. "Fanout" <fanout@example.com>.

AI provider (required)

Variable      Default              Description
AI_PROVIDER   anthropic            anthropic or openai.
AI_API_KEY    (none)               Required. Provider API key.
AI_MODEL      (provider default)   Override the default model, e.g. claude-sonnet-4-6, gpt-4.1.
AI_BASE_URL   (provider default)   Override the API base URL; useful for proxies and gateways.

Alerts

Variable              Default   Description
ALERT_ENABLED         true      Set to false to disable the alert engine entirely.
ALERT_EVAL_INTERVAL   30        How often (seconds) rules are evaluated against fresh rollups.
ALERT_HISTORY_DAYS    7         How long resolved alerts stay queryable in the UI and via the alerts MCP tool.

MCP

Variable      Default   Description
MCP_ENABLED   true      Expose the MCP server at /mcp. Disable if you don’t want it reachable.

TLS

Variable        Default   Description
TLS_CERT_FILE   (none)    Path to the server certificate (PEM).
TLS_KEY_FILE    (none)    Path to the server private key (PEM).

When both are set, HTTP_ADDR and OTLP_GRPC_ADDR listen with TLS 1.3. Setting only one is a startup error.