Deploy to Production

This guide ties together the pieces you need to run awaken-server in production. It assumes you already have an agent runtime wired into a server (see Serve & Integrate).

Compile a release binary (or container image) with only the cargo features your deployment uses — durable stores are feature-gated:

    cargo build --release -p your-server-crate \
      --features "server,postgres,observability,a2a"
| Feature | Enables |
|---|---|
| postgres | PostgresStore + PgCommitCoordinator (durable config/thread/run) |
| nats | NATS-backed mailbox + buffered thread store for multi-replica |
| file | FileStore; fine for a single node, not for HA |
| observability / otel | Runtime stats, Prometheus, OpenTelemetry export |
| permission | Tool-permission HITL |
| a2a | Agent-to-Agent backend + routes |

| Variable | Purpose |
|---|---|
| AWAKEN_HTTP_ADDR | Bind address. Production: 0.0.0.0:<port> behind a proxy (the ServerConfig default is 0.0.0.0:3000). |
| AWAKEN_ADMIN_API_BEARER_TOKEN | Bearer token that protects every config/admin route. Required in production. |
| AWAKEN_ADMIN_CORS_ALLOWED_ORIGINS | Comma-separated origins allowed to call the admin API from a browser console. |
| AWAKEN_STORAGE_DIR | Storage dir for file-based stores (dev/single-node). |
| AWAKEN_SEED_PROFILE | Builtin seed (minimal / demo). Usually unset in prod once you manage config yourself. |
| AWAKEN_EXPOSE_TRACE_ROUTES | Exposes trace-read routes. Traces contain prompts/tool args; only expose behind auth. |

Provider credentials are config, not code: set api_key on the provider, or run keyless with adapter_options.allow_env_credentials = true and the adapter’s env var (e.g. VERTEX_API_KEY). See Provider & Model Config.

A dev binary may run on in-memory or file stores; production should not.

  • Config / threads / runs → Postgres. Wire PostgresStore and a PgCommitCoordinator (all durable runtime writes go through the commit coordinator). See Use Postgres Store.
  • Multi-replica dispatch → NATS mailbox + buffered thread store. See Use NATS Stores.
  • The file/in-memory dev coordinator is refused unless AWAKEN_ALLOW_DEV_FILE_COORDINATOR is set — do not set it in production.
  • Inject the admin token and provider credentials from your platform’s secret manager as env vars; never bake them into the image or commit them.
  • Prefer keyless providers (allow_env_credentials) where the upstream supports short-lived env tokens, so no long-lived key sits in config.
  • Rotate the admin bearer token and provider keys on a schedule.
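
The storage and secrets rules above can be checked mechanically before the server starts. The following is a minimal sketch, assuming you launch the server from a wrapper entrypoint script. The `preflight` function is hypothetical and not part of awaken-server; only the environment variable names come from this guide.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check for a container entrypoint (not part of awaken-server).
# Writes one line per finding to stderr and returns non-zero if any error is found.
preflight() {
  local rc=0

  # The admin token protects every config/admin route and is required in production.
  if [ -z "${AWAKEN_ADMIN_API_BEARER_TOKEN:-}" ]; then
    echo "error: AWAKEN_ADMIN_API_BEARER_TOKEN is not set" >&2
    rc=1
  fi

  # The dev file/in-memory coordinator must stay disabled in production.
  if [ -n "${AWAKEN_ALLOW_DEV_FILE_COORDINATOR:-}" ]; then
    echo "error: AWAKEN_ALLOW_DEV_FILE_COORDINATOR must not be set in production" >&2
    rc=1
  fi

  # A builtin seed profile is usually left unset once config is managed externally.
  # This is reported as a warning only and does not change the return code.
  if [ -n "${AWAKEN_SEED_PROFILE:-}" ]; then
    echo "warning: AWAKEN_SEED_PROFILE is set to '${AWAKEN_SEED_PROFILE}'" >&2
  fi

  return "$rc"
}
```

In an entrypoint, calling `preflight || exit 1` before launching the server binary would stop a misconfigured container from starting.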

Run the server on a private interface and put a reverse proxy (nginx, Caddy, Envoy, or your cloud LB) in front to terminate TLS and forward to AWAKEN_HTTP_ADDR. The server speaks plain HTTP/SSE, so do not expose it directly to the public internet. Make sure the proxy does not buffer SSE responses (disable proxy buffering on the /v1/** streaming routes); otherwise live token streaming will stall.
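
As one illustration of the proxy setup, the sketch below prints an nginx server block that terminates TLS and turns off buffering for the streaming routes. It assumes nginx is your proxy. The `render_nginx_conf` helper, the hostname, the certificate paths, and the upstream address are all placeholders, not values defined by awaken-server.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints an nginx server block for the given upstream address.
# Hostname, certificate paths, and the default upstream are placeholders.
render_nginx_conf() {
  local upstream="${1:-127.0.0.1:3000}"
  cat <<EOF
server {
    listen 443 ssl;
    server_name awaken.example.com;
    ssl_certificate     /etc/nginx/tls/fullchain.pem;
    ssl_certificate_key /etc/nginx/tls/privkey.pem;

    # Streaming routes: pass SSE through without buffering.
    location /v1/ {
        proxy_pass http://${upstream};
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_buffering off;
        proxy_read_timeout 1h;
    }

    # Everything else is proxied with nginx defaults.
    location / {
        proxy_pass http://${upstream};
    }
}
EOF
}
```

For example, `render_nginx_conf 10.0.0.5:3000 > awaken.conf` writes a config that forwards to a server bound on a private address. Caddy, Envoy, and cloud load balancers have their own equivalents of the buffering setting.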

| Endpoint | Use |
|---|---|
| GET /health/live | Liveness: always 200 while the process is up. |
| GET /health | Readiness: gates traffic on critical dependencies. |
| GET /metrics | Prometheus scrape (with the observability feature). |

Point your orchestrator’s livenessProbe at /health/live and readinessProbe at /health.
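
If your platform has no built-in HTTP probes, or you want to check the endpoints by hand, a small curl wrapper is enough. This is a sketch under stated assumptions: `status_ok` and `check_endpoint` are hypothetical helper names, and the success range mirrors Kubernetes httpGet probe semantics, where any status from 200 to 399 passes.

```shell
#!/usr/bin/env bash
# Kubernetes-style httpGet semantics: any status from 200 to 399 counts as success.
status_ok() {
  [ "$1" -ge 200 ] && [ "$1" -lt 400 ]
}

# Hypothetical helper: returns 0 if GET <url> answers with a success status.
check_endpoint() {
  local code
  # curl reports 000 when no HTTP response was received (refused, timeout, ...).
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$1") || code=000
  status_ok "$code"
}
```

Assuming the server listens on 127.0.0.1:3000, `check_endpoint http://127.0.0.1:3000/health/live` checks liveness and `check_endpoint http://127.0.0.1:3000/health` checks readiness. Restart on a liveness failure, but only withhold traffic on a readiness failure.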

Enable the observability plugin and an exporter (Prometheus or OTel) so runtime stats, latency, and tool/inference metrics are visible — and so the Admin Console’s per-agent stats render. See Enable Observability.

  • Keep AWAKEN_ADMIN_API_BEARER_TOKEN set and scope CORS to your console origin.
  • Serve the Admin Console behind your edge auth; it is a browser client of the same admin API.
  • Treat the audit log as your change record — wire it in (retention configured separately) so every config write is attributable.

Production checklist:

  • Release build with only the needed features
  • Postgres (and NATS for multi-replica) wired; dev coordinator off
  • Admin bearer token + scoped CORS set from secrets
  • Provider credentials injected from secrets (or keyless env)
  • TLS terminated at a proxy; SSE buffering disabled
  • Liveness /health/live + readiness /health probes wired
  • Observability exporter enabled; audit log on