
State & Storage

This path is for teams moving beyond stateless demos. It helps you decide:

  • where thread and run data should live
  • where runtime config, mailbox jobs, and profile/shared state should live
  • how state is keyed and merged
  • how much context should reach the model each turn
  • how to model parent–child threads when sub-agents create their own threads

Threads carry an optional parent_thread_id. The runtime sets it on a child thread the first time a sub-agent run materializes that thread, taking the value from RunActivationSnapshot.trace.parent_thread_id (or the legacy RunRequestSnapshot.parent_thread_id).

ThreadStore exposes list_child_threads, validate_thread_hierarchy, and delete_thread_with_strategy(reject | detach | cascade) so callers can pick a child-handling policy explicitly. Detach, the default strategy, preserves children and clears their parent_thread_id. The default implementation of delete_thread_with_strategy is not atomic across the child writes and the final delete, so production stores with concurrent writers should override it. The file, PostgreSQL, and NATS-buffered backends ship native overrides.
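The three child-handling policies can be illustrated with a small in-memory model. This is a minimal sketch, not Awaken's ThreadStore: ToyStore, DeleteStrategy, DeleteError, and delete_thread are hypothetical names, and the assumption that cascade removes all descendants recursively is ours. Only the reject, detach, and cascade semantics named above come from the text.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the child-handling policy described above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DeleteStrategy {
    Reject,
    Detach,
    Cascade,
}

#[derive(Debug, PartialEq)]
pub enum DeleteError {
    NotFound,
    HasChildren(Vec<String>),
}

/// Toy store: maps a thread id to its optional parent_thread_id.
#[derive(Default)]
pub struct ToyStore {
    parents: HashMap<String, Option<String>>,
}

impl ToyStore {
    pub fn insert(&mut self, id: &str, parent: Option<&str>) {
        self.parents.insert(id.to_string(), parent.map(str::to_string));
    }

    pub fn contains(&self, id: &str) -> bool {
        self.parents.contains_key(id)
    }

    pub fn parent_of(&self, id: &str) -> Option<&str> {
        self.parents.get(id).and_then(|p| p.as_deref())
    }

    /// Direct children of `parent`, sorted for deterministic output.
    pub fn list_child_threads(&self, parent: &str) -> Vec<String> {
        let mut children: Vec<String> = self
            .parents
            .iter()
            .filter(|(_, p)| p.as_deref() == Some(parent))
            .map(|(id, _)| id.clone())
            .collect();
        children.sort();
        children
    }

    /// Assumes an acyclic hierarchy; a cycle would make Cascade recurse forever.
    pub fn delete_thread(&mut self, id: &str, strategy: DeleteStrategy) -> Result<(), DeleteError> {
        if !self.parents.contains_key(id) {
            return Err(DeleteError::NotFound);
        }
        let children = self.list_child_threads(id);
        match strategy {
            // Reject: refuse to delete a thread that still has children.
            DeleteStrategy::Reject if !children.is_empty() => {
                return Err(DeleteError::HasChildren(children));
            }
            DeleteStrategy::Reject => {}
            // Detach: keep the children but clear their parent pointer.
            DeleteStrategy::Detach => {
                for child in &children {
                    self.parents.insert(child.clone(), None);
                }
            }
            // Cascade (assumed semantics): delete every descendant first.
            DeleteStrategy::Cascade => {
                for child in &children {
                    self.delete_thread(child, DeleteStrategy::Cascade)?;
                }
            }
        }
        self.parents.remove(id);
        Ok(())
    }
}
```

Because the toy store mutates a single map behind `&mut self`, it sidesteps the atomicity concern entirely. A real backend with concurrent writers would need a transaction or equivalent around the child updates and the final delete.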

Pagination: list_threads_query(&ThreadQuery) supports parent_filter (Any, Root, or Parent(parent_id)) and resource_id filters with cursor tokens that are validated against the original query shape on decode. list_message_records(thread_id, &MessageQuery) paginates messages with sequence-number windows, asc/desc ordering, visibility filters, and producing-run filters.
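One way to validate a cursor against the query that produced it is to embed a fingerprint of the query shape in the token. The sketch below is an illustration under that assumption, not Awaken's actual cursor encoding: ToyQuery, ToyParentFilter, CursorError, encode_cursor, and decode_cursor are hypothetical, and the token layout is invented for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy mirror of the parent filter variants named above.
#[derive(Clone, Debug, Hash, PartialEq)]
pub enum ToyParentFilter {
    Any,
    Root,
    Parent(String),
}

/// Toy query shape: only the fields that define which rows a page belongs to.
#[derive(Clone, Debug, Hash, PartialEq)]
pub struct ToyQuery {
    pub parent_filter: ToyParentFilter,
    pub resource_id: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum CursorError {
    Malformed,
    QueryMismatch,
}

/// Hash of the query shape. DefaultHasher output is not guaranteed stable
/// across Rust releases, so a durable token format would need a fixed hash.
fn fingerprint(query: &ToyQuery) -> u64 {
    let mut hasher = DefaultHasher::new();
    query.hash(&mut hasher);
    hasher.finish()
}

/// Token layout (invented for this sketch): "<fingerprint hex>:<next offset>".
pub fn encode_cursor(query: &ToyQuery, next_offset: usize) -> String {
    format!("{:016x}:{}", fingerprint(query), next_offset)
}

/// Rejects tokens that are malformed or were issued for a different query.
pub fn decode_cursor(query: &ToyQuery, token: &str) -> Result<usize, CursorError> {
    let (fp, offset) = token.split_once(':').ok_or(CursorError::Malformed)?;
    let fp = u64::from_str_radix(fp, 16).map_err(|_| CursorError::Malformed)?;
    let offset: usize = offset.parse().map_err(|_| CursorError::Malformed)?;
    if fp != fingerprint(query) {
        return Err(CursorError::QueryMismatch);
    }
    Ok(offset)
}
```

Rejecting a mismatched cursor keeps a client from silently resuming one listing with a position that was computed for another.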

  1. Start with Use File Store or Use Postgres Store to choose a persistence backend.
  2. Read State Keys and Thread Model to understand state layout and lifecycle.
  3. Read Optimize Context Window when context size starts to matter.

Current built-in stores cover memory, file, PostgreSQL, SQLite mailbox, and NATS JetStream. Use the smallest backend that covers the durability boundary you need:

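| Capability | Memory | File | PostgreSQL | SQLite | NATS |
|---|---|---|---|---|---|
| Thread/run projections | yes | yes | yes | no | via NatsBufferedThreadStore decorator |
| Managed config | yes | yes | yes | no | no |
| Profile/shared state | yes | yes | no | no | no |
| Canonical events | yes | no | yes | no | no |
| Protocol replay log | yes | no | yes | no | no |
| Outbox/checkpoint repair | yes | no | yes | no | no |
| Stream checkpoints | yes | no | yes | no | no |
| Versioned registry | yes | yes | yes | no | no |
| Mailbox jobs | yes | no | no | single-node durable | distributed durable |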

NatsBufferedThreadStore can wrap any thread/run backend to coalesce checkpoint writes through a JetStream WAL.
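The idea of coalescing checkpoint writes behind a decorator can be sketched as follows. This is a simplified illustration and not how NatsBufferedThreadStore is implemented: CheckpointSink, RecordingSink, and CoalescingSink are hypothetical, and the buffer here is an in-memory map rather than a JetStream WAL, so unlike a real WAL it would not survive a crash.

```rust
use std::collections::HashMap;

/// Hypothetical minimal backend contract; a real thread/run store is richer.
pub trait CheckpointSink {
    fn write_checkpoint(&mut self, thread_id: &str, payload: &str);
}

/// Backend double that records every write so the effect is observable.
#[derive(Default)]
pub struct RecordingSink {
    pub writes: Vec<(String, String)>,
}

impl CheckpointSink for RecordingSink {
    fn write_checkpoint(&mut self, thread_id: &str, payload: &str) {
        self.writes.push((thread_id.to_string(), payload.to_string()));
    }
}

/// Decorator that keeps only the latest pending checkpoint per thread.
pub struct CoalescingSink<S: CheckpointSink> {
    inner: S,
    pending: HashMap<String, String>,
}

impl<S: CheckpointSink> CoalescingSink<S> {
    pub fn new(inner: S) -> Self {
        Self { inner, pending: HashMap::new() }
    }

    /// Buffers a checkpoint; a later write for the same thread replaces it.
    pub fn buffer(&mut self, thread_id: &str, payload: &str) {
        self.pending.insert(thread_id.to_string(), payload.to_string());
    }

    /// Forwards one write per buffered thread and returns how many were sent.
    pub fn flush(&mut self) -> usize {
        let mut drained: Vec<(String, String)> = self.pending.drain().collect();
        drained.sort(); // deterministic order for the example
        for (thread_id, payload) in &drained {
            self.inner.write_checkpoint(thread_id, payload);
        }
        drained.len()
    }

    pub fn inner(&self) -> &S {
        &self.inner
    }
}
```

Three buffered checkpoints across two threads reach the wrapped backend as two writes, which is the saving a coalescing layer is meant to provide.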

Awaken separates runtime execution state from the server control plane. Runtime development can use the in-process AgentRuntime with a commit coordinator and profile/shared state stores. Server development adds mailbox dispatch, canonical events, protocol replay, config versioning, audit, and eval/trace persistence around that runtime.

| Data | Contract | Runtime-only use | Server use |
|---|---|---|---|
| Thread and run projections | ThreadRunStore plus CommitCoordinator | Checkpoint read/write boundary for AgentRuntime | Same projections, usually committed through a server staged coordinator |
| Pending user input and dispatch lifecycle | MailboxStore | Not required unless the app builds its own queue | Durable background runs, resume, cancel, interrupt, HITL, protocol delivery |
| Canonical events | EventStore | Not required for basic in-process runs | Durable event list/SSE resume and protocol replay |
| Outbox/staged ids | StagedCommitCoordinator / ThreadCommitStagedOutcome | Runtime does not observe event/outbox ids | Server/store implementations publish event and outbox ids after commit |
| Managed registry config | ConfigStore, ConfigRuntimeManager | Optional; code can build registries directly | /v1/config/*, admin console edits, audit restore, hot publication |
| Admin audit | AuditLogStore | Optional | Required for version history, restore, and operator accountability |
| Profile/shared state | ProfileStore, shared-state store | Cross-run memory and learned priors | Same stores, usually shared by all served runs |
| Trace/eval data | trace store, eval stores | Optional test/operator tooling | Admin trace views, trace-to-fixture curation, eval datasets/runs |

The runtime commit outcome is intentionally narrow: ThreadCommitOutcome represents runtime commit success/failure only. Server-side implementations that need canonical event ids, server event ids, or outbox ids should use the server-contract staged outcome.

Mailbox jobs are run-dispatch control-plane records. They are separate from the thread/run checkpoint store, so a deployment can combine, for example, PostgreSQL thread storage with a NATS mailbox.

Mailbox dispatch status is a delivery lifecycle. Acked means the dispatch was accepted or consumed; execution success is represented by the related RunRecord.status, termination reason, and canonical events.
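The distinction between delivery and execution outcome can be made concrete with two separate status types. This is a sketch under assumptions: apart from Acked, every variant name below is hypothetical, as are DispatchView, delivered, and run_succeeded. The real RunRecord.status values are not listed in this section.

```rust
/// Delivery lifecycle of a mailbox dispatch. Only `Acked` is named in the
/// text; `Queued` and `Claimed` are hypothetical placeholders.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DispatchStatus {
    Queued,
    Claimed,
    Acked,
}

/// Hypothetical stand-in for the status carried by a run record.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RunStatus {
    Running,
    Completed,
    Failed,
}

/// Joined view of a dispatch and the run it triggered, if any.
pub struct DispatchView {
    pub dispatch: DispatchStatus,
    pub run: Option<RunStatus>,
}

/// True once the dispatch was accepted or consumed.
pub fn delivered(view: &DispatchView) -> bool {
    view.dispatch == DispatchStatus::Acked
}

/// Execution success is read from the run, never from the dispatch status.
pub fn run_succeeded(view: &DispatchView) -> bool {
    matches!(view.run, Some(RunStatus::Completed))
}
```

An acked dispatch whose run failed is delivered but not successful, so monitoring that treats Acked as success would miss failed runs.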

| Backend | Use when | Boundary |
|---|---|---|
| InMemoryMailboxStore | Tests, local development, and embedded single-process runs. | Process-local only; queued dispatches are lost when the process exits. |
| SqliteMailboxStore | A single-node server needs durable mailbox jobs without running NATS. | Durable on local storage, but not the horizontally-scaled mailbox backend. |
| NatsMailboxStore | Multiple server instances need shared dispatch ownership, wakeups, and lease recovery. | Requires JetStream and KV; all instances must share the same stream, buckets, and durable consumer. |

See Use NATS Stores for distributed mailbox configuration and operations.