Introduction

Awaken is a modular AI agent runtime framework built in Rust. It provides:

  • phase-based execution with snapshot isolation and deterministic replay
  • a typed state engine with key scoping (thread / run) and merge strategies (exclusive / commutative)
  • a plugin lifecycle system for extensibility
  • a multi-protocol server surface supporting AI SDK v6, AG-UI, A2A, and MCP over HTTP and stdio, plus ACP over stdio

Crate Overview

Crate – Description
awaken-contract – Core contracts: types, traits, state model, agent specs
awaken-runtime – Execution engine: phase loop, plugin system, agent loop, builder
awaken-server – HTTP/SSE gateway with protocol adapters
awaken-stores – Storage backends: memory, file, postgres
awaken-tool-pattern – Glob/regex tool matching for permission and reminder rules
awaken-ext-permission – Permission plugin with allow/deny/ask policies
awaken-ext-observability – OpenTelemetry-based LLM and tool call tracing
awaken-ext-mcp – Model Context Protocol client integration
awaken-ext-skills – Skill package discovery and activation
awaken-ext-reminder – Declarative reminder rules triggered after tool execution
awaken-ext-generative-ui – Declarative UI components (A2UI protocol)
awaken-ext-deferred-tools – Deferred tool loading with probabilistic promotion
awaken – Facade crate that re-exports core modules

Architecture

Application code
  registers tools / models / providers / plugins / agent specs
        |
        v
AgentRuntime
  resolves AgentSpec -> ResolvedAgent
  builds ExecutionEnv from plugins
  runs the phase loop and exposes cancel / decision control
        |
        v
Server + storage surfaces
  HTTP routes, SSE replay, mailbox, protocol adapters, thread/run persistence

Core Principle

All state access follows snapshot isolation. Phase hooks see an immutable snapshot; mutations are collected in a MutationBatch and applied atomically after convergence.
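The discipline can be illustrated with plain std types. The sketch below is a simplified stand-in: `Snapshot`, `MutationBatch`, and `apply` here are toy names for illustration, not the runtime's actual API.

```rust
use std::collections::HashMap;

// Toy stand-in for the state map a phase hook may read.
#[derive(Clone)]
struct Snapshot(HashMap<String, i64>);

// Toy stand-in for a MutationBatch: hooks record intended updates
// instead of mutating the snapshot they read from.
struct MutationBatch(Vec<(String, i64)>);

impl MutationBatch {
    fn new() -> Self {
        MutationBatch(Vec::new())
    }
    fn update(&mut self, key: &str, delta: i64) {
        self.0.push((key.to_string(), delta));
    }
}

// All collected mutations are applied in one step after the hooks converge.
fn apply(mut state: HashMap<String, i64>, batch: MutationBatch) -> HashMap<String, i64> {
    for (key, delta) in batch.0 {
        *state.entry(key).or_insert(0) += delta;
    }
    state
}

fn main() {
    let state: HashMap<String, i64> = HashMap::new();
    let snapshot = Snapshot(state.clone()); // hooks only ever read this copy

    let mut batch = MutationBatch::new();
    batch.update("greet_count", 1);
    batch.update("greet_count", 1);
    assert_eq!(snapshot.0.get("greet_count"), None); // snapshot stays immutable

    let state = apply(state, batch); // atomic application after convergence
    assert_eq!(state.get("greet_count"), Some(&2));
    println!("greet_count = {}", state["greet_count"]);
}
```

Because hooks only record updates, two hooks can run against the same snapshot without observing each other's writes.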

What’s in This Book

  • Get Started — build a working mental model with the smallest runnable flows
  • Build Agents — add tools, plugins, MCP, skills, reminders, handoff, and UI capabilities
  • Serve & Integrate — expose HTTP endpoints and wire AI SDK or CopilotKit frontends
  • State & Storage — choose persistence, context shaping, and state lookup patterns
  • Operate — harden runtime behavior with observability, permissions, progress reporting, and tests
  • Reference — API, protocol, config, and schema lookup pages
  • Architecture — runtime layering, phase execution, and design tradeoffs

If you are new to the repository, use this order:

  1. Start with Get Started and complete First Agent.
  2. Move to Build Agents when you are ready to add tools and plugins.
  3. Use Serve & Integrate when the runtime needs to talk to HTTP clients or frontends.
  4. Use State & Storage and Operate as you move from demos to production behavior.
  5. Keep Reference Overview and Architecture open when you need exact contracts or runtime internals.

Repository Map

These paths matter most when you move from docs into code:

Path – Purpose
crates/awaken-contract/ – Core contracts: tools, events, state interfaces
crates/awaken-runtime/ – Agent runtime: execution engine, plugins, builder
crates/awaken-server/ – HTTP/SSE server surfaces
crates/awaken-stores/ – Storage backends
crates/awaken/examples/ – Small runtime examples
examples/src/ – Full-stack server examples
docs/book/src/ – This documentation source

Get Started

Use this path if you are new to Awaken and want a working mental model before wiring production features.

Read in order

  1. First Agent for the smallest runnable runtime.
  2. First Tool to understand tool schemas, execution, and state writes.
  3. Build an Agent when you want a reusable project baseline.
  4. Tool Trait before writing production tools.

Leave this path when

  • You can run an agent end-to-end and want to add tools, plugins, or protocol integrations; continue with Build Agents.

First Agent

Goal

Run one agent end-to-end and confirm you receive a complete event stream.

Prerequisites

[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"

Set one model provider key before running:

# OpenAI-compatible models (for gpt-4o-mini)
export OPENAI_API_KEY=<your-key>

# Or DeepSeek models
export DEEPSEEK_API_KEY=<your-key>

1. Create src/main.rs

use std::sync::Arc;
use serde_json::{json, Value};
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::contract::message::Message;
use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::AgentSpec;
use awaken::registry::ModelEntry;
use awaken::{AgentRuntimeBuilder, RunRequest};

struct EchoTool;

#[async_trait]
impl Tool for EchoTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("echo", "Echo", "Echo input back to the caller")
            .with_parameters(json!({
                "type": "object",
                "properties": { "text": { "type": "string" } },
                "required": ["text"]
            }))
    }

    async fn execute(
        &self,
        args: Value,
        _ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let text = args["text"].as_str().unwrap_or_default();
        Ok(ToolResult::success("echo", json!({ "echoed": text })).into())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent_spec = AgentSpec::new("assistant")
        .with_model("gpt-4o-mini")
        .with_system_prompt("You are a helpful assistant. Use the echo tool when asked.")
        .with_max_rounds(5);

    let runtime = AgentRuntimeBuilder::new()
        .with_agent_spec(agent_spec)
        .with_tool("echo", Arc::new(EchoTool))
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model("gpt-4o-mini", ModelEntry {
            provider: "openai".into(),
            model_name: "gpt-4o-mini".into(),
        })
        .build()?;

    let request = RunRequest::new(
        "thread-1",
        vec![Message::user("Say hello using the echo tool")],
    )
    .with_agent_id("assistant");

    let sink = Arc::new(VecEventSink::new());
    runtime.run(request, sink.clone()).await?;

    let events = sink.take();
    println!("events: {}", events.len());

    let finished = events
        .iter()
        .any(|e| matches!(e, AgentEvent::RunFinish { .. }));
    println!("run_finish_seen: {}", finished);

    Ok(())
}

2. Run

cargo run

3. Verify

Expected output includes:

  • events: <n> where n > 0
  • run_finish_seen: true

The event stream will contain at least RunStart, one or more TextDelta or ToolCallStart/ToolCallDone events, and a final RunFinish.
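The same check can be run against a simplified stand-in enum. The real AgentEvent variants carry IDs and payloads beyond what is shown here; this is only an illustration of the matching pattern.

```rust
// Simplified stand-in for AgentEvent; the real enum carries more fields.
#[derive(Debug, PartialEq)]
enum Event {
    RunStart,
    TextDelta(String),
    ToolCallStart(String),
    ToolCallDone(String),
    RunFinish,
}

// Same shape as the matches! check in main.rs above.
fn run_finished(events: &[Event]) -> bool {
    events.iter().any(|e| matches!(e, Event::RunFinish))
}

fn main() {
    let events = vec![
        Event::RunStart,
        Event::ToolCallStart("echo".into()),
        Event::ToolCallDone("echo".into()),
        Event::TextDelta("Hello!".into()),
        Event::RunFinish,
    ];
    assert!(run_finished(&events));
    println!("run_finish_seen: {}", run_finished(&events));
}
```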

What You Created

This example creates an in-process AgentRuntime and runs one request immediately.

The core object is:

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("echo", Arc::new(EchoTool))
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model("gpt-4o-mini", ModelEntry {
        provider: "openai".into(),
        model_name: "gpt-4o-mini".into(),
    })
    .build()?;

After that, the normal entry point is:

let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
let events = sink.take();

Common usage patterns:

  • one-shot CLI program: construct RunRequest, collect events via VecEventSink, print result
  • application service: wrap runtime.run(...) inside your own app logic
  • HTTP server: store Arc<AgentRuntime> in app state and expose protocol routes

Use the next page based on what you want:

Common Errors

  • Model/provider mismatch: gpt-4o-mini requires a compatible OpenAI-style provider setup.
  • Missing key: set OPENAI_API_KEY or DEEPSEEK_API_KEY before cargo run.
  • Tool not selected: ensure the prompt explicitly asks to use echo.
  • No RunFinish event: check that with_max_rounds is set high enough for the model to complete.

Next

First Tool

Goal

Implement one tool that reads typed state from ToolCallContext during execution.

State is optional. Many tools (API calls, search, shell commands) don’t need state – just implement execute and return a ToolResult.

Prerequisites

[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

1. Define a StateKey

A StateKey describes one named slot in the state map. It declares the value type, how updates are applied, and the lifetime scope.

use awaken::{StateKey, KeyScope, MergeStrategy};

/// Tracks how many times the greeting tool has been called.
struct GreetCount;

impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;

    type Value = u32;
    type Update = u32;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}
fn main() {}

Key choices:

  • KeyScope::Run – state resets at the start of each run. Use KeyScope::Thread to persist across runs.
  • MergeStrategy::Commutative – safe for concurrent updates. Use Exclusive when only one writer is expected.
  • apply defines how an Update modifies the current Value. Here it increments.
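Why Commutative is safe for concurrent updates can be checked in isolation: with the additive apply above, the order in which updates arrive does not change the merged value. This is a plain-Rust illustration, not runtime code.

```rust
// The same apply logic as GreetCount: additive updates commute.
fn apply(value: &mut u32, update: u32) {
    *value += update;
}

fn main() {
    // Two writers each contribute an update; order does not matter.
    let mut a = 0u32;
    apply(&mut a, 2);
    apply(&mut a, 5);

    let mut b = 0u32;
    apply(&mut b, 5);
    apply(&mut b, 2);

    assert_eq!(a, b); // both orders converge to 7
    println!("merged value: {a}");
}
```

An Exclusive key, by contrast, assumes a single writer, so last-write-wins semantics are acceptable there.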

2. Implement the Tool

The tool reads the current count via ctx.state::<GreetCount>() and returns a personalized greeting.

use awaken::{StateKey, KeyScope, MergeStrategy};
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};

struct GreetTool;

#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "name": { "type": "string", "description": "Name to greet" }
                },
                "required": ["name"]
            }))
    }

    fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
        args["name"]
            .as_str()
            .filter(|s| !s.is_empty())
            .ok_or_else(|| ToolError::InvalidArguments("name is required".into()))?;
        Ok(())
    }

    async fn execute(
        &self,
        args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let name = args["name"].as_str().unwrap_or("world");

        // Read state -- returns None if the key has not been set yet.
        let count = ctx.state::<GreetCount>().copied().unwrap_or(0);

        Ok(ToolResult::success("greet", json!({
            "greeting": format!("Hello, {}!", name),
            "times_greeted": count,
        })).into())
    }
}
fn main() {}

3. Register the Tool

use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
    }
    async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("greet", json!({})).into())
    }
}
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_spec = AgentSpec::new("assistant")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant. Use the greet tool when asked.")
    .with_max_rounds(5);

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("greet", Arc::new(GreetTool))
    .build()?;
Ok(())
}

4. Run

use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
    }
    async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("greet", json!({})).into())
    }
}
use awaken::contract::message::Message;
use awaken::contract::event_sink::VecEventSink;
use awaken::RunRequest;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(AgentSpec::new("assistant").with_model("gpt-4o-mini"))
    .with_tool("greet", Arc::new(GreetTool))
    .build()?;
let request = RunRequest::new(
    "thread-1",
    vec![Message::user("Greet Alice")],
)
.with_agent_id("assistant");

let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
Ok(())
}

5. Verify

Check the collected events for a ToolCallDone event (greet is the only registered tool, so any ToolCallDone comes from it):

use awaken::contract::event::AgentEvent;

let events = sink.take();
let tool_done = events
    .iter()
    .any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
println!("tool_call_done_seen: {}", tool_done);

Expected:

  • tool_call_done_seen: true
  • The result inside ToolCallDone contains greeting and times_greeted fields.

What You Created

A tool that:

  1. Declares a JSON Schema for its arguments via descriptor().
  2. Validates arguments before execution via validate_args().
  3. Reads typed state from the snapshot via ctx.state::<K>().
  4. Returns structured JSON via ToolResult::success().

The StateKey trait gives you type-safe, scoped state without raw JSON manipulation.

Common Errors

  • ctx.state::<K>() returns None: the state key has not been written yet in this run. Use .unwrap_or_default() or .copied().unwrap_or(0) for numeric defaults.
  • StateError::KeyEncode / StateError::KeyDecode: the Value type does not round-trip through JSON. Ensure Serialize and Deserialize are derived correctly.
  • ToolError::InvalidArguments not surfaced: validate_args is called before execute by the runtime. If you skip validation, bad input reaches execute and may panic on .unwrap().
  • Scope mismatch: KeyScope::Run state is cleared between runs. If you expect persistence, use KeyScope::Thread.
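The scope distinction in the last bullet can be modeled in a few lines of std-only Rust. `State`, `begin_run`, and the scope tags below are a toy model, not the real state engine.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq)]
enum KeyScope {
    Run,
    Thread,
}

// Toy state map tagging each key with its scope.
struct State(HashMap<&'static str, (KeyScope, u32)>);

impl State {
    // At the start of each run, run-scoped entries are dropped;
    // thread-scoped entries survive across runs.
    fn begin_run(&mut self) {
        self.0.retain(|_, (scope, _)| *scope == KeyScope::Thread);
    }
}

fn main() {
    let mut state = State(HashMap::new());
    state.0.insert("greet_count", (KeyScope::Run, 3));
    state.0.insert("user_prefs", (KeyScope::Thread, 1));

    state.begin_run();
    assert!(!state.0.contains_key("greet_count")); // Run scope cleared
    assert!(state.0.contains_key("user_prefs")); // Thread scope persists
    println!("surviving keys: {}", state.0.len());
}
```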

Build Agents

This path is for composing agent behavior after you understand the basics.

  1. Build an Agent to define the runtime, model registry, and agent spec.
  2. Add a Tool and Add a Plugin to extend behavior safely.
  3. Add discovery and delegation layers with Use Skills Subsystem, Use MCP Tools, and Use Agent Handoff.
  4. Add specialized behavior with Use Reminder Plugin, Use Generative UI, and Use Deferred Tools.

Keep nearby

Build an Agent

Use this when you need to assemble an agent with tools, persistence, and a provider into a running AgentRuntime.

Prerequisites

  • awaken crate added to Cargo.toml
  • An LlmExecutor implementation (provider) available
  • Familiarity with AgentSpec and AgentRuntimeBuilder

Steps

  1. Define the agent spec.
use awaken::engine::GenaiExecutor;
use awaken::{AgentSpec, AgentRuntimeBuilder, ModelSpec};

let spec = AgentSpec::new("assistant")
    .with_model("claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_max_rounds(10);
  2. Register tools.
use std::sync::Arc;

let builder = AgentRuntimeBuilder::new()
    .with_agent_spec(spec)
    .with_tool("search", Arc::new(SearchTool))
    .with_tool("calculator", Arc::new(CalculatorTool));
  3. Register a provider and a model.
let builder = builder
    .with_provider("anthropic", Arc::new(GenaiExecutor::new()))
    .with_model("claude-sonnet", ModelSpec {
        id: "claude-sonnet".into(),
        provider: "anthropic".into(),
        model: "claude-sonnet-4-20250514".into(),
    });
  4. Attach persistence.
use awaken::stores::InMemoryStore;

let store = Arc::new(InMemoryStore::new());
let builder = builder.with_thread_run_store(store);
  5. Build and validate.
let runtime = builder.build()?;

build resolves every registered agent and catches missing models, providers, or plugins at startup rather than at request time.
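The fail-fast idea behind build can be sketched as a plain lookup check. `validate` and its arguments below are hypothetical names used only to illustrate the principle, not the builder's internals.

```rust
use std::collections::HashSet;

// Hypothetical sketch: every model an agent spec references must have
// been registered, or construction fails at startup rather than per request.
fn validate(agent_models: &[&str], registered: &HashSet<&str>) -> Result<(), String> {
    for model in agent_models {
        if !registered.contains(model) {
            return Err(format!("agent references unregistered model '{model}'"));
        }
    }
    Ok(())
}

fn main() {
    let registered: HashSet<&str> = ["claude-sonnet"].into_iter().collect();
    // Registered model resolves fine.
    assert!(validate(&["claude-sonnet"], &registered).is_ok());
    // A missing model is caught immediately, not at request time.
    assert!(validate(&["gpt-4o-mini"], &registered).is_err());
}
```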

  6. Execute a run.
use std::sync::Arc;
use awaken::RunRequest;
use awaken::contract::event_sink::VecEventSink;

let request = RunRequest::new("thread-1", vec![user_message])
    .with_agent_id("assistant");

let sink = Arc::new(VecEventSink::new());
let handle = runtime.run(request, sink.clone()).await?;

Verify

Call the /health endpoint (if using the server feature) or inspect the returned AgentRunResult to confirm the agent loop completed without errors.

Common Errors

Error – Cause – Fix
BuildError::ValidationFailed – Agent spec references a model or provider not registered in the builder – Register the missing model/provider before calling build
BuildError::State – Duplicate state key registration across plugins – Ensure each StateKey is registered by exactly one plugin
RuntimeError at run time – Provider returns an inference error – Check provider credentials and model ID

Examples

examples/src/research/ – a research agent with search and report-writing tools.

Key Files

  • crates/awaken-runtime/src/builder.rs – AgentRuntimeBuilder
  • crates/awaken-contract/src/registry_spec.rs – AgentSpec
  • crates/awaken-runtime/src/runtime/agent_runtime/mod.rs – AgentRuntime

Add a Tool

Use this when you need to expose a custom capability to the agent by implementing the Tool trait.

Prerequisites

  • awaken crate added to Cargo.toml
  • async-trait and serde_json available

Steps

  1. Implement the Tool trait.
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};

async fn fetch_weather(_city: &str) -> Result<String, ToolError> {
    Ok("Sunny, 22°C".to_string())
}

pub struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("get_weather", "Get Weather", "Fetch current weather for a city")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["city"]
            }))
    }

    async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        let city = args["city"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("Missing 'city'".into()))?;

        let weather = fetch_weather(city).await?;

        Ok(ToolResult::success("get_weather", json!({ "forecast": weather })).into())
    }
}
}
  2. Optionally override argument validation.
fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
    if !args.get("city").and_then(|v| v.as_str()).is_some_and(|s| !s.is_empty()) {
        return Err(ToolError::InvalidArguments("'city' must be a non-empty string".into()));
    }
    Ok(())
}

validate_args runs before execute and lets you reject malformed input early.

  3. Register the tool with the builder.
use std::sync::Arc;
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_tool("get_weather", Arc::new(WeatherTool))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

The string ID passed to with_tool must match the id in ToolDescriptor::new.

  4. Register via a plugin (alternative).

    Tools can also be registered inside a Plugin::register method through the PluginRegistrar:

fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
    registrar.register_tool("get_weather", Arc::new(WeatherTool))?;
    Ok(())
}

Plugin-registered tools are scoped to agents that activate that plugin.

Verify

Send a message that should trigger the tool. Inspect the run result to confirm the tool was called and returned the expected output.

Common Errors

Error – Cause – Fix
ToolError::InvalidArguments – The LLM passed malformed JSON – Tighten the JSON Schema in with_parameters to guide the model
Tool never called – Descriptor id does not match the registered ID – Ensure the ID in ToolDescriptor::new and with_tool are identical
ToolError::ExecutionFailed – Runtime error inside execute – Return a descriptive error; the agent will see it and may retry

Examples

examples/src/research/tools.rs – SearchTool and WriteReportTool implementations.

Key Files

  • crates/awaken-contract/src/contract/tool.rs – Tool trait, ToolDescriptor, ToolResult, ToolError
  • crates/awaken-runtime/src/builder.rs – with_tool registration

Add a Plugin

Use this when you need to extend the agent lifecycle with state keys, phase hooks, scheduled actions, or effect handlers.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Phase variants and StateKey

Steps

  1. Define a state key.
#![allow(unused)]
fn main() {
use awaken::{StateKey, MergeStrategy, StateError, JsonValue};
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AuditLog {
    pub entries: Vec<String>,
}

pub struct AuditLogKey;

impl StateKey for AuditLogKey {
    type Value = AuditLog;
    const KEY: &'static str = "audit_log";
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;

    type Update = AuditLog;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value = update;
    }

    fn encode(value: &Self::Value) -> Result<JsonValue, StateError> {
        serde_json::to_value(value).map_err(|e| StateError::KeyEncode { key: Self::KEY.into(), message: e.to_string() })
    }

    fn decode(json: JsonValue) -> Result<Self::Value, StateError> {
        serde_json::from_value(json).map_err(|e| StateError::KeyDecode { key: Self::KEY.into(), message: e.to_string() })
    }
}
}
  2. Implement a phase hook.
use async_trait::async_trait;
use awaken::{PhaseHook, PhaseContext, StateCommand, StateError};

pub struct AuditHook;

#[async_trait]
impl PhaseHook for AuditHook {
    async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
        let mut log = ctx.state::<AuditLogKey>().cloned().unwrap_or_default();
        log.entries.push(format!("Phase executed at {:?}", ctx.phase));
        let mut cmd = StateCommand::new();
        cmd.update::<AuditLogKey>(log);
        Ok(cmd)
    }
}
  3. Implement the Plugin trait.
use awaken::{Plugin, PluginDescriptor, PluginRegistrar, Phase, StateError, StateKeyOptions, KeyScope};

pub struct AuditPlugin;

impl Plugin for AuditPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "audit" }
    }

    fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
        registrar.register_key::<AuditLogKey>(StateKeyOptions {
            scope: KeyScope::Run,
            ..Default::default()
        })?;

        registrar.register_phase_hook(
            "audit",
            Phase::PostInference,
            AuditHook,
        )?;

        Ok(())
    }
}
  4. Register the plugin and activate it on an agent.
use std::sync::Arc;
use awaken::{AgentSpec, AgentRuntimeBuilder};

let spec = AgentSpec::new("assistant")
    .with_model("anthropic/claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("audit");

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("audit", Arc::new(AuditPlugin))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

with_hook_filter on the agent spec activates the named plugin for that agent.

Verify

Run the agent and inspect the state snapshot. The audit_log key should contain entries added by the hook after each inference phase.

Common Errors

Error – Cause – Fix
StateError::KeyAlreadyRegistered – Two plugins register the same StateKey – Use a unique KEY constant per state key
StateError::UnknownKey – Accessing a key that was never registered – Ensure the plugin calling register_key is activated on the agent
Hook not firing – Plugin ID not listed in with_hook_filter – Add the plugin ID to the agent spec's hook filters

Examples

crates/awaken-ext-observability/ – the built-in observability plugin registers phase hooks and state keys.

Key Files

  • crates/awaken-runtime/src/plugins/lifecycle.rs – Plugin trait
  • crates/awaken-runtime/src/plugins/registry.rs – PluginRegistrar
  • crates/awaken-runtime/src/hooks/phase_hook.rs – PhaseHook trait

Use Skills Subsystem

Use this when you want agents to discover and activate skill packages at runtime, loading instructions, resources, and scripts on demand.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature skills enabled on the awaken crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["skills"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Create skill directories.

Each skill lives in a directory with a SKILL.md file containing YAML frontmatter:

skills/
  my-skill/
    SKILL.md
  another-skill/
    SKILL.md

Example skills/my-skill/SKILL.md:

---
name: my-skill
description: Helps with data analysis tasks
allowed-tools: read_file web_search
---
# My Skill

When this skill is active, focus on data analysis.
Use read_file to load datasets and web_search for reference material.
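The frontmatter layout above can be split from the body with a few lines of std-only Rust. This is an illustrative sketch, not the crate's actual parser.

```rust
// Split "---\n<yaml>\n---\n<body>" into (frontmatter, body).
// Illustrative only; the skills crate has its own parser.
fn split_frontmatter(input: &str) -> Option<(&str, &str)> {
    let rest = input.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    Some((&rest[..end], &rest[end + 5..]))
}

fn main() {
    let skill_md = "---\nname: my-skill\ndescription: Helps with data analysis tasks\n---\n# My Skill\n";
    let (front, body) = split_frontmatter(skill_md).expect("frontmatter present");
    assert!(front.contains("name: my-skill"));
    assert!(body.starts_with("# My Skill"));
    println!("frontmatter:\n{front}");
}
```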
  2. Discover filesystem skills.
use std::sync::Arc;
use awaken::ext_skills::{FsSkill, InMemorySkillRegistry};

let result = FsSkill::discover("./skills").expect("skill discovery failed");
for warning in &result.warnings {
    eprintln!("skill warning: {warning:?}");
}

let skills = FsSkill::into_arc_skills(result.skills);
let registry = Arc::new(InMemorySkillRegistry::from_skills(skills));
  3. Use embedded skills (alternative).

    For compile-time embedded skills, use EmbeddedSkill:

use std::sync::Arc;
use awaken::ext_skills::{EmbeddedSkill, EmbeddedSkillData, InMemorySkillRegistry};

const SKILL_MD: &str = "\
---
name: builtin-skill
description: A compiled-in skill
allowed-tools: read_file
---
Builtin Skill

Follow these instructions when active.
";

let skill = EmbeddedSkill::new(&EmbeddedSkillData {
    skill_md: SKILL_MD,
    references: &[],
    assets: &[],
}).expect("valid skill");

let registry = Arc::new(InMemorySkillRegistry::from_skills(vec![Arc::new(skill)]));
  4. Wire into the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_skills::SkillSubsystem;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let subsystem = SkillSubsystem::new(registry);
let agent_spec = AgentSpec::new("skills-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Discover and activate skills when specialized help is useful.")
    .with_hook_filter("skills-discovery")
    .with_hook_filter("skills-active-instructions");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin(
        "skills-discovery",
        Arc::new(subsystem.discovery_plugin()) as Arc<dyn Plugin>,
    )
    .with_plugin(
        "skills-active-instructions",
        Arc::new(subsystem.active_instructions_plugin()) as Arc<dyn Plugin>,
    )
    .build()
    .expect("failed to build runtime");

The SkillDiscoveryPlugin injects a skills catalog into the LLM context before inference and registers three tools:

Tool – Purpose
skill – Activate a skill by name
load_skill_resource – Load a skill resource or reference
skill_script – Run a skill script

Verify

  1. Run the agent and ask it to list available skills.
  2. The LLM should see the skills catalog in its context and use the skill tool to activate one.
  3. After activation, the ActiveSkillInstructionsPlugin injects the skill instructions into subsequent inference calls.

Common Errors

Symptom – Cause – Fix
No skills discovered – Wrong directory path – Ensure each skill has a SKILL.md in a subdirectory
SkillMaterializeError – Invalid YAML frontmatter – Check that name and description fields are present in SKILL.md
Skill tools not available – Plugin not registered – Register both discovery_plugin() and active_instructions_plugin()
Feature not found – Missing cargo feature – Enable features = ["skills"] in Cargo.toml
Examples

  • crates/awaken-ext-skills/tests/

Key Files

Path – Purpose
crates/awaken-ext-skills/src/lib.rs – Module root and public re-exports
crates/awaken-ext-skills/src/registry.rs – FsSkill, InMemorySkillRegistry, SkillRegistry trait
crates/awaken-ext-skills/src/plugin/subsystem.rs – SkillSubsystem facade
crates/awaken-ext-skills/src/plugin/discovery.rs – SkillDiscoveryPlugin
crates/awaken-ext-skills/src/embedded.rs – EmbeddedSkill for compile-time skills
crates/awaken-ext-skills/src/tools.rs – Skill tool implementations

Use MCP Tools

Use this when you want to connect to external Model Context Protocol (MCP) servers and expose their tools to awaken agents.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature mcp enabled on the awaken crate
  • An MCP server to connect to (stdio or HTTP transport)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["mcp"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Configure MCP server connections.
use awaken::ext_mcp::McpServerConnectionConfig;

// Stdio transport: launch a child process
let stdio_config = McpServerConnectionConfig::stdio(
    "my-mcp-server",          // server name
    "node",                    // command
    vec!["server.js".into()],  // args
);

// HTTP/SSE transport: connect to a running server
let http_config = McpServerConnectionConfig::http(
    "remote-server",
    "http://localhost:8080/sse",
);
  2. Create the registry manager and discover tools.
use awaken::ext_mcp::McpToolRegistryManager;

let manager = McpToolRegistryManager::connect(vec![stdio_config, http_config])
    .await
    .expect("failed to connect MCP servers");

// Tools are now available as awaken Tool instances.
// Each MCP tool is registered with the ID format: mcp__{server}__{tool}
let registry = manager.registry();
for id in registry.ids() {
    println!("discovered: {id}");
}
  3. Register tools with the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_mcp::McpPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let agent_spec = AgentSpec::new("mcp-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Use MCP tools when they help answer the user.")
    .with_hook_filter("mcp");

let mut builder = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("mcp", Arc::new(McpPlugin) as Arc<dyn Plugin>);

let registry = manager.registry();
for (id, tool) in registry.snapshot() {
    builder = builder.with_tool(&id, tool);
}

let runtime = builder.build().expect("failed to build runtime");
  4. Enable periodic refresh (optional).

    MCP servers may add or remove tools at runtime. Enable periodic refresh to keep the tool registry in sync:

use std::time::Duration;

manager.start_periodic_refresh(Duration::from_secs(60));

Verify

  1. Run the agent and ask it to use a tool provided by the MCP server.
  2. Check the backend logs for MCP tool call events.
  3. Confirm the tool result includes mcp.server and mcp.tool metadata in the response.

Common Errors

Symptom | Cause | Fix
McpError::TransportError | MCP server not running or unreachable | Verify the server process is running and the path/URL is correct
No tools discovered | Server returned empty tool list | Check the MCP server implements tools/list
Tool call timeout | Server too slow to respond | Increase timeout in the transport configuration
Feature not found | Missing cargo feature | Enable features = ["mcp"] in Cargo.toml
mcp__server__tool not found | Tools not registered with builder | Loop over manager.registry().snapshot() and call with_tool for each
  • crates/awaken-ext-mcp/tests/

Key Files

Path | Purpose
crates/awaken-ext-mcp/src/lib.rs | Module root and public re-exports
crates/awaken-ext-mcp/src/manager.rs | McpToolRegistryManager lifecycle and tool wrapping
crates/awaken-ext-mcp/src/config.rs | McpServerConnectionConfig transport types
crates/awaken-ext-mcp/src/plugin.rs | McpPlugin integration with awaken plugin system
crates/awaken-ext-mcp/src/transport.rs | McpToolTransport trait and transport helpers
crates/awaken-ext-mcp/tests/mcp_tests.rs | Integration tests

Use the Reminder Plugin

Use this when you want the agent to receive automatic context messages after tool execution based on pattern matching – for example, reminding the agent to run cargo check after editing a .toml file, or warning about destructive commands.

Prerequisites

  • A working awaken agent runtime (see Build an Agent)
  • Feature reminder enabled on the awaken crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["reminder"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the reminder plugin with rules.
use std::sync::Arc;
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken::ext_reminder::{ReminderPlugin, ReminderRulesConfig};

let json = r#"{
    "rules": [
        {
            "tool": "Bash(command ~ 'rm *')",
            "output": { "status": "success" },
            "message": {
                "target": "suffix_system",
                "content": "A deletion command just succeeded. Verify the result."
            }
        }
    ]
}"#;

let config = ReminderRulesConfig::from_str(json, Some("json"))
    .expect("failed to parse reminder config");
let rules = config.into_rules().expect("invalid rules");

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_plugin("reminder", Arc::new(ReminderPlugin::new(rules)) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers an AfterToolExecute phase hook. After every tool call, it evaluates each rule against the tool name, arguments, and result. When a rule matches, it schedules an AddContextMessage action that injects the configured message into the prompt.

  2. Define rules with tool patterns.

    The tool field uses the same pattern DSL as the permission system:

Pattern | Matches
"Bash" | Exact tool name Bash
"*" | Any tool
"mcp__*" | Glob on tool name (all MCP tools)
"Bash(command ~ 'rm *')" | Tool name + primary argument glob
"Edit(file_path ~ '*.toml')" | Named field glob
let json = r#"{
    "rules": [
        {
            "name": "toml-edit-reminder",
            "tool": "Edit(file_path ~ '*.toml')",
            "output": "any",
            "message": {
                "target": "system",
                "content": "You edited a TOML file. Run cargo check to verify.",
                "cooldown_turns": 3
            }
        }
    ]
}"#;

The optional name field gives the rule a human-readable identifier. When omitted, one is auto-generated from the index and tool pattern (e.g. rule-0-Edit(file_path ~ '*.toml')).

  3. Configure output matching.

    The output field controls whether the rule fires based on the tool result. Set it to "any" to match all outputs, or use a structured object with status and/or content:

// Match only errors containing "permission denied"
let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": {
                "status": "error",
                "content": "*permission denied*"
            },
            "message": {
                "target": "suffix_system",
                "content": "Permission denied. Consider using sudo or checking file ownership."
            }
        }
    ]
}"#;

Status values: "success", "error", "pending", "any".

Content matching supports two forms:

  • Text glob (string shorthand): "*permission denied*" – matches the stringified tool output against a glob pattern.
  • JSON fields (structured): matches specific fields in JSON output.
// JSON field matching
let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": {
                "status": "error",
                "content": {
                    "fields": [
                        { "path": "error.code", "op": "exact", "value": "403" }
                    ]
                }
            },
            "message": {
                "target": "suffix_system",
                "content": "HTTP 403 Forbidden. Check authentication credentials."
            }
        }
    ]
}"#;

Field match operations: "glob" (default), "exact", "regex", "not_glob", "not_exact", "not_regex". The path uses dot-separated JSON field names (e.g. "error.code"). When multiple fields are specified, all must match (AND logic).

When both status and content are present, both must match for the rule to fire.
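The field-matching semantics can be sketched in standalone Rust. This is an illustration only, not the awaken-tool-pattern implementation; `Value`, `lookup`, `glob_match`, and `FieldMatch` are hypothetical helpers, and only a subset of the ops is shown:

```rust
use std::collections::BTreeMap;

// Minimal JSON-like value for the sketch (real output is full JSON).
enum Value {
    Str(String),
    Obj(BTreeMap<String, Value>),
}

// Resolve a dot-separated path like "error.code" against nested objects.
fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a str> {
    let mut cur = root;
    for seg in path.split('.') {
        match cur {
            Value::Obj(map) => cur = map.get(seg)?,
            Value::Str(_) => return None,
        }
    }
    match cur {
        Value::Str(s) => Some(s),
        Value::Obj(_) => None,
    }
}

// ASCII-oriented glob: '*' matches any run of characters.
fn glob_match(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == text,
        Some((head, rest)) => {
            text.starts_with(head)
                && (0..=text.len() - head.len())
                    .any(|i| glob_match(rest, &text[head.len() + i..]))
        }
    }
}

struct FieldMatch {
    path: String,     // dot-separated JSON field path, e.g. "error.code"
    op: &'static str, // "glob" (default), "exact", "not_exact", ...
    value: String,
}

// AND logic: every configured field must match for the rule to fire.
fn all_fields_match(output: &Value, fields: &[FieldMatch]) -> bool {
    fields.iter().all(|f| match lookup(output, &f.path) {
        Some(actual) => match f.op {
            "exact" => actual == f.value.as_str(),
            "not_exact" => actual != f.value.as_str(),
            _ => glob_match(&f.value, actual), // default: glob
        },
        None => false, // a missing field never matches
    })
}
```

The shared pattern logic actually lives in the awaken-tool-pattern crate; the sketch only captures the lookup-then-compare shape and the AND combination.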

  4. Choose a message target.

    The target field determines where the injected message appears in the prompt:

Target | Placement
"system" | Prepended to the system prompt section
"suffix_system" | Appended after the system prompt
"session" | Inserted as a session-scoped system message
"conversation" | Inserted as a conversation-scoped system message
  5. Use cooldown to avoid repetition.

    Set cooldown_turns on the message to suppress re-injection for a number of turns after firing:

let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": "any",
            "message": {
                "target": "system",
                "content": "Remember to be careful with file operations.",
                "cooldown_turns": 5
            }
        }
    ]
}"#;

When cooldown_turns is 0 (default), the message is injected on every match.
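The cooldown bookkeeping this implies can be sketched with a hypothetical helper (not the plugin's actual state handling), tracking the turn at which each rule last fired:

```rust
use std::collections::HashMap;

/// Illustrative cooldown tracker: a rule may fire again only after more
/// than `cooldown_turns` turns have elapsed since it last fired.
/// `cooldown_turns == 0` means fire on every match.
/// Assumes turn numbers are monotonically increasing.
struct CooldownTracker {
    last_fired: HashMap<String, u32>, // rule name -> turn it last fired
}

impl CooldownTracker {
    fn new() -> Self {
        Self { last_fired: HashMap::new() }
    }

    fn try_fire(&mut self, rule: &str, turn: u32, cooldown_turns: u32) -> bool {
        let allowed = match self.last_fired.get(rule) {
            Some(&last) => turn - last > cooldown_turns,
            None => true, // never fired before
        };
        if allowed {
            self.last_fired.insert(rule.to_string(), turn);
        }
        allowed
    }
}
```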

  6. Load rules from a file.
use awaken::ext_reminder::ReminderRulesConfig;

let config = ReminderRulesConfig::from_file("reminders.json")
    .expect("failed to load reminder config");
let rules = config.into_rules().expect("invalid rules");
  7. Activate the plugin on an agent spec.
use awaken::registry_spec::AgentSpec;

let agent_spec = AgentSpec::new("my-agent")
    .with_model("anthropic/claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("reminder");

The with_hook_filter("reminder") call activates the reminder plugin’s phase hook for this agent.

Verify

  1. Configure a rule matching a tool you can easily trigger (e.g. "*" with "any" output).
  2. Run the agent and invoke a tool.
  3. Inspect the prompt or enable tracing at debug level. You should see:
    reminder rule matched, scheduling context message
    
  4. Confirm the context message appears in the agent’s next inference prompt at the expected target location.

Common Errors

Symptom | Cause | Fix
InvalidPattern on startup | Malformed tool pattern string | Check syntax against the pattern table above; ensure quotes are escaped in JSON
InvalidTarget on startup | Unknown message target | Use one of: system, suffix_system, session, conversation
InvalidOutput on startup | Unrecognized output matcher string | Use "any" or a structured { "status": ..., "content": ... } object
InvalidOp on startup | Unknown field match operation | Use one of: glob, exact, regex, not_glob, not_exact, not_regex
Rule never fires | Plugin not activated on agent | Add with_hook_filter("reminder") to the agent spec
Rule fires too often | No cooldown configured | Set cooldown_turns to a positive value
  • crates/awaken-ext-reminder/src/config.rs – contains test cases with various rule configurations

Key Files

Path | Purpose
crates/awaken-ext-reminder/src/lib.rs | Module root and public re-exports
crates/awaken-ext-reminder/src/config.rs | ReminderRulesConfig, JSON loading, ReminderConfigKey
crates/awaken-ext-reminder/src/rule.rs | ReminderRule struct definition
crates/awaken-ext-reminder/src/output_matcher.rs | OutputMatcher, ContentMatcher, status/content matching logic
crates/awaken-ext-reminder/src/plugin/plugin.rs | ReminderPlugin registration (AfterToolExecute hook)
crates/awaken-ext-reminder/src/plugin/hook.rs | ReminderHook – pattern + output evaluation per tool call
crates/awaken-tool-pattern/ | Shared glob/regex pattern matching library

Use Generative UI (A2UI)

Use this when you want the agent to send declarative UI components to a frontend – for example, rendering a form, a data table, or an interactive card without the frontend knowing the layout in advance.

Prerequisites

  • A working awaken agent runtime (see Build an Agent)
  • A frontend that consumes A2UI messages from the event stream (e.g. a CopilotKit or AI SDK integration)
  • A component catalog registered on the frontend that defines available UI components
[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the A2UI plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_generative_ui::A2uiPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let plugin = A2uiPlugin::with_catalog_id("my-catalog");
let agent_spec = AgentSpec::new("ui-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Render structured UI when visual output helps.")
    .with_hook_filter("generative-ui");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("generative-ui", Arc::new(plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers a tool called render_a2ui that the LLM can invoke. When the LLM calls this tool with an array of A2UI messages, the tool validates the message structure and returns the validated payload, which flows through the event stream to the frontend.

  2. Understand the A2UI protocol.

    A2UI v0.8 defines four message types. Each message is a JSON object with "version": "v0.8" and exactly one of the following keys:

Message Type | Purpose
createSurface | Initialize a new rendering surface
updateComponents | Define or update the component tree
updateDataModel | Populate or change data values
deleteSurface | Remove a surface

Messages must be sent in order: create the surface first, then define components, then populate data.
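The version and exactly-one-type-key constraints can be sketched as a structural check. This is an illustration over plain key lists; the real validator in validation.rs operates on parsed message structs:

```rust
// The four recognized A2UI v0.8 message type keys.
const MESSAGE_TYPES: [&str; 4] =
    ["createSurface", "updateComponents", "updateDataModel", "deleteSurface"];

/// Illustrative check of one message, represented as its version string
/// plus the set of top-level keys it contains.
fn validate_message(version: &str, keys: &[&str]) -> Result<(), String> {
    if version != "v0.8" {
        return Err(format!("unsupported version: {version}"));
    }
    // Count how many of the top-level keys are message type keys.
    let n = keys.iter().filter(|k| MESSAGE_TYPES.contains(*k)).count();
    match n {
        0 => Err("message has no recognized type key".to_string()),
        1 => Ok(()),
        _ => Err("multiple message types in one object".to_string()),
    }
}
```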

  3. Create a surface.

    Every UI starts by creating a surface with a unique ID and a reference to the frontend’s component catalog:

// The LLM sends this via the render_a2ui tool:
let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "createSurface": {
                "surfaceId": "order-form-1",
                "catalogId": "my-catalog"
            }
        }
    ]
});

The catalogId tells the frontend which set of component definitions to use for rendering.

  4. Define the component tree.

    Components are specified as a flat list with adjacency references. Each component has a unique id and a component type name from the catalog. Parent-child relationships are expressed via child (single child) or children (multiple children):

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "updateComponents": {
                "surfaceId": "order-form-1",
                "components": [
                    {
                        "id": "root",
                        "component": "Card",
                        "child": "layout"
                    },
                    {
                        "id": "layout",
                        "component": "Column",
                        "children": ["title", "name-field", "submit-btn"]
                    },
                    {
                        "id": "title",
                        "component": "Text",
                        "text": "New Order"
                    },
                    {
                        "id": "name-field",
                        "component": "TextField",
                        "label": "Customer Name",
                        "value": { "path": "/customer/name" }
                    },
                    {
                        "id": "submit-btn",
                        "component": "Button",
                        "label": "Submit",
                        "action": {
                            "event": {
                                "name": "submit-order",
                                "context": { "formId": "order-form-1" }
                            }
                        }
                    }
                ]
            }
        }
    ]
});

Rules for the component list:

  • One component must have "id": "root" – this is the tree’s entry point.
  • Every component requires "id" and "component" fields.
  • Use "child" for a single child reference or "children" for multiple.
  • Additional properties (text, label, action, and any extra fields) are passed through to the frontend renderer.
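These rules can be sketched as a check over the flat list. The `validate_components` helper is hypothetical, and the dangling-reference check is an extra sanity check rather than a documented requirement:

```rust
use std::collections::HashSet;

/// Illustrative structural check of a flat component list, where each
/// component is (id, child references). Not awaken's validator.
fn validate_components(components: &[(&str, Vec<&str>)]) -> Result<(), String> {
    let mut ids = HashSet::new();
    // Every component needs a unique id.
    for (id, _) in components {
        if !ids.insert(*id) {
            return Err(format!("duplicate component id: {id}"));
        }
    }
    // Exactly one component is the tree's entry point.
    if !ids.contains("root") {
        return Err("no component with id \"root\"".to_string());
    }
    // Extra sanity check: child / children references should resolve.
    for (id, children) in components {
        for child in children {
            if !ids.contains(child) {
                return Err(format!("{id} references unknown child {child}"));
            }
        }
    }
    Ok(())
}
```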
  5. Bind data with JSON paths.

    Use {"path": "/json/pointer"} in component properties to bind values to the surface’s data model. The frontend resolves these paths against the data model at render time:

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "updateDataModel": {
                "surfaceId": "order-form-1",
                "path": "/",
                "value": {
                    "customer": {
                        "name": "",
                        "email": ""
                    },
                    "items": []
                }
            }
        }
    ]
});

The path field specifies which part of the data model to update. Use "/" to set the entire model, or a nested path like "/customer/name" to update a specific value.
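The path semantics can be sketched over a minimal nested model. This assumes slash-separated segments resolved in the spirit of JSON Pointers; it is not awaken's implementation:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Data {
    Str(String),
    Obj(BTreeMap<String, Data>),
}

/// Illustrative updateDataModel application: "/" replaces the whole model;
/// "/customer/name" walks nested objects, creating intermediate objects
/// (and overwriting non-object values) along the way.
fn apply_update(model: &mut Data, path: &str, value: Data) {
    if path == "/" {
        *model = value;
        return;
    }
    let segments: Vec<&str> = path.trim_start_matches('/').split('/').collect();
    let (last, parents) = segments.split_last().expect("non-empty path");
    let mut cur = model;
    for seg in parents {
        if !matches!(cur, Data::Obj(_)) {
            *cur = Data::Obj(BTreeMap::new());
        }
        cur = match cur {
            Data::Obj(map) => map
                .entry(seg.to_string())
                .or_insert_with(|| Data::Obj(BTreeMap::new())),
            Data::Str(_) => unreachable!(),
        };
    }
    if !matches!(cur, Data::Obj(_)) {
        *cur = Data::Obj(BTreeMap::new());
    }
    if let Data::Obj(map) = cur {
        map.insert(last.to_string(), value);
    }
}
```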

  6. Delete a surface.
let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "deleteSurface": {
                "surfaceId": "order-form-1"
            }
        }
    ]
});
  7. Send multiple messages in one tool call.

    The render_a2ui tool accepts an array of messages, so you can create a surface, define components, and populate data in a single call:

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "createSurface": { "surfaceId": "s1", "catalogId": "my-catalog" }
        },
        {
            "version": "v0.8",
            "updateComponents": {
                "surfaceId": "s1",
                "components": [
                    { "id": "root", "component": "Text", "text": "Hello" }
                ]
            }
        },
        {
            "version": "v0.8",
            "updateDataModel": { "surfaceId": "s1", "path": "/", "value": {} }
        }
    ]
});
  8. Customize plugin instructions.

    The plugin injects prompt instructions that teach the LLM how to use the render_a2ui tool. You can customize these in several ways:

// With catalog ID and custom examples appended to the default instructions
let plugin = A2uiPlugin::with_catalog_and_examples(
    "my-catalog",
    "Example: create a card with a title and a button..."
);

// With fully custom instructions (replaces the default instructions entirely)
let plugin = A2uiPlugin::with_custom_instructions(
    "You can render UI by calling render_a2ui...".to_string()
);

Verify

  1. Register the A2UI plugin and run the agent with a prompt that asks it to display information visually.
  2. The agent should call the render_a2ui tool with valid A2UI messages.
  3. Check the tool result in the event stream – a successful call returns {"a2ui": [...], "rendered": true}.
  4. On the frontend, confirm the surface appears with the expected components.

Common Errors

Symptom | Cause | Fix
missing required field "messages" | Tool called without a messages array | Ensure the LLM sends {"messages": [...]}
messages array must not be empty | Empty messages array | Include at least one A2UI message
unsupported version | Version field is not "v0.8" | Set "version": "v0.8" on every message
multiple message types in one object | A single message contains more than one type key | Each message object must have exactly one of createSurface, updateComponents, updateDataModel, or deleteSurface
components[N].id is required | A component in updateComponents is missing id | Add "id" to every component object
components[N].component is required | A component is missing the type name | Add "component" with a valid catalog type
LLM does not call the tool | Plugin registered but instructions not reaching the LLM | Verify the plugin is activated on the agent spec
  • crates/awaken-ext-generative-ui/src/a2ui/tests.rs – validation and tool execution test cases

Key Files

Path | Purpose
crates/awaken-ext-generative-ui/src/a2ui/mod.rs | A2UI module root, constants, re-exports
crates/awaken-ext-generative-ui/src/a2ui/plugin.rs | A2uiPlugin registration and prompt instructions
crates/awaken-ext-generative-ui/src/a2ui/tool.rs | A2uiRenderTool – validation and execution
crates/awaken-ext-generative-ui/src/a2ui/types.rs | A2uiMessage, A2uiComponent, and related structs
crates/awaken-ext-generative-ui/src/a2ui/validation.rs | validate_a2ui_messages structural checks

Use Agent Handoff

Use this when you need to dynamically switch an agent’s behavior (system prompt, model, or tool set) within a running agent loop without terminating the run or spawning a new thread.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Plugin, StateKey, and AgentRuntimeBuilder

Overview

Handoff performs dynamic same-thread agent variant switching. Instead of ending the current run and starting a new agent, handoff applies an AgentOverlay that overrides parts of the base agent configuration. The switch is instant – no run termination or re-resolution occurs.

Key types:

  • HandoffPlugin – the plugin that manages overlay registration and lifecycle hooks.
  • AgentOverlay – per-variant overrides for system prompt, model, and tool filters.
  • HandoffState – tracks the active variant and any pending handoff request.
  • HandoffAction – reducer actions: Request, Activate, Clear.

Steps

  1. Define agent variant overlays.

Each overlay specifies which parts of the base agent configuration to override. Fields left as None inherit the base agent’s values.

The model field uses the same model registry ID that AgentSpec.model uses.

use awaken::extensions::handoff::AgentOverlay;

let researcher = AgentOverlay {
    system_prompt: Some("You are a research specialist. Find and cite sources.".into()),
    model: Some("claude-sonnet".into()),
    allowed_tools: Some(vec!["web_search".into(), "read_document".into()]),
    excluded_tools: None,
};

let writer = AgentOverlay {
    system_prompt: Some("You are a technical writer. Produce clear documentation.".into()),
    model: None, // inherits base model
    allowed_tools: None, // all tools available
    excluded_tools: Some(vec!["web_search".into()]),
};
  2. Build a HandoffPlugin with variant overlays.
use std::collections::HashMap;
use std::sync::Arc;
use awaken::extensions::handoff::HandoffPlugin;

let mut overlays = HashMap::new();
overlays.insert("researcher".to_string(), researcher);
overlays.insert("writer".to_string(), writer);

let handoff = HandoffPlugin::new(overlays);
  3. Register the plugin on the runtime builder.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("agent_handoff", Arc::new(handoff))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

The plugin ID must be "agent_handoff" (exported as HANDOFF_PLUGIN_ID). The plugin registers hooks on Phase::RunStart and Phase::StepEnd to synchronize handoff state.

  4. Request a handoff from within a tool or hook.

Use the action helpers to create HandoffAction mutations and dispatch them through a StateCommand:

use awaken::extensions::handoff::{request_handoff, activate_handoff, clear_handoff, ActiveAgentKey};
use awaken::state::StateCommand;

// Request a switch to the "researcher" variant (pending until next phase boundary)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(request_handoff("researcher"));

// Directly activate a variant (skips the request step)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(activate_handoff("writer"));

// Clear handoff state and return to the base agent
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(clear_handoff());
  5. Look up overlays from plugin state.

The plugin exposes a method to retrieve the overlay for any registered variant:

let overlay = handoff.overlay("researcher");
// Returns Option<&AgentOverlay>

The effective agent ID is determined by HandoffPlugin::effective_agent, which returns the requested variant if one is pending, otherwise the currently active variant:

let state: &HandoffState = /* from context */;
let agent_id = HandoffPlugin::effective_agent(state);
// Returns Option<&String> -- None means the base agent is active

How It Works

HandoffState has two fields:

  • active_agent: Option<String> – the currently active variant (None = base agent).
  • requested_agent: Option<String> – a pending handoff request, consumed at the next phase boundary.

The internal HandoffSyncHook runs at RunStart and StepEnd. When it detects a requested_agent, it promotes the request to active_agent and clears the request. This two-phase approach ensures the switch happens at a safe boundary in the agent loop.
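The two-phase promotion reduces to a few lines of state handling. This sketch mirrors the fields and hook behavior described above but is not the actual HandoffPlugin code:

```rust
/// Illustrative handoff state, following the documented fields.
#[derive(Default, Debug, PartialEq)]
struct HandoffState {
    active_agent: Option<String>,    // None = base agent
    requested_agent: Option<String>, // pending request
}

impl HandoffState {
    /// What HandoffSyncHook does at RunStart / StepEnd: promote a pending
    /// request to the active variant and clear the request.
    fn sync(&mut self) {
        if let Some(requested) = self.requested_agent.take() {
            self.active_agent = Some(requested);
        }
    }

    /// Effective agent: a pending request wins over the active variant;
    /// None means the base agent is active.
    fn effective_agent(&self) -> Option<&String> {
        self.requested_agent.as_ref().or(self.active_agent.as_ref())
    }
}
```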

Handoff vs Delegation

Aspect | Handoff | Delegation
Thread | Same thread, same run | Spawns a sub-agent on a separate thread
State | Shared – overlays modify the current agent in-place | Isolated – delegate has its own state
Use case | Switching personas or tool sets mid-conversation | Offloading a self-contained subtask
Overhead | Zero – no run restart | Higher – new run lifecycle

Use handoff when you want the agent to change behavior while retaining conversational context. Use delegation when the subtask is independent and the delegate should not see or modify the parent’s state.

Common Errors

Error | Cause | Fix
Overlay not applied | Variant name in request_handoff does not match a key in the overlays map | Ensure the string matches exactly
StateError::KeyAlreadyRegistered | Another plugin registers the ActiveAgentKey | Only one HandoffPlugin should be registered per runtime
Hook not firing | Plugin not in the agent’s active hook filter | Add "agent_handoff" to active_hook_filter or leave the filter empty

Key Files

  • crates/awaken-runtime/src/extensions/handoff/mod.rs – module root and public exports
  • crates/awaken-runtime/src/extensions/handoff/plugin.rs – HandoffPlugin implementation
  • crates/awaken-runtime/src/extensions/handoff/types.rs – AgentOverlay struct
  • crates/awaken-runtime/src/extensions/handoff/state.rs – HandoffState and ActiveAgentKey
  • crates/awaken-runtime/src/extensions/handoff/action.rs – HandoffAction and helper functions

Use Deferred Tools

Use this when your agent has many tools and you want to reduce context window usage by hiding tool schemas from the LLM until they are needed. The deferred-tools plugin classifies tools as Eager (always sent) or Deferred (hidden until requested). A ToolSearch tool lets the LLM discover deferred tools on demand.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • The awaken-ext-deferred-tools crate
[dependencies]
awaken-ext-deferred-tools = { version = "0.1" }
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Create the plugin and register it.

Collect all tool descriptors your agent exposes, then pass them to DeferredToolsPlugin::new. The plugin uses these to classify tools and populate the deferred registry at activation time.

use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken_ext_deferred_tools::DeferredToolsPlugin;

// Collect descriptors from all tools registered on the agent.
let seed_tools = vec![
    weather_tool.descriptor(),
    search_tool.descriptor(),
    debug_tool.descriptor(),
    // ... all tool descriptors
];
let agent_spec = AgentSpec::new("deferred-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Search for tools only when needed.")
    .with_hook_filter("ext-deferred-tools");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin(
        "ext-deferred-tools",
        Arc::new(DeferredToolsPlugin::new(seed_tools)) as Arc<dyn Plugin>,
    )
    .build()
    .expect("failed to build runtime");
  2. Configure tool loading rules.

Set rules on the agent spec via the deferred_tools config key. Rules are evaluated in order and first match wins. Tools that match no rule fall back to default_mode.

use awaken_ext_deferred_tools::{
    DeferredToolsConfig, DeferralRule, ToolLoadMode,
};

let config = DeferredToolsConfig {
    rules: vec![
        DeferralRule { tool: "get_weather".into(), mode: ToolLoadMode::Eager },
        DeferralRule { tool: "debug_*".into(), mode: ToolLoadMode::Deferred },
    ],
    default_mode: ToolLoadMode::Deferred,
    ..Default::default()
};

The tool field supports exact names and glob patterns (via wildcard_match). Common patterns:

Pattern | Matches
get_weather | Exact tool ID
debug_* | Any tool starting with debug_
mcp__github__* | All GitHub MCP tools
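The first-match-wins classification can be sketched as follows. This is simplified: only exact names and trailing-`*` prefixes are handled, whereas the real crate uses wildcard_match:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToolLoadMode {
    Eager,
    Deferred,
}

struct DeferralRule {
    pattern: &'static str,
    mode: ToolLoadMode,
}

// Simplified pattern check: exact name or trailing-'*' prefix glob.
fn matches_pattern(pattern: &str, tool_id: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix('*') {
        tool_id.starts_with(prefix)
    } else {
        pattern == tool_id
    }
}

/// Rules are evaluated in order; the first match wins. Tools matching
/// no rule fall back to `default_mode`.
fn classify(tool_id: &str, rules: &[DeferralRule], default_mode: ToolLoadMode) -> ToolLoadMode {
    rules
        .iter()
        .find(|r| matches_pattern(r.pattern, tool_id))
        .map(|r| r.mode)
        .unwrap_or(default_mode)
}
```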
  3. Understand auto-enable.

The enabled field on DeferredToolsConfig controls activation:

Value | Behavior
Some(true) | Always enable deferred tools
Some(false) | Always disable
None (default) | Auto-enable when total token savings across deferred tools exceeds beta_overhead (default 1136 tokens)

With auto-enable, the plugin estimates per-tool schema cost as len(parameters_json) / 4 tokens and sums the savings for all deferrable tools. If the total exceeds the overhead of maintaining the ToolSearch tool and deferred tool list in context, deferral activates automatically.
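The auto-enable arithmetic from the paragraph above, as a sketch (the function names are hypothetical):

```rust
/// Per-tool schema cost estimate: len(parameters_json) / 4 tokens.
fn estimate_schema_tokens(parameters_json: &str) -> f64 {
    parameters_json.len() as f64 / 4.0
}

/// Auto-enable decision: deferral activates when the summed savings
/// across all deferrable tool schemas exceed beta_overhead.
fn auto_enable(deferrable_schemas: &[&str], beta_overhead: f64) -> bool {
    let savings: f64 = deferrable_schemas
        .iter()
        .map(|schema| estimate_schema_tokens(schema))
        .sum();
    savings > beta_overhead
}
```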

  4. Understand how ToolSearch works.

The plugin automatically registers a ToolSearch tool. The LLM calls it with a query string to find deferred tools:

Query format | Behavior
"select:Tool1,Tool2" | Fetch specific tools by exact ID
"+required rest terms" | Require a keyword, rank by remaining terms
"plain keywords" | General keyword search across id, name, description

When ToolSearch returns results, matched tools are promoted to Eager for the remainder of the session. The tool returns up to max_results (default 5) matching tool schemas in a <functions> block.

LLM: I need to check the weather. Let me search for relevant tools.
     -> calls ToolSearch { query: "weather forecast" }

ToolSearch returns matching schemas, promotes get_weather to Eager.
Next inference: get_weather schema is included in the tool list.
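The three query forms can be sketched as a small parser. This is illustrative only; the crate's parser in tool_search.rs may differ in details:

```rust
#[derive(Debug, PartialEq)]
enum Query {
    /// "select:Tool1,Tool2" - fetch specific tools by exact ID.
    Select(Vec<String>),
    /// "+required rest terms" - required keyword plus ranking terms.
    Ranked { required: String, rest: Vec<String> },
    /// "plain keywords" - general keyword search.
    Keywords(Vec<String>),
}

fn parse_query(q: &str) -> Query {
    if let Some(list) = q.strip_prefix("select:") {
        return Query::Select(list.split(',').map(|s| s.trim().to_string()).collect());
    }
    if let Some(rest) = q.strip_prefix('+') {
        let mut terms = rest.split_whitespace().map(str::to_string);
        let required = terms.next().unwrap_or_default();
        return Query::Ranked { required, rest: terms.collect() };
    }
    Query::Keywords(q.split_whitespace().map(str::to_string).collect())
}
```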
  5. Configure automatic re-deferral (DiscBeta).

The plugin tracks per-tool usage statistics using a discounted Beta distribution. Tools that were promoted via ToolSearch but stop being used get automatically re-deferred. This is adaptive and requires no manual tuning.

Re-deferral triggers when all of the following are true:

  • The tool is currently Eager (was promoted from Deferred)
  • The tool is not configured as always-Eager in rules
  • The tool has been idle for at least defer_after turns
  • The posterior upper credible interval (90%) falls below breakeven_p * thresh_mult

The breakeven frequency is (c - c_bar) / gamma, where c is the full schema cost and c_bar is the name-only cost. A tool is worth keeping eager only if it is used often enough that the savings from avoiding ToolSearch calls exceed the per-turn overhead.
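The breakeven test reduces to the following sketch, where `upper_credible` stands for the 90% posterior upper credible interval computed elsewhere (function names are hypothetical):

```rust
/// Breakeven usage frequency: (c - c_bar) / gamma, where c is the full
/// schema token cost, c_bar the name-only cost, and gamma the estimated
/// token cost of a ToolSearch call.
fn breakeven_p(c: f64, c_bar: f64, gamma: f64) -> f64 {
    (c - c_bar) / gamma
}

/// A promoted tool is re-deferred when its posterior upper credible
/// interval falls below breakeven_p * thresh_mult (idle-turn and
/// always-Eager conditions omitted for brevity).
fn should_redefer(upper_credible: f64, c: f64, c_bar: f64, gamma: f64, thresh_mult: f64) -> bool {
    upper_credible < breakeven_p(c, c_bar, gamma) * thresh_mult
}
```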

Key parameters in DiscBetaParams:

Parameter | Default | Purpose
omega | 0.95 | Discount factor per turn. Effective memory is approximately 1/(1-omega) = 20 turns
n0 | 5.0 | Prior strength in equivalent observations
defer_after | 5 | Minimum idle turns before considering re-deferral
thresh_mult | 0.5 | Multiplier on breakeven frequency for the deferral threshold
gamma | 2000.0 | Estimated token cost of a ToolSearch call

These live under DeferredToolsConfig.disc_beta:

use awaken_ext_deferred_tools::{DeferredToolsConfig, DiscBetaParams};

let config = DeferredToolsConfig {
    disc_beta: DiscBetaParams {
        omega: 0.95,
        n0: 5.0,
        defer_after: 5,
        thresh_mult: 0.5,
        gamma: 2000.0,
    },
    beta_overhead: 1136.0,
    ..Default::default()
};
  6. Enable cross-session learning.

Via AgentToolPriors (a ProfileKey), usage frequencies persist across sessions using EWMA (exponentially weighted moving average). At session end, the PersistPriorsHook writes per-tool presence frequencies to the profile store. At next session start, the LoadPriorsHook reads them back and initializes the Beta distribution with learned priors instead of the default 0.01.

This requires a ProfileStore to be configured on the runtime. No additional code is needed beyond the plugin registration — the hooks are wired automatically.

The EWMA smoothing factor is lambda = max(0.1, 1/(n+1)), where n is the session count. Early sessions contribute equally; after 10 sessions the factor stabilizes at 0.1, giving 90% weight to historical data.
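The EWMA update can be sketched as follows (hypothetical helper names; the real logic lives in the LoadPriorsHook / PersistPriorsHook):

```rust
/// Smoothing factor: lambda = max(0.1, 1/(n+1)), where n is the number
/// of completed sessions. Early sessions act like a running mean; after
/// enough sessions it stabilizes at 0.1.
fn ewma_lambda(session_count: u32) -> f64 {
    (1.0 / (session_count as f64 + 1.0)).max(0.1)
}

/// Blend the newly observed per-tool frequency into the stored prior.
fn update_prior(old: f64, observed: f64, session_count: u32) -> f64 {
    let lambda = ewma_lambda(session_count);
    lambda * observed + (1.0 - lambda) * old
}
```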

Verify

  1. Run the agent and trigger an inference. Check logs for the deferred_tools.list context message, which lists all deferred tool names.

  2. Read DeferralState from the runtime snapshot to see the current mode of each tool:

use awaken_ext_deferred_tools::state::{DeferralState, DeferralStateValue};

let state: &DeferralStateValue = snapshot.state::<DeferralState>()
    .expect("DeferralState not found");

for (tool_id, mode) in &state.modes {
    println!("{tool_id}: {mode:?}");
}
  3. Ask the LLM a question that requires a deferred tool. Confirm ToolSearch is called and the tool is promoted to Eager in subsequent turns.

  4. After several turns of inactivity, verify re-deferral by checking that the tool reverts to Deferred mode in the snapshot.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| All tools sent to LLM (no deferral) | enabled: Some(false) or total savings below beta_overhead | Set enabled: Some(true) or add more tools so savings exceed overhead |
| Plugin registers but no tools deferred | All rules resolve to Eager | Set default_mode: ToolLoadMode::Deferred or add Deferred rules |
| ToolSearch not available to LLM | Plugin not registered | Register DeferredToolsPlugin with seed tool descriptors |
| Tools never re-deferred | defer_after too high or tool usage is frequent | Lower defer_after or increase thresh_mult |
| Cross-session priors not loading | No ProfileStore configured | Wire a profile store into the runtime |
| ToolSearch returns no results | Tool not in deferred registry | Check that the tool was in the seed_tools list passed to the plugin |

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-ext-deferred-tools/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-deferred-tools/src/config.rs | DeferredToolsConfig, DeferralRule, ToolLoadMode, DiscBetaParams |
| crates/awaken-ext-deferred-tools/src/plugin/plugin.rs | DeferredToolsPlugin registration |
| crates/awaken-ext-deferred-tools/src/plugin/hooks.rs | Phase hooks (BeforeInference, AfterToolExecute, AfterInference, RunStart, RunEnd) |
| crates/awaken-ext-deferred-tools/src/tool_search.rs | ToolSearchTool implementation and query parsing |
| crates/awaken-ext-deferred-tools/src/policy.rs | ConfigOnlyPolicy and DiscBetaEvaluator |
| crates/awaken-ext-deferred-tools/src/state.rs | State keys: DeferralState, DeferralRegistry, DiscBetaState, ToolUsageStats, AgentToolPriors |

Serve & Integrate

This path is for turning a local runtime into something other systems can call.

Start here

  1. Expose HTTP with SSE to put the runtime behind HTTP and streaming endpoints.
  2. Integrate AI SDK Frontend for React clients that speak AI SDK v6.
  3. Integrate CopilotKit (AG-UI) for CopilotKit frontends.


Expose HTTP with SSE

Use this when you need to serve agents over HTTP with Server-Sent Events streaming, supporting multiple protocol adapters (AI SDK, AG-UI, A2A).

Prerequisites

  • awaken crate with the server feature enabled
  • tokio with rt-multi-thread and signal features
  • A built AgentRuntime

Steps

  1. Add the dependency.
[dependencies]
awaken = { package = "awaken-agent", version = "...", features = ["server"] }
tokio = { version = "1", features = ["rt-multi-thread", "macros", "signal"] }
  2. Build the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::{AgentRuntimeBuilder, AgentSpec, ModelSpec};
use awaken::stores::InMemoryStore;

let store = Arc::new(InMemoryStore::new());

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(
        AgentSpec::new("assistant")
            .with_model("claude-sonnet")
            .with_system_prompt("You are a helpful assistant."),
    )
    .with_tool("search", Arc::new(SearchTool))
    .with_provider("anthropic", Arc::new(GenaiExecutor::new()))
    .with_model("claude-sonnet", ModelSpec {
        id: "claude-sonnet".into(),
        provider: "anthropic".into(),
        model: "claude-sonnet-4-20250514".into(),
    })
    .with_thread_run_store(store.clone())
    .build()?;

let runtime = Arc::new(runtime);
  3. Create the application state.
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::stores::InMemoryMailboxStore;

let mailbox_store = Arc::new(InMemoryMailboxStore::new());
let mailbox = Arc::new(Mailbox::new(
    runtime.clone(),
    mailbox_store,
    "default-consumer".to_string(),
    MailboxConfig::default(),
));

let state = AppState::new(
    runtime.clone(),
    mailbox,
    store,
    runtime.resolver_arc(),
    ServerConfig::default(),
);

ServerConfig::default() binds to 0.0.0.0:3000 with an SSE buffer size of 64.

  4. Build the router.
use awaken::server::routes::build_router;

let app = build_router().with_state(state);

build_router registers all route groups:

  • /health – health check
  • /v1/threads – thread CRUD and messaging
  • /v1/runs and /v1/threads/:id/runs – run APIs
  • /v1/config/* and /v1/capabilities – config and capabilities APIs
  • Protocol adapters: AI SDK v6, AG-UI, A2A, MCP
  5. Start the server.
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
axum::serve(listener, app).await?;
  6. Configure SSE buffer size.
let config = ServerConfig {
    address: "0.0.0.0:8080".into(),
    sse_buffer_size: 128,
    ..ServerConfig::default()
};

Increase sse_buffer_size if clients consume events slowly and you observe dropped messages.

Verify

curl http://localhost:3000/health

Should return 200 OK. Then create a thread and run:

curl -X POST http://localhost:3000/v1/threads \
  -H "Content-Type: application/json" \
  -d '{}'

Common Errors

| Error | Cause | Fix |
|---|---|---|
| Address already in use | Port 3000 is occupied | Change the bind address in ServerConfig or TcpListener::bind |
| SSE stream closes immediately | Client does not support text/event-stream | Use curl with --no-buffer or an SSE-compatible client |
| Missing protocol routes | Feature flags not enabled | Ensure the server feature is enabled on the awaken crate |

crates/awaken-server/tests/run_api.rs – integration tests demonstrating thread creation, run execution, and SSE streaming.

Key Files

  • crates/awaken-server/src/app.rs – AppState, ServerConfig
  • crates/awaken-server/src/routes.rs – build_router and route definitions
  • crates/awaken-server/src/http_sse.rs – SSE response helpers
  • crates/awaken-server/src/mailbox.rs – Mailbox run queue

Integrate AI SDK Frontend

Use this when you have a Vercel AI SDK (v6) React frontend and need to connect it to an awaken agent server.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature server enabled on the awaken crate
  • Node.js project with @ai-sdk/react installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"

Steps

  1. Build the backend server.
use std::sync::Arc;

use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt().with_target(true).init();

    let agent_spec = AgentSpec::new("my-agent")
        .with_model("gpt-4o-mini")
        .with_system_prompt("You are a helpful assistant.")
        .with_max_rounds(10);

    let runtime = AgentRuntimeBuilder::new()
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model(
            "gpt-4o-mini",
            ModelSpec {
                id: "gpt-4o-mini".into(),
                provider: "openai".into(),
                model: "gpt-4o-mini".into(),
            },
        )
        .with_agent_spec(agent_spec)
        .build()
        .expect("failed to build runtime");
    let runtime = Arc::new(runtime);

    let store = Arc::new(InMemoryStore::new());
    let resolver = runtime.resolver_arc();

    let mailbox_store = Arc::new(InMemoryMailboxStore::new());
    let mailbox = Arc::new(Mailbox::new(
        runtime.clone(),
        mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
        format!("ai-sdk:{}", std::process::id()),
        MailboxConfig::default(),
    ));

    let state = AppState::new(
        runtime,
        mailbox,
        store as Arc<dyn ThreadRunStore>,
        resolver,
        ServerConfig {
            address: "127.0.0.1:3000".into(),
            ..Default::default()
        },
    );

    let app = build_router().with_state(state);

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .expect("failed to bind");
    axum::serve(listener, app).await.expect("server crashed");
}

The server automatically registers AI SDK v6 routes at:

  • POST /v1/ai-sdk/chat – create a new run and stream events
  • GET /v1/ai-sdk/chat/:thread_id/stream – resume an existing stream by thread ID
  • GET /v1/ai-sdk/threads/:thread_id/stream – alias for thread-based resume
  • GET /v1/ai-sdk/threads/:id/messages – retrieve thread messages
  2. Connect the React frontend.

    Install the AI SDK React package:

npm install ai @ai-sdk/react

Use the useChat hook pointed at your awaken server:

import { useChat } from "@ai-sdk/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "http://localhost:3000/v1/ai-sdk/chat",
    id: "thread-1",
  });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
  3. Run both sides.
# Terminal 1: backend
cargo run

# Terminal 2: frontend
npm run dev

Verify

  1. Open the frontend in a browser.
  2. Send a message.
  3. Confirm that streaming text appears incrementally.
  4. Check the backend logs for RunStart and RunFinish events.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| useChat receives no events | Wrong endpoint URL | Confirm the api prop points to /v1/ai-sdk/chat |
| stream closed unexpectedly | SSE buffer overflow | Increase sse_buffer_size in ServerConfig |
| 404 on /v1/ai-sdk/chat | Missing server feature | Enable features = ["server"] in Cargo.toml |
  • examples/ai-sdk-starter/agent/src/main.rs

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-server/src/protocols/ai_sdk_v6/http.rs | AI SDK v6 route handlers |
| crates/awaken-server/src/protocols/ai_sdk_v6/encoder.rs | AI SDK v6 SSE event encoder |
| crates/awaken-server/src/routes.rs | Unified router builder |
| crates/awaken-server/src/app.rs | AppState and ServerConfig |
| examples/ai-sdk-starter/agent/src/main.rs | Backend entry for the AI SDK starter |

Integrate CopilotKit (AG-UI)

Use this when you have a CopilotKit React frontend and need to connect it to an awaken agent server via the AG-UI protocol.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature server enabled on the awaken crate
  • Node.js project with @copilotkit/react-core and @copilotkit/react-ui installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"

Steps

  1. Build the backend server.
use std::sync::Arc;

use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt().with_target(true).init();

    let agent_spec = AgentSpec::new("copilotkit-agent")
        .with_model("gpt-4o-mini")
        .with_system_prompt(
            "You are a CopilotKit-powered assistant. \
             Update shared state and suggest actions when appropriate.",
        )
        .with_max_rounds(10);

    let runtime = AgentRuntimeBuilder::new()
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model(
            "gpt-4o-mini",
            ModelSpec {
                id: "gpt-4o-mini".into(),
                provider: "openai".into(),
                model: "gpt-4o-mini".into(),
            },
        )
        .with_agent_spec(agent_spec)
        .build()
        .expect("failed to build runtime");
    let runtime = Arc::new(runtime);

    let store = Arc::new(InMemoryStore::new());
    let resolver = runtime.resolver_arc();

    let mailbox_store = Arc::new(InMemoryMailboxStore::new());
    let mailbox = Arc::new(Mailbox::new(
        runtime.clone(),
        mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
        format!("copilotkit:{}", std::process::id()),
        MailboxConfig::default(),
    ));

    let state = AppState::new(
        runtime,
        mailbox,
        store as Arc<dyn ThreadRunStore>,
        resolver,
        ServerConfig {
            address: "127.0.0.1:3000".into(),
            ..Default::default()
        },
    );

    let app = build_router().with_state(state);

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .expect("failed to bind");
    axum::serve(listener, app).await.expect("server crashed");
}

The server automatically registers AG-UI routes at:

  • POST /v1/ag-ui/run – create a new run and stream AG-UI events
  • POST /v1/ag-ui/threads/:thread_id/runs – start a thread-scoped run
  • POST /v1/ag-ui/agents/:agent_id/runs – start an agent-scoped run
  • POST /v1/ag-ui/threads/:thread_id/interrupt – interrupt a running thread
  • GET /v1/ag-ui/threads/:id/messages – retrieve thread messages
  2. Connect the CopilotKit frontend.

    Install CopilotKit packages:

npm install @copilotkit/react-core @copilotkit/react-ui

Wrap your app with the CopilotKit provider:

import { CopilotKit } from "@copilotkit/react-core";
import { CopilotChat } from "@copilotkit/react-ui";
import "@copilotkit/react-ui/styles.css";

export default function App() {
  return (
    <CopilotKit runtimeUrl="http://localhost:3000/v1/ag-ui">
      <CopilotChat
        labels={{ title: "Agent", initial: "How can I help?" }}
      />
    </CopilotKit>
  );
}
  3. Run both sides.
# Terminal 1: backend
cargo run

# Terminal 2: frontend
npm run dev

Verify

  1. Open the frontend in a browser.
  2. Send a message through the CopilotChat widget.
  3. Confirm streaming text appears in the chat UI.
  4. Check the backend logs for RunStart and RunFinish events.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| CopilotKit shows “connection failed” | Wrong runtimeUrl | Confirm it points to http://localhost:3000/v1/ag-ui |
| Events arrive but UI does not update | AG-UI event format mismatch | Ensure the CopilotKit version is compatible with the AG-UI protocol |
| 404 on /v1/ag-ui/run | Missing server feature | Enable features = ["server"] in Cargo.toml |
  • examples/copilotkit-starter/agent/src/main.rs

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-server/src/protocols/ag_ui/http.rs | AG-UI route handlers |
| crates/awaken-server/src/protocols/ag_ui/encoder.rs | AG-UI SSE event encoder |
| crates/awaken-server/src/routes.rs | Unified router builder |
| crates/awaken-server/src/app.rs | AppState and ServerConfig |
| examples/copilotkit-starter/agent/src/main.rs | Backend entry for the CopilotKit starter |

State & Storage

This path is for teams moving beyond stateless demos.

Use this section to decide

  • where thread and run data should live
  • how state is keyed and merged
  • how much context should reach the model each turn
  1. Use File Store or Use Postgres Store to choose a persistence backend.
  2. Read State Keys and Thread Model to understand state layout and lifecycle.
  3. See Optimize the Context Window when context size starts to matter.

Use File Store

Use this when you need file-based persistence for threads, runs, and messages without an external database.

Prerequisites

  • awaken-stores crate with the file feature enabled

Steps

  1. Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["file"] }

Or, if you use the awaken facade crate (which re-exports awaken-stores), still add awaken-stores directly so you can enable the feature flag:

[dependencies]
awaken = { package = "awaken-agent", version = "..." }
awaken-stores = { version = "...", features = ["file"] }
  2. Create a FileStore.
use std::sync::Arc;
use awaken::stores::FileStore;

let store = Arc::new(FileStore::new("./data"));

The directory is created automatically on first write. The layout is:

./data/
  threads/<thread_id>.json
  messages/<thread_id>.json
  runs/<run_id>.json
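The layout above maps each ID straight to a JSON path, which is why IDs containing path characters fail (see Common Errors). A minimal sketch of that mapping, with illustrative names rather than the real FileStore internals:

```rust
use std::path::PathBuf;

/// Illustrative sketch of the file layout described above; the real
/// FileStore's path handling may differ.
fn thread_path(data_dir: &str, thread_id: &str) -> Option<PathBuf> {
    // Reject IDs that could escape the data directory.
    let ok = !thread_id.is_empty()
        && !thread_id.contains('/')
        && !thread_id.contains('\\')
        && !thread_id.contains("..");
    if !ok {
        return None;
    }
    Some(
        PathBuf::from(data_dir)
            .join("threads")
            .join(format!("{thread_id}.json")),
    )
}

fn main() {
    assert!(thread_path("./data", "../etc").is_none());
    assert_eq!(
        thread_path("./data", "t1").unwrap(),
        PathBuf::from("./data/threads/t1.json")
    );
}
```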
  3. Wire it into the runtime.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_thread_run_store(store)
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;
  4. Use an absolute path for production.
use std::path::PathBuf;

let data_dir = PathBuf::from("/var/lib/myapp/awaken");
let store = Arc::new(FileStore::new(data_dir));

Verify

Run the agent, then inspect the data directory. You should see JSON files under threads/, messages/, and runs/ corresponding to the thread and run IDs used.

Common Errors

| Error | Cause | Fix |
|---|---|---|
| StorageError::Io | Permission denied on the data directory | Ensure the process has read/write access to the path |
| StorageError::Io with empty ID | Thread or run ID contains invalid characters (/, \, ..) | Use simple alphanumeric or UUID-style IDs |
| Missing data after restart | Using a relative path that resolved differently | Use an absolute path |

crates/awaken-stores/src/file.rs – FileStore implementation with filesystem layout details.

Key Files

  • crates/awaken-stores/Cargo.toml – feature flag definition
  • crates/awaken-stores/src/file.rs – FileStore
  • crates/awaken-stores/src/lib.rs – conditional re-export

Use Postgres Store

Use this when you need durable, multi-instance persistence backed by PostgreSQL.

Prerequisites

  • awaken-stores crate with the postgres feature enabled
  • A running PostgreSQL instance
  • sqlx runtime dependencies (tokio)

Steps

  1. Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["postgres"] }
  2. Create a connection pool.
use sqlx::PgPool;

let pool = PgPool::connect("postgres://user:pass@localhost:5432/mydb").await?;
  3. Create a PostgresStore.
use std::sync::Arc;
use awaken::stores::PostgresStore;

let store = Arc::new(PostgresStore::new(pool));

This uses default table names: awaken_threads, awaken_runs, awaken_messages.

  4. Use a custom table prefix.
let store = Arc::new(PostgresStore::with_prefix(pool, "myapp"));

This creates tables named myapp_threads, myapp_runs, myapp_messages.

  5. Wire it into the runtime.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_thread_run_store(store)
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;
  6. Schema creation.

    Tables are auto-created on first access via ensure_schema(). Each table uses:

  • id TEXT PRIMARY KEY
  • data JSONB NOT NULL
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT now()

No manual migration is required for initial setup.
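The columns above imply DDL of roughly the following shape. This is a sketch of the kind of statement ensure_schema() would issue with the prefix applied; the exact SQL lives in postgres.rs:

```rust
/// Sketch of the DDL the store's ensure_schema() would issue for one
/// table, given the configured prefix (illustrative, not the exact SQL).
fn create_table_sql(prefix: &str, suffix: &str) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS {prefix}_{suffix} (\n\
         id TEXT PRIMARY KEY,\n\
         data JSONB NOT NULL,\n\
         updated_at TIMESTAMPTZ NOT NULL DEFAULT now()\n\
         )"
    )
}

fn main() {
    let sql = create_table_sql("myapp", "threads");
    // with_prefix("myapp") namespaces the table as myapp_threads
    assert!(sql.contains("myapp_threads"));
    assert!(sql.contains("JSONB NOT NULL"));
}
```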

Verify

After running the agent, query the database:

SELECT id, updated_at FROM awaken_threads;
SELECT id, updated_at FROM awaken_runs;

You should see rows corresponding to the threads and runs created during execution.

Common Errors

| Error | Cause | Fix |
|---|---|---|
| sqlx::Error connection refused | PostgreSQL is not running or the connection string is wrong | Verify the DATABASE_URL and that the database is accepting connections |
| StorageError on first write | Insufficient database privileges | Grant CREATE TABLE and INSERT permissions to the database user |
| Table name collision | Another application uses the same default table names | Use PostgresStore::with_prefix to namespace tables |

crates/awaken-stores/src/postgres.rs – PostgresStore implementation with schema auto-creation.

Key Files

  • crates/awaken-stores/Cargo.toml – postgres feature flag and sqlx dependency
  • crates/awaken-stores/src/postgres.rs – PostgresStore
  • crates/awaken-stores/src/lib.rs – conditional re-export

Optimize the Context Window

Use this when you need to control how the runtime manages conversation history to stay within model token limits.

Prerequisites

  • awaken crate added to Cargo.toml
  • An agent configured with AgentSpec

ContextWindowPolicy

Every agent has a ContextWindowPolicy that controls how conversation history is managed. Set it on your AgentSpec:

use awaken::ContextWindowPolicy;

let policy = ContextWindowPolicy {
    max_context_tokens: 200_000,
    max_output_tokens: 16_384,
    min_recent_messages: 10,
    enable_prompt_cache: true,
    autocompact_threshold: Some(100_000),
    compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
    compaction_raw_suffix_messages: 2,
};

Fields

| Field | Type | Default | Description |
|---|---|---|---|
| max_context_tokens | usize | 200_000 | Model’s total context window size in tokens |
| max_output_tokens | usize | 16_384 | Tokens reserved for model output |
| min_recent_messages | usize | 10 | Minimum number of recent messages to always preserve, even if over budget |
| enable_prompt_cache | bool | true | Whether to enable prompt caching |
| autocompact_threshold | Option<usize> | None | Token count that triggers auto-compaction. None disables auto-compaction |
| compaction_mode | ContextCompactionMode | KeepRecentRawSuffix | Strategy used when auto-compaction fires |
| compaction_raw_suffix_messages | usize | 2 | Number of recent raw messages to preserve in suffix compaction mode |

Truncation

When the conversation exceeds the available token budget, the runtime automatically drops the oldest messages to fit. The budget is calculated as:

available = max_context_tokens - max_output_tokens - tool_schema_tokens

What truncation preserves

  • System messages are never truncated. All system messages at the start of the history survive regardless of budget.
  • Recent messages – at least min_recent_messages history messages are kept, even if they exceed the budget.
  • Tool call/result pairs – the split point is adjusted so that an assistant message with tool calls is never separated from its corresponding tool result messages.
  • Dangling tool calls – after truncation, any orphaned tool calls (whose results were dropped) are patched to prevent invalid message sequences.
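The budget formula and preservation rules above can be sketched as follows. This is a simplified illustration (it keeps leading system messages and a recent suffix, dropping the oldest history to fit); the real implementation additionally keeps tool call/result pairs together and patches dangling calls:

```rust
#[derive(Clone, Copy, PartialEq)]
enum Role { System, User, Assistant }

struct Msg { role: Role, tokens: usize }

/// Simplified sketch of the truncation rule: keep all leading system
/// messages and at least `min_recent` recent messages, then drop the
/// oldest history until the remaining messages fit the budget.
/// Returns the indices of the messages that are kept.
fn truncate(msgs: &[Msg], available: usize, min_recent: usize) -> Vec<usize> {
    let sys_end = msgs.iter().take_while(|m| m.role == Role::System).count();
    let sys_cost: usize = msgs[..sys_end].iter().map(|m| m.tokens).sum();
    let history = &msgs[sys_end..];
    let mut kept = Vec::new();
    let mut cost = sys_cost;
    // Walk backwards: always take min_recent, then stop at the budget.
    for (i, m) in history.iter().enumerate().rev() {
        let recent = history.len() - i <= min_recent;
        if recent || cost + m.tokens <= available {
            cost += m.tokens;
            kept.push(sys_end + i);
        } else {
            break; // everything older is dropped
        }
    }
    (0..sys_end).chain(kept.into_iter().rev()).collect()
}

fn main() {
    let msgs = vec![
        Msg { role: Role::System, tokens: 10 },
        Msg { role: Role::User, tokens: 50 },
        Msg { role: Role::Assistant, tokens: 50 },
        Msg { role: Role::User, tokens: 40 },
    ];
    // System (10) + last two messages (90) fit in 100 tokens;
    // the oldest user message would overflow and is dropped.
    assert_eq!(truncate(&msgs, 100, 2), vec![0, 2, 3]);
}
```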

Artifact compaction

Before truncation runs, oversized tool results are compacted automatically. A tool result whose text exceeds ARTIFACT_COMPACT_THRESHOLD_TOKENS (2048 tokens, estimated at ~8192 characters) is truncated to a preview of at most 1600 characters or 24 lines, whichever is shorter. The preview includes a compaction indicator showing the original size.

Non-tool messages (system, user, assistant) are never subject to artifact compaction.
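The threshold and preview limits above translate into a simple rule. A sketch under the stated character estimate (~4 chars/token, so 2048 tokens is about 8192 characters); constant names and the indicator text are illustrative:

```rust
const COMPACT_THRESHOLD_CHARS: usize = 8192; // ~2048 tokens at ~4 chars/token
const PREVIEW_MAX_CHARS: usize = 1600;
const PREVIEW_MAX_LINES: usize = 24;

/// Sketch of the artifact-compaction rule described above (illustrative).
fn compact_tool_result(text: &str) -> String {
    if text.len() <= COMPACT_THRESHOLD_CHARS {
        return text.to_string();
    }
    // Keep at most 24 lines, then cap at 1600 characters, whichever is shorter.
    let by_lines: String = text
        .lines()
        .take(PREVIEW_MAX_LINES)
        .collect::<Vec<_>>()
        .join("\n");
    let cut = by_lines
        .char_indices()
        .nth(PREVIEW_MAX_CHARS)
        .map(|(i, _)| i)
        .unwrap_or(by_lines.len());
    // Append an indicator showing the original size.
    format!("{}\n[compacted: original {} chars]", &by_lines[..cut], text.len())
}

fn main() {
    assert_eq!(compact_tool_result("short"), "short");
    let big = "x".repeat(10_000);
    let out = compact_tool_result(&big);
    assert!(out.len() < 2_000);
    assert!(out.contains("compacted"));
}
```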

Compaction

Compaction summarizes older conversation history into a condensed summary message, reducing token usage while preserving context. Unlike truncation (which drops messages), compaction replaces them with a summary.

Enabling auto-compaction

Set autocompact_threshold to trigger compaction when total message tokens exceed that value:

let policy = ContextWindowPolicy {
    autocompact_threshold: Some(100_000),
    compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
    compaction_raw_suffix_messages: 4,
    ..Default::default()
};

ContextCompactionMode

Two strategies are available:

  • KeepRecentRawSuffix (default) – keeps the most recent compaction_raw_suffix_messages messages as raw history. Everything before the compaction boundary is summarized.
  • CompactToSafeFrontier – compacts all messages up to the safe frontier (the latest point where all tool call/result pairs are complete).

The compaction boundary is chosen so that no tool call is separated from its result. The boundary finder walks the message history and only places boundaries where all open tool calls have been resolved.
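The boundary-finding walk can be sketched with a toy message model. This is a simplified illustration of the idea, not the runtime's actual types: track which tool call IDs are still awaiting results, and only record boundaries where that set is empty:

```rust
use std::collections::HashSet;

enum Msg {
    Text,                // plain system/user/assistant message
    ToolCalls(Vec<u32>), // assistant message issuing tool call ids
    ToolResult(u32),     // result for one tool call id
}

/// Simplified sketch of the boundary finder: a compaction boundary may
/// only be placed after positions where no tool call is unresolved.
fn safe_boundaries(msgs: &[Msg]) -> Vec<usize> {
    let mut open: HashSet<u32> = HashSet::new();
    let mut boundaries = Vec::new();
    for (i, m) in msgs.iter().enumerate() {
        match m {
            Msg::Text => {}
            Msg::ToolCalls(ids) => open.extend(ids.iter().copied()),
            Msg::ToolResult(id) => {
                open.remove(id);
            }
        }
        if open.is_empty() {
            boundaries.push(i + 1); // a cut after message i is safe
        }
    }
    boundaries
}

fn main() {
    let msgs = [
        Msg::Text,
        Msg::ToolCalls(vec![1, 2]),
        Msg::ToolResult(1),
        Msg::ToolResult(2),
        Msg::Text,
    ];
    // No boundary may fall between the tool calls and their results.
    assert_eq!(safe_boundaries(&msgs), vec![1, 4, 5]);
}
```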

CompactionConfig

The compaction subsystem is configured through CompactionConfig, stored in the agent spec’s sections["compaction"] and read via CompactionConfigKey:

use awaken::CompactionConfig;

let config = CompactionConfig {
    summarizer_system_prompt: "You are a conversation summarizer. \
        Preserve all key facts, decisions, tool results, and action items. \
        Be concise but complete.".into(),
    summarizer_user_prompt: "Summarize the following conversation:\n\n{messages}".into(),
    summary_max_tokens: Some(1024),
    summary_model: Some("claude-3-haiku".into()),
    min_savings_ratio: 0.3,
};
| Field | Type | Default | Description |
|---|---|---|---|
| summarizer_system_prompt | String | Conversation summarizer prompt | System prompt for the summarizer LLM call |
| summarizer_user_prompt | String | "Summarize...\n\n{messages}" | User prompt template; {messages} is replaced with the conversation transcript |
| summary_max_tokens | Option<u32> | None | Maximum tokens for the summary response |
| summary_model | Option<String> | None | Model for summarization (defaults to the agent’s model) |
| min_savings_ratio | f64 | 0.3 | Minimum token savings ratio (0.0-1.0) to accept a compaction |

The compaction pass only runs when the expected savings ratio exceeds min_savings_ratio. A minimum gain of 1024 tokens (MIN_COMPACTION_GAIN_TOKENS) is also required to justify the summarization LLM call.
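The two gates above combine into one acceptance check. A sketch assuming pre/post token estimates are available up front; the function name is illustrative:

```rust
const MIN_COMPACTION_GAIN_TOKENS: usize = 1024;

/// Sketch of the acceptance check described above: compaction runs only
/// if both the savings ratio and the absolute gain clear their floors.
fn should_compact(pre_tokens: usize, post_tokens: usize, min_savings_ratio: f64) -> bool {
    if pre_tokens == 0 || post_tokens >= pre_tokens {
        return false; // no savings at all
    }
    let gain = pre_tokens - post_tokens;
    let ratio = gain as f64 / pre_tokens as f64;
    ratio >= min_savings_ratio && gain >= MIN_COMPACTION_GAIN_TOKENS
}

fn main() {
    // 50% savings and a large absolute gain: accepted.
    assert!(should_compact(10_000, 5_000, 0.3));
    // Ratio too small: rejected.
    assert!(!should_compact(10_000, 9_500, 0.3));
    // Good ratio but under the 1024-token minimum gain: rejected.
    assert!(!should_compact(2_000, 1_200, 0.3));
}
```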

DefaultSummarizer

The built-in DefaultSummarizer reads prompts from CompactionConfig and supports cumulative summarization. When a previous summary exists, it asks the LLM to update the existing summary with new conversation rather than re-summarizing everything from scratch.

The transcript renderer filters out Visibility::Internal messages before sending to the summarizer, since system-injected context is re-injected each turn and should not be included in summaries.

Summary storage

Compaction summaries are stored as <conversation-summary> tagged internal system messages. On load, trim_to_compaction_boundary drops all messages before the latest summary message, so already-summarized history is never re-loaded into the context window.

Compaction boundaries are tracked durably via CompactionState, recording the summary text, pre/post token counts, and timestamp for each compaction event.

Truncation recovery

When the LLM stops due to MaxTokens with incomplete tool calls (argument JSON was truncated mid-generation), the runtime can automatically retry by injecting a continuation prompt asking the model to break its work into smaller pieces and continue. The retry count is tracked by TruncationState and bounded by a configurable maximum.

Key Files

  • crates/awaken-contract/src/contract/inference.rs – ContextWindowPolicy, ContextCompactionMode
  • crates/awaken-runtime/src/context/transform/mod.rs – ContextTransform (truncation)
  • crates/awaken-runtime/src/context/transform/compaction.rs – artifact compaction
  • crates/awaken-runtime/src/context/compaction.rs – boundary finding, load-time trimming
  • crates/awaken-runtime/src/context/summarizer.rs – ContextSummarizer, DefaultSummarizer
  • crates/awaken-runtime/src/context/plugin.rs – CompactionPlugin, CompactionConfig, CompactionState
  • crates/awaken-runtime/src/context/truncation.rs – TruncationState, continuation prompts

State Keys

The state system provides typed, scoped, persistent key-value storage for agent runs. Plugins and tools define state keys at compile time; the runtime manages snapshots, persistence, and parallel merge semantics.

StateKey trait

Every state slot is identified by a type implementing StateKey.

pub trait StateKey: 'static + Send + Sync {
    /// Unique string identifier for serialization.
    const KEY: &'static str;

    /// Parallel merge strategy. Default: `Exclusive`.
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;

    /// Lifetime scope. Default: `Run`.
    const SCOPE: KeyScope = KeyScope::Run;

    /// The stored value type.
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;

    /// The update command type.
    type Update: Send + 'static;

    /// Apply an update to the current value.
    fn apply(value: &mut Self::Value, update: Self::Update);

    /// Serialize value to JSON. Default uses serde_json.
    fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;

    /// Deserialize value from JSON. Default uses serde_json.
    fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}

Crate path: awaken::state::StateKey (re-exported at awaken::StateKey)

Example

struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "counter";
    type Value = usize;
    type Update = usize;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

KeyScope

Controls when a key’s value is cleared relative to run boundaries.

pub enum KeyScope {
    /// Cleared at run start (default).
    Run,
    /// Persists across runs on the same thread.
    Thread,
}

MergeStrategy

Determines how concurrent updates to the same key are handled when merging MutationBatches from parallel tool execution.

pub enum MergeStrategy {
    /// Concurrent writes to this key conflict. Parallel batches that both
    /// touch this key cannot be merged.
    Exclusive,
    /// Updates are commutative -- they can be applied in any order. Parallel
    /// batches that both touch this key will have their ops concatenated.
    Commutative,
}
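Why concatenation is safe for Commutative keys: commutative updates produce the same value regardless of application order, so two parallel batches can simply have their ops appended. A self-contained sketch using counter increments (as in the Counter example above):

```rust
/// Apply a sequence of increment updates to a counter value.
fn apply_all(initial: usize, updates: &[usize]) -> usize {
    updates.iter().fold(initial, |v, u| v + u)
}

fn main() {
    // Two batches produced by parallel tool executions.
    let batch_a = [1, 2];
    let batch_b = [10];
    // Concatenating in either order yields the same final value,
    // so the runtime can merge the batches without a conflict.
    let ab: Vec<usize> = batch_a.iter().chain(&batch_b).copied().collect();
    let ba: Vec<usize> = batch_b.iter().chain(&batch_a).copied().collect();
    assert_eq!(apply_all(0, &ab), apply_all(0, &ba));
}
```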

StateMap

A type-erased container for state values. Backed by TypedMap from the typedmap crate.

pub struct StateMap { /* ... */ }

Methods

/// Check if a key is present.
fn contains<K: StateKey>(&self) -> bool

/// Get a reference to the value.
fn get<K: StateKey>(&self) -> Option<&K::Value>

/// Get a mutable reference to the value.
fn get_mut<K: StateKey>(&mut self) -> Option<&mut K::Value>

/// Insert a value, replacing any existing one.
fn insert<K: StateKey>(&mut self, value: K::Value)

/// Remove and return the value.
fn remove<K: StateKey>(&mut self) -> Option<K::Value>

/// Get a mutable reference, inserting `Default::default()` if absent.
fn get_or_insert_default<K: StateKey>(&mut self) -> &mut K::Value

StateMap implements Clone and Default.
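The type-erased container idea can be illustrated with std alone. This is a minimal sketch of how a map keyed by marker types works (the real StateMap uses the typedmap crate and the full StateKey trait):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Minimal stand-in for the key trait: a marker type binds a value type.
trait Key: 'static {
    type Value: Clone + Default + 'static;
}

// Type-erased map: each key type stores one value under its TypeId.
#[derive(Default)]
struct Map(HashMap<TypeId, Box<dyn Any>>);

impl Map {
    fn insert<K: Key>(&mut self, value: K::Value) {
        self.0.insert(TypeId::of::<K>(), Box::new(value));
    }
    fn get<K: Key>(&self) -> Option<&K::Value> {
        self.0.get(&TypeId::of::<K>())?.downcast_ref()
    }
    fn get_or_insert_default<K: Key>(&mut self) -> &mut K::Value {
        self.0
            .entry(TypeId::of::<K>())
            .or_insert_with(|| Box::new(K::Value::default()))
            .downcast_mut()
            .expect("value stored under its own TypeId")
    }
}

struct Counter;
impl Key for Counter {
    type Value = usize;
}

fn main() {
    let mut map = Map::default();
    *map.get_or_insert_default::<Counter>() += 3;
    assert_eq!(map.get::<Counter>(), Some(&3));
}
```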

Snapshot

An immutable, versioned view of the state map. Passed to tools via ToolCallContext.

pub struct Snapshot {
    pub revision: u64,
    pub ext: Arc<StateMap>,
}

Methods

fn new(revision: u64, ext: Arc<StateMap>) -> Self
fn revision(&self) -> u64
fn get<K: StateKey>(&self) -> Option<&K::Value>
fn ext(&self) -> &StateMap

Note: Snapshot is not a type alias; it is a concrete struct wrapping Arc<StateMap>.

StateKeyOptions

Declarative options used when registering a state key with the runtime.

pub struct StateKeyOptions {
    /// Whether the key is persisted to the store. Default: `true`.
    pub persistent: bool,
    /// Whether the key survives plugin uninstall. Default: `false`.
    pub retain_on_uninstall: bool,
    /// Lifetime scope. Default: `KeyScope::Run`.
    pub scope: KeyScope,
}

PersistedState

Serialized form of the state map used by storage backends.

pub struct PersistedState {
    pub revision: u64,
    pub extensions: HashMap<String, JsonValue>,
}

Shared State (ProfileKey + StateScope)

For cross-boundary persistent state with dynamic scoping. Shared state uses ProfileKey – the same trait used for profile data – combined with a key: &str parameter for the runtime key dimension. Unlike StateKey, shared state is accessed asynchronously through ProfileAccess and does not participate in the snapshot/mutation workflow.

pub trait ProfileKey: 'static + Send + Sync {
    /// Namespace identifier (used as the storage namespace).
    const KEY: &'static str;

    /// The value type stored under this key.
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;

    /// Serialize value to JSON.
    fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;

    /// Deserialize value from JSON.
    fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}

The two dimensions for both shared and profile state are:

  • Namespace (ProfileKey::KEY) – compile-time, binds to a Value type
  • Key (&str parameter) – runtime, identifies which instance

For shared state, the key is typically a StateScope string; for profile state, it is an agent name or "system".

Crate path: awaken_contract::ProfileKey (re-exported from awaken_contract::contract::profile_store)

StateScope

pub struct StateScope(String);

Optional convenience builder for common key string patterns. Constructors:

| Method | Produced String | Use Case |
| --- | --- | --- |
| StateScope::global() | "global" | System-wide shared state |
| StateScope::parent_thread(id) | "parent_thread::{id}" | Parent-child agent sharing |
| StateScope::agent_type(name) | "agent_type::{name}" | Per-agent-type sharing |
| StateScope::thread(id) | "thread::{id}" | Thread-local persistent state |
| StateScope::new(s) | "{s}" | Arbitrary grouping |

Call .as_str() to get the key string. Users can also pass any raw &str directly.
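A minimal stand-in reproducing the string patterns from the table above (not the awaken type itself, which may differ in detail):

```rust
// Stand-in for awaken's StateScope: a newtype over the key string.
struct StateScope(String);

impl StateScope {
    fn global() -> Self { StateScope("global".into()) }
    fn parent_thread(id: &str) -> Self { StateScope(format!("parent_thread::{id}")) }
    fn agent_type(name: &str) -> Self { StateScope(format!("agent_type::{name}")) }
    fn thread(id: &str) -> Self { StateScope(format!("thread::{id}")) }
    fn new(s: &str) -> Self { StateScope(s.to_string()) }
    fn as_str(&self) -> &str { &self.0 }
}

fn main() {
    assert_eq!(StateScope::global().as_str(), "global");
    assert_eq!(StateScope::thread("t1").as_str(), "thread::t1");
    assert_eq!(StateScope::agent_type("coder").as_str(), "agent_type::coder");
}
```

Because the scope is just a string key, any raw &str works equally well; the constructors only standardize the common patterns.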

ProfileAccess Methods

ProfileAccess methods take key: &str for the runtime key dimension. The same methods serve both shared and profile state:

impl ProfileAccess {
    async fn read<K: ProfileKey>(&self, key: &str) -> Result<K::Value, StorageError>;
    async fn write<K: ProfileKey>(&self, key: &str, value: &K::Value) -> Result<(), StorageError>;
    async fn delete<K: ProfileKey>(&self, key: &str) -> Result<(), StorageError>;
}

// Shared state usage:
let scope = StateScope::global();
let value = access.read::<MyKey>(scope.as_str()).await?;
access.write::<MyKey>(scope.as_str(), &value).await?;

// Profile state usage:
let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".into()).await?;

Registration

impl PluginRegistrar {
    /// Register a profile key (used for both profile state and shared state).
    pub fn register_profile_key<K: ProfileKey>(&mut self) -> Result<(), StateError>;
}

Thread Model

Threads represent persistent conversations. A thread holds metadata only; messages are stored and loaded separately through the ThreadStore trait.

Thread

pub struct Thread {
    /// Unique thread identifier (UUID v7).
    pub id: String,
    /// Thread metadata.
    pub metadata: ThreadMetadata,
}

Crate path: awaken::contract::thread::Thread (re-exported from awaken-contract)

Constructors

/// Create with a generated UUID v7 identifier.
fn new() -> Self

/// Create with a specific identifier.
fn with_id(id: impl Into<String>) -> Self

Builder methods

fn with_title(self, title: impl Into<String>) -> Self

Thread implements Default (delegates to Thread::new()), Clone, Serialize, and Deserialize.

ThreadMetadata

pub struct ThreadMetadata {
    /// Creation timestamp (unix millis).
    pub created_at: Option<u64>,
    /// Last update timestamp (unix millis).
    pub updated_at: Option<u64>,
    /// Optional thread title.
    pub title: Option<String>,
    /// Custom metadata key-value pairs.
    pub custom: HashMap<String, Value>,
}

All Option fields are omitted from JSON when None. The custom map is omitted when empty.

ThreadMetadata implements Default, Clone, Serialize, and Deserialize.

Storage

Messages are not stored inside the Thread struct. They are managed through the ThreadStore trait:

#[async_trait]
pub trait ThreadStore: Send + Sync {
    async fn load_thread(&self, thread_id: &str) -> Result<Option<Thread>, StorageError>;
    async fn save_thread(&self, thread: &Thread) -> Result<(), StorageError>;
    async fn delete_thread(&self, thread_id: &str) -> Result<(), StorageError>;
    async fn list_threads(&self, offset: usize, limit: usize) -> Result<Vec<String>, StorageError>;
    async fn load_messages(&self, thread_id: &str) -> Result<Option<Vec<Message>>, StorageError>;
    async fn save_messages(&self, thread_id: &str, messages: &[Message]) -> Result<(), StorageError>;
    async fn delete_messages(&self, thread_id: &str) -> Result<(), StorageError>;
    async fn update_thread_metadata(&self, id: &str, metadata: ThreadMetadata) -> Result<(), StorageError>;
}
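The shape of the trait can be illustrated with a synchronous in-memory sketch. The Thread and Message types here are simplified stand-ins, and the async/error-handling details of the real trait are omitted:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Simplified stand-ins; the real types live in awaken-contract.
#[derive(Clone, PartialEq, Debug)]
struct Thread { id: String, title: Option<String> }
type Message = String;

// Threads and messages live in separate maps, mirroring the split
// between load_thread/save_thread and load_messages/save_messages.
struct MemoryThreadStore {
    threads: Mutex<HashMap<String, Thread>>,
    messages: Mutex<HashMap<String, Vec<Message>>>,
}

impl MemoryThreadStore {
    fn new() -> Self {
        MemoryThreadStore {
            threads: Mutex::new(HashMap::new()),
            messages: Mutex::new(HashMap::new()),
        }
    }
    fn save_thread(&self, thread: &Thread) {
        self.threads.lock().unwrap().insert(thread.id.clone(), thread.clone());
    }
    fn load_thread(&self, thread_id: &str) -> Option<Thread> {
        self.threads.lock().unwrap().get(thread_id).cloned()
    }
    fn save_messages(&self, thread_id: &str, msgs: &[Message]) {
        self.messages.lock().unwrap().insert(thread_id.to_string(), msgs.to_vec());
    }
    fn load_messages(&self, thread_id: &str) -> Option<Vec<Message>> {
        self.messages.lock().unwrap().get(thread_id).cloned()
    }
}

fn main() {
    let store = MemoryThreadStore::new();
    let t = Thread { id: "t1".into(), title: Some("demo".into()) };
    store.save_thread(&t);
    store.save_messages("t1", &["hello".to_string()]);
    assert_eq!(store.load_thread("t1"), Some(t));
    assert_eq!(store.load_messages("t1"), Some(vec!["hello".to_string()]));
}
```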

ThreadRunStore

Extends ThreadStore + RunStore with an atomic checkpoint operation.

#[async_trait]
pub trait ThreadRunStore: ThreadStore + RunStore + Send + Sync {
    /// Persist thread messages and run record atomically.
    async fn checkpoint(
        &self,
        thread_id: &str,
        messages: &[Message],
        run: &RunRecord,
    ) -> Result<(), StorageError>;
}

RunStore

Run record persistence for tracking run history and enabling resume.

#[async_trait]
pub trait RunStore: Send + Sync {
    async fn create_run(&self, record: &RunRecord) -> Result<(), StorageError>;
    async fn load_run(&self, run_id: &str) -> Result<Option<RunRecord>, StorageError>;
    async fn latest_run(&self, thread_id: &str) -> Result<Option<RunRecord>, StorageError>;
    async fn list_runs(&self, query: &RunQuery) -> Result<RunPage, StorageError>;
}

RunRecord

pub struct RunRecord {
    pub run_id: String,
    pub thread_id: String,
    pub agent_id: String,
    pub parent_run_id: Option<String>,
    pub status: RunStatus,
    pub termination_code: Option<String>,
    pub created_at: u64,        // unix seconds
    pub updated_at: u64,        // unix seconds
    pub steps: usize,
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub state: Option<PersistedState>,
}

Operate

This path is for hardening an agent service once the happy path already works.

  1. Enable Observability to make runs, tools, and providers visible.
  2. Enable Tool Permission HITL to add approval control over tool execution.
  3. Configure Stop Policies to keep agent loops bounded and predictable.
  4. Report Tool Progress and Testing Strategy to improve operator visibility and confidence.

Enable Tool Permission HITL

Use this when you need to control which tools an agent can invoke, with human-in-the-loop approval for sensitive operations.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature permission enabled on the awaken crate (enabled by default)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["permission"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the permission plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_permission::PermissionPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let agent_spec = AgentSpec::new("my-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("permission");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("permission", Arc::new(PermissionPlugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers a BeforeToolExecute phase hook that evaluates permission rules before every tool call.

  2. Define permission rules inline.
use awaken::ext_permission::{PermissionRulesConfig, PermissionRuleEntry, ToolPermissionBehavior};

let config = PermissionRulesConfig {
    default_behavior: ToolPermissionBehavior::Ask,
    rules: vec![
        PermissionRuleEntry {
            tool: "read_file".into(),
            behavior: ToolPermissionBehavior::Allow,
            scope: Default::default(),
        },
        PermissionRuleEntry {
            tool: "file_*".into(),
            behavior: ToolPermissionBehavior::Ask,
            scope: Default::default(),
        },
        PermissionRuleEntry {
            tool: "delete_*".into(),
            behavior: ToolPermissionBehavior::Deny,
            scope: Default::default(),
        },
    ],
};

let ruleset = config.into_ruleset().expect("invalid rules");

  3. Load rules from a YAML file (alternative).

    Create permissions.yaml:

default_behavior: ask
rules:
  - tool: "read_file"
    behavior: allow
  - tool: "Bash(npm *)"
    behavior: allow
  - tool: "file_*"
    behavior: ask
  - tool: "delete_*"
    behavior: deny
  - tool: "mcp__*"
    behavior: ask

Load it in code:

use awaken::ext_permission::PermissionRulesConfig;

let config = PermissionRulesConfig::from_file("permissions.yaml")
    .expect("failed to load permissions");
let ruleset = config.into_ruleset().expect("invalid rules");

  4. Configure via agent spec.

    Permission rules can also be embedded in the agent spec using the permission plugin config key:

use awaken::registry_spec::AgentSpec;

let agent_spec = AgentSpec::new("my-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("permission");
  5. Understand rule evaluation.

    Rules are evaluated with firewall-like priority:

  1. Deny – highest priority, blocks the tool call immediately

  2. Allow – permits the tool call without user interaction

  3. Ask – suspends the tool call and waits for human approval

The pattern syntax supports:

| Pattern | Matches |
| --- | --- |
| read_file | Exact tool name |
| file_* | Glob on tool name |
| mcp__github__* | Glob for MCP-prefixed tools |
| Bash(npm *) | Tool name with primary argument glob |
| Edit(file_path ~ "src/**") | Named field glob |
| Bash(command =~ "(?i)rm") | Named field regex |
| /mcp__(gh\|gl)__.*/ | Regex tool name |
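The core of name-based glob matching can be sketched in a few lines. This simplified matcher handles only exact names and a trailing "*" glob; the real awaken-tool-pattern crate also covers argument patterns, named-field globs, and regexes:

```rust
// Simplified matcher: exact names plus a trailing "*" glob only.
fn tool_matches(pattern: &str, tool_name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => tool_name.starts_with(prefix),
        None => pattern == tool_name,
    }
}

fn main() {
    assert!(tool_matches("read_file", "read_file"));
    assert!(tool_matches("file_*", "file_write"));
    assert!(tool_matches("mcp__github__*", "mcp__github__create_issue"));
    assert!(!tool_matches("delete_*", "read_file"));
}
```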

Verify

  1. Register a tool that matches a deny rule and attempt to invoke it.
  2. The tool call should be blocked and a ToolInterceptAction event emitted.
  3. Register a tool matching an ask rule. The run should suspend, waiting for human approval via the mailbox.
  4. Send approval through the mailbox endpoint and confirm the run resumes.

Common Errors

| Symptom | Cause | Fix |
| --- | --- | --- |
| All tools blocked | default_behavior: deny with no allow rules | Add explicit allow rules for safe tools |
| Rules not evaluated | Plugin not registered | Register PermissionPlugin and add "permission" to plugin_ids in agent spec |
| Invalid pattern error | Malformed glob or regex | Check syntax against the pattern table above |
| Ask rule never resolves | No mailbox consumer | Wire up a frontend or API client to respond to mailbox items |

Tests

  • crates/awaken-ext-permission/tests/

Key Files

| Path | Purpose |
| --- | --- |
| crates/awaken-ext-permission/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-permission/src/config.rs | PermissionRulesConfig and YAML/JSON loading |
| crates/awaken-ext-permission/src/rules.rs | Pattern syntax, ToolPermissionBehavior, rule evaluation |
| crates/awaken-ext-permission/src/plugin/plugin.rs | PermissionPlugin registration |
| crates/awaken-ext-permission/src/plugin/checker.rs | PermissionInterceptHook (BeforeToolExecute) |
| crates/awaken-tool-pattern/ | Shared glob/regex pattern matching library |

Configure Stop Policies

Use this when you need to control when an agent run terminates based on round count, token usage, elapsed time, or error frequency.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Plugin and AgentRuntimeBuilder

Overview

Stop policies evaluate after each inference step and decide whether the run should continue or terminate. The system provides four built-in policies and a trait for custom implementations.

Built-in policies:

| Policy | Triggers when |
| --- | --- |
| MaxRoundsPolicy | Step count exceeds max rounds |
| TokenBudgetPolicy | Total tokens (input + output) exceed max_total |
| TimeoutPolicy | Elapsed wall time exceeds max_ms milliseconds |
| ConsecutiveErrorsPolicy | Consecutive inference errors reach max |

Steps

  1. Create policies programmatically.
use std::sync::Arc;
use awaken::policies::{
    MaxRoundsPolicy, TokenBudgetPolicy, TimeoutPolicy,
    ConsecutiveErrorsPolicy, StopPolicy,
};

let policies: Vec<Arc<dyn StopPolicy>> = vec![
    Arc::new(MaxRoundsPolicy::new(25)),
    Arc::new(TokenBudgetPolicy::new(100_000)),
    Arc::new(TimeoutPolicy::new(300_000)), // 5 minutes in ms
    Arc::new(ConsecutiveErrorsPolicy::new(3)),
];

  2. Register a StopConditionPlugin with the runtime builder.
use awaken::policies::StopConditionPlugin;
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("stop-condition", Arc::new(StopConditionPlugin::new(policies)))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

For the common case of limiting rounds only, use the convenience wrapper:

use awaken::policies::MaxRoundsPlugin;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("stop-condition:max-rounds", Arc::new(MaxRoundsPlugin::new(10)))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

  3. Use declarative StopConditionSpec values.

The policies_from_specs function converts declarative specs into policy instances. This is useful when loading configuration from JSON or YAML.

use awaken_contract::contract::lifecycle::StopConditionSpec;
use awaken::policies::{policies_from_specs, StopConditionPlugin};

let specs = vec![
    StopConditionSpec::MaxRounds { rounds: 10 },
    StopConditionSpec::Timeout { seconds: 300 },
    StopConditionSpec::TokenBudget { max_total: 100_000 },
    StopConditionSpec::ConsecutiveErrors { max: 3 },
];

let policies = policies_from_specs(&specs);
let plugin = StopConditionPlugin::new(policies);

The full set of StopConditionSpec variants:

pub enum StopConditionSpec {
    MaxRounds { rounds: usize },
    Timeout { seconds: u64 },
    TokenBudget { max_total: usize },
    ConsecutiveErrors { max: usize },
    StopOnTool { tool_name: String },      // not yet implemented
    ContentMatch { pattern: String },       // not yet implemented
    LoopDetection { window: usize },        // not yet implemented
}

StopOnTool, ContentMatch, and LoopDetection are defined in the contract but not yet backed by policy implementations. policies_from_specs silently skips unimplemented variants.
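The spec-to-policy conversion, including the skip behavior and the seconds-to-milliseconds unit change, can be sketched with simplified stand-in types (the real conversion lives in policies_from_specs):

```rust
// Stand-in spec with one unimplemented variant, for illustration.
enum StopConditionSpec {
    MaxRounds { rounds: usize },
    Timeout { seconds: u64 },
    StopOnTool { tool_name: String }, // not yet implemented
}

// Stand-in for the resulting policy instances.
#[derive(Debug, PartialEq)]
enum PolicyKind {
    MaxRounds(usize),
    TimeoutMs(u64),
}

fn policies_from(specs: &[StopConditionSpec]) -> Vec<PolicyKind> {
    specs
        .iter()
        .filter_map(|s| match s {
            StopConditionSpec::MaxRounds { rounds } => Some(PolicyKind::MaxRounds(*rounds)),
            // Note the unit conversion: the spec takes seconds,
            // the policy takes milliseconds.
            StopConditionSpec::Timeout { seconds } => Some(PolicyKind::TimeoutMs(seconds * 1000)),
            // Unimplemented variants are silently skipped.
            StopConditionSpec::StopOnTool { .. } => None,
        })
        .collect()
}

fn main() {
    let specs = vec![
        StopConditionSpec::MaxRounds { rounds: 10 },
        StopConditionSpec::Timeout { seconds: 300 },
        StopConditionSpec::StopOnTool { tool_name: "finish".into() },
    ];
    let policies = policies_from(&specs);
    assert_eq!(policies, vec![PolicyKind::MaxRounds(10), PolicyKind::TimeoutMs(300_000)]);
}
```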

The StopPolicy Trait

Implement StopPolicy to create custom stop conditions:

use awaken::policies::{StopPolicy, StopDecision, StopPolicyStats};

pub struct MyCustomPolicy {
    pub threshold: u64,
}

impl StopPolicy for MyCustomPolicy {
    fn id(&self) -> &str {
        "my_custom"
    }

    fn evaluate(&self, stats: &StopPolicyStats) -> StopDecision {
        if stats.total_output_tokens > self.threshold {
            StopDecision::Stop {
                code: "my_custom".into(),
                detail: format!("output tokens {} exceeded {}", stats.total_output_tokens, self.threshold),
            }
        } else {
            StopDecision::Continue
        }
    }
}

The trait requires Send + Sync + 'static. Evaluation must be synchronous – it is pure computation on the provided stats.

StopPolicyStats

Every policy receives a StopPolicyStats snapshot with fields populated by the internal StopConditionHook:

| Field | Type | Description |
| --- | --- | --- |
| step_count | u32 | Number of inference steps completed so far |
| total_input_tokens | u64 | Cumulative prompt tokens across all steps |
| total_output_tokens | u64 | Cumulative completion tokens across all steps |
| elapsed_ms | u64 | Wall time since the first step, in milliseconds |
| consecutive_errors | u32 | Current streak of consecutive inference errors (resets on success) |
| last_tool_names | Vec<String> | Tool names called in the most recent inference response |
| last_response_text | String | Text content of the most recent inference response |

StopDecision

pub enum StopDecision {
    Continue,
    Stop { code: String, detail: String },
}

When any policy returns StopDecision::Stop, the hook converts it to TerminationReason::Stopped with the given code and detail, then updates the run lifecycle to Done. The agent loop exits after the current step. Policies are evaluated in order; the first Stop wins.
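The first-Stop-wins ordering can be sketched with closures standing in for policies (the Stats and Decision types below are simplified stand-ins for StopPolicyStats and StopDecision):

```rust
// Simplified stand-ins for the policy types in awaken::policies.
struct Stats { step_count: u32, total_tokens: u64 }

enum Decision { Continue, Stop { code: String } }

// Evaluate policies in registration order; the first Stop wins.
fn evaluate_all(policies: &[Box<dyn Fn(&Stats) -> Decision>], stats: &Stats) -> Decision {
    for p in policies {
        if let Decision::Stop { code } = p(stats) {
            return Decision::Stop { code };
        }
    }
    Decision::Continue
}

fn main() {
    let policies: Vec<Box<dyn Fn(&Stats) -> Decision>> = vec![
        Box::new(|s| if s.step_count >= 25 {
            Decision::Stop { code: "max_rounds".into() }
        } else {
            Decision::Continue
        }),
        Box::new(|s| if s.total_tokens >= 100_000 {
            Decision::Stop { code: "token_budget".into() }
        } else {
            Decision::Continue
        }),
    ];

    let stats = Stats { step_count: 30, total_tokens: 200_000 };
    // Both policies would stop here, but the first one registered wins.
    match evaluate_all(&policies, &stats) {
        Decision::Stop { code } => assert_eq!(code, "max_rounds"),
        Decision::Continue => unreachable!(),
    }
}
```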

How Stop Policies Interact with the Agent Loop

  1. The StopConditionPlugin registers a PhaseHook on Phase::AfterInference.
  2. After each LLM inference, the hook increments step_count, accumulates token usage, and tracks consecutive errors in StopConditionStatsState.
  3. The hook builds a StopPolicyStats snapshot and calls evaluate on each registered policy.
  4. If any policy returns Stop, the hook emits a RunLifecycleUpdate::Done state command with the stop code, which terminates the run.
  5. If all policies return Continue, the agent loop proceeds to the next step.

A policy with max or max_total set to 0 is treated as disabled and always returns Continue.

Common Errors

| Error | Cause | Fix |
| --- | --- | --- |
| Run never stops | No stop policy registered and LLM keeps calling tools | Register at least MaxRoundsPolicy or MaxRoundsPlugin |
| StateError::KeyAlreadyRegistered | Both StopConditionPlugin and MaxRoundsPlugin registered | Use only one; they share the same state key |
| Timeout fires too early | TimeoutPolicy takes milliseconds, StopConditionSpec::Timeout takes seconds | When using TimeoutPolicy::new() directly, pass milliseconds |

Key Files

  • crates/awaken-runtime/src/policies/mod.rs – module root and public exports
  • crates/awaken-runtime/src/policies/policy.rs – StopPolicy trait, built-in policies, policies_from_specs
  • crates/awaken-runtime/src/policies/plugin.rs – StopConditionPlugin and MaxRoundsPlugin
  • crates/awaken-runtime/src/policies/state.rs – StopConditionStatsState and its state key
  • crates/awaken-runtime/src/policies/hook.rs – internal StopConditionHook that drives evaluation
  • crates/awaken-contract/src/contract/lifecycle.rs – StopConditionSpec enum

Enable Observability

Use this when you need to trace LLM inference calls and tool executions with OpenTelemetry-compatible telemetry.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature observability enabled on the awaken crate (enabled by default)
  • For OTel export: feature otel enabled on awaken-ext-observability, plus a configured OTel collector
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["observability"] }
tokio = { version = "1", features = ["full"] }

Steps

  1. Register with the in-memory sink (development).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, InMemorySink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let sink = InMemorySink::new();
let obs_plugin = ObservabilityPlugin::new(sink.clone());
let agent_spec = AgentSpec::new("observed-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("observability");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

After a run completes, inspect collected metrics:

let metrics = sink.metrics();
println!("inferences: {}", metrics.inferences.len());
println!("tool calls: {}", metrics.tools.len());
println!("total input tokens: {}", metrics.total_input_tokens());
println!("total output tokens: {}", metrics.total_output_tokens());
println!("tool failures: {}", metrics.tool_failures());
println!("session duration: {}ms", metrics.session_duration_ms);

for stat in metrics.stats_by_model() {
    println!("{}: {} calls, {} in / {} out tokens",
        stat.model, stat.inference_count, stat.input_tokens, stat.output_tokens);
}

for stat in metrics.stats_by_tool() {
    println!("{}: {} calls, {} failures",
        stat.name, stat.call_count, stat.failure_count);
}

  2. Register with the OTel sink (production).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, OtelMetricsSink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use opentelemetry_sdk::trace::SdkTracerProvider;

let provider = SdkTracerProvider::builder()
    // configure your exporter (OTLP, Jaeger, etc.)
    .build();
let tracer = provider.tracer("awaken");

let obs_plugin = ObservabilityPlugin::new(OtelMetricsSink::new(tracer));
let agent_spec = AgentSpec::new("observed-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("observability");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

  3. Implement a custom sink (optional).
use awaken::ext_observability::{MetricsSink, GenAISpan, ToolSpan, AgentMetrics};

struct MySink;

impl MetricsSink for MySink {
    fn on_inference(&self, span: &GenAISpan) {
        // forward to your metrics system
    }

    fn on_tool(&self, span: &ToolSpan) {
        // forward to your metrics system
    }

    fn on_run_end(&self, metrics: &AgentMetrics) {
        // emit summary metrics
    }
}

  4. Captured telemetry.

    The plugin hooks into the following phases:

| Phase | Data Captured |
| --- | --- |
| RunStart | Session start timestamp |
| BeforeInference | Inference start timestamp, model, provider |
| AfterInference | Token usage, finish reasons, duration, cache tokens |
| BeforeToolExecute | Tool call start timestamp |
| AfterToolExecute | Tool duration, error status |
| RunEnd | Session duration |

OTel spans follow GenAI semantic conventions with attributes such as gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens.

Verify

  1. Run an agent with the InMemorySink.
  2. After run() completes, call sink.metrics().
  3. Confirm inferences is non-empty and token counts are populated.
  4. For OTel, check your collector or Jaeger UI for spans named with the awaken tracer.

Common Errors

| Symptom | Cause | Fix |
| --- | --- | --- |
| Metrics are all zero | Plugin not registered | Register ObservabilityPlugin with the runtime builder |
| OtelMetricsSink not found | Missing otel feature | Enable the otel feature on awaken-ext-observability |
| No spans in collector | Exporter not configured | Verify SdkTracerProvider has an exporter and is not dropped |
| Token counts missing | LLM provider does not report usage | Check that your LlmExecutor returns TokenUsage in LLMResponse |

Tests

  • crates/awaken-ext-observability/tests/

Key Files

| Path | Purpose |
| --- | --- |
| crates/awaken-ext-observability/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-observability/src/plugin/plugin.rs | ObservabilityPlugin registration |
| crates/awaken-ext-observability/src/plugin/hooks.rs | Phase hooks for each telemetry point |
| crates/awaken-ext-observability/src/metrics.rs | AgentMetrics, GenAISpan, ToolSpan types |
| crates/awaken-ext-observability/src/sink.rs | MetricsSink trait and InMemorySink |
| crates/awaken-ext-observability/src/otel.rs | OtelMetricsSink with GenAI semantic conventions |

Report Tool Progress

Use this when you need to stream progress updates or activity snapshots from a tool back to the frontend during execution.

Prerequisites

  • A Tool implementation with access to ToolCallContext
  • awaken crate added to Cargo.toml

Steps

  1. Report structured progress with report_progress.

    Call report_progress inside your Tool::execute method to emit a ToolCallProgressState snapshot. The runtime wraps it in an AgentEvent::ActivitySnapshot with activity_type = "tool-call-progress".

use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::contract::progress::ProgressStatus;
use async_trait::async_trait;
use serde_json::{Value, json};

struct IndexTool;

#[async_trait]
impl Tool for IndexTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("index_docs", "Index Docs", "Index a document set")
    }

    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        ctx.report_progress(ProgressStatus::Running, Some("Starting indexing"), None).await;

        for i in 0..10 {
            // ... do work ...
            let fraction = (i + 1) as f64 / 10.0;
            ctx.report_progress(
                ProgressStatus::Running,
                Some(&format!("Indexed batch {}/10", i + 1)),
                Some(fraction),
            ).await;
        }

        ctx.report_progress(ProgressStatus::Done, Some("Indexing complete"), Some(1.0)).await;
        Ok(ToolResult::success("index_docs", json!({"indexed": 10})).into())
    }
}

The progress parameter is a normalized f64 between 0.0 and 1.0. Pass None when progress is indeterminate.

  2. Report custom activity snapshots with report_activity.

    Use report_activity for full-state activity updates that are not structured progress. The runtime emits an AgentEvent::ActivitySnapshot with replace: Some(true).

ctx.report_activity("code-generation", "fn hello() {\n    println!(\"hi\");\n}").await;

The activity_type string identifies the kind of activity. Frontends use it to choose how to render the content.

  3. Report incremental activity deltas with report_activity_delta.

    Use report_activity_delta to send a JSON patch instead of replacing the full content. The runtime emits an AgentEvent::ActivityDelta. If you pass a single JSON value, it is wrapped in an array automatically.

use serde_json::json;

ctx.report_activity_delta(
    "code-generation",
    json!([
        { "op": "add", "path": "/line", "value": "    println!(\"world\");" }
    ]),
).await;

  4. Use ProgressStatus variants to reflect the tool call lifecycle.

| Variant | Meaning |
| --- | --- |
| Pending | Tool call is queued but has not started |
| Running | Tool call is actively executing |
| Done | Tool call completed successfully |
| Failed | Tool call encountered an error |
| Cancelled | Tool call was cancelled before completion |

ProgressStatus serializes to snake_case strings ("pending", "running", "done", "failed", "cancelled").
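A stand-in enum illustrating that mapping (the real type presumably derives its wire strings via serde; this hand-written version is for illustration only):

```rust
// Stand-in mirroring the documented variants and their snake_case strings.
#[derive(Clone, Copy)]
enum ProgressStatus { Pending, Running, Done, Failed, Cancelled }

fn as_wire_str(status: ProgressStatus) -> &'static str {
    match status {
        ProgressStatus::Pending => "pending",
        ProgressStatus::Running => "running",
        ProgressStatus::Done => "done",
        ProgressStatus::Failed => "failed",
        ProgressStatus::Cancelled => "cancelled",
    }
}

fn main() {
    assert_eq!(as_wire_str(ProgressStatus::Running), "running");
    assert_eq!(as_wire_str(ProgressStatus::Cancelled), "cancelled");
}
```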

How it works

report_progress builds a ToolCallProgressState and emits it as an AgentEvent::ActivitySnapshot:

pub struct ToolCallProgressState {
    pub schema: String,           // "tool-call-progress.v1"
    pub node_id: String,          // set to call_id
    pub call_id: String,
    pub tool_name: String,
    pub status: ProgressStatus,
    pub progress: Option<f64>,    // 0.0 - 1.0, None if indeterminate
    pub loaded: Option<u64>,      // absolute loaded count
    pub total: Option<u64>,       // absolute total count
    pub message: Option<String>,
    pub parent_node_id: Option<String>,
    pub parent_call_id: Option<String>,
}

The activity type is the constant TOOL_CALL_PROGRESS_ACTIVITY_TYPE, which equals "tool-call-progress". Optional fields (progress, loaded, total, message, parent_node_id, parent_call_id) are omitted from JSON when None.

All three reporting methods are no-ops when the context has no activity_sink configured.

Verify

Subscribe to the SSE event stream and confirm that ActivitySnapshot events with activity_type = "tool-call-progress" appear while the tool executes, and that the status field transitions from "running" to "done".

Common Errors

| Error | Cause | Fix |
| --- | --- | --- |
| No events appear | activity_sink is None on the context | Ensure the runtime is configured with an event sink |
| Progress not updating in frontend | Frontend does not handle ActivitySnapshot with this activity type | Filter on activity_type == "tool-call-progress" |

Key Files

  • crates/awaken-contract/src/contract/progress.rs – ToolCallProgressState, ProgressStatus, TOOL_CALL_PROGRESS_ACTIVITY_TYPE
  • crates/awaken-contract/src/contract/tool.rs – ToolCallContext::report_progress, report_activity, report_activity_delta

Testing Strategy

Use this when you need to test tools, plugins, state keys, or full agent runs without depending on a live LLM.

Prerequisites

  • awaken crate added to Cargo.toml (with the runtime re-exports)
  • tokio with rt and macros features for async tests
  • serde_json for constructing tool arguments and assertions

1. Unit testing a Tool

Create a ToolCallContext with a test snapshot, call tool.execute(), and assert on the returned ToolOutput.

use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{
    Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};

struct GreetTool;

#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "greet", "Greet a user by name")
    }

    async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        let name = args["name"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("missing 'name'".into()))?;
        Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn greet_tool_returns_greeting() {
        let tool = GreetTool;
        let ctx = ToolCallContext::test_default();

        let output = tool.execute(json!({"name": "Alice"}), &ctx).await.unwrap();

        assert!(output.result.is_success());
        assert_eq!(output.result.data["greeting"], "Hello, Alice!");
    }

    #[tokio::test]
    async fn greet_tool_rejects_missing_name() {
        let tool = GreetTool;
        let ctx = ToolCallContext::test_default();

        let err = tool.execute(json!({}), &ctx).await.unwrap_err();

        assert!(matches!(err, ToolError::InvalidArguments(_)));
    }
}

When your tool returns side-effects via ToolOutput::with_command(), assert on the command field:

#[tokio::test]
async fn tool_produces_state_command() {
    let tool = CounterMutationTool { increment: 5 };
    let ctx = ToolCallContext::test_default();

    let output = tool.execute(json!({}), &ctx).await.unwrap();

    assert!(output.result.is_success());
    // The command is opaque at this level; integration tests (section 4)
    // verify that commands are applied correctly to the StateStore.
    assert!(!output.command.is_empty());
}

2. Unit testing a Plugin

Verify that a plugin registers the expected state keys and hooks by creating a PluginRegistrar and calling plugin.register().

use awaken::contract::StateError;
use awaken::state::{StateKey, MergeStrategy, StateKeyOptions, StateCommand};
use awaken::plugins::{Plugin, PluginDescriptor, PluginRegistrar};
use serde::{Serialize, Deserialize};

// -- State key --

struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "test.counter";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = usize;
    type Update = usize;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

// -- Plugin --

struct CounterPlugin;

impl Plugin for CounterPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "counter" }
    }

    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_key::<Counter>(StateKeyOptions::default())?;
        Ok(())
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use awaken::state::StateStore;

    #[test]
    fn counter_plugin_registers_key() {
        let store = StateStore::new();
        // install_plugin calls register() internally
        store.install_plugin(CounterPlugin).unwrap();

        // The key should now be readable (returns the default value)
        let val = store.read::<Counter>().unwrap_or_default();
        assert_eq!(val, 0);
    }

    #[test]
    fn counter_plugin_rejects_double_registration() {
        let store = StateStore::new();
        store.install_plugin(CounterPlugin).unwrap();

        // Registering the same key again should fail
        let result = store.install_plugin(CounterPlugin);
        assert!(result.is_err());
    }
}

To test a phase hook directly, build a minimal PhaseContext and inspect the returned StateCommand:

#[tokio::test]
async fn audit_hook_appends_entry() {
    let hook = AuditHook;
    let ctx = PhaseContext::test_default();

    let cmd = hook.run(&ctx).await.unwrap();

    // Commit the command to a test store and verify
    let store = StateStore::new();
    store.install_plugin(AuditPlugin).unwrap();
    store.commit(cmd).unwrap();

    let log = store.read::<AuditLogKey>().unwrap();
    assert!(!log.entries.is_empty());
}

3. Unit testing a StateKey

Test apply() mutations directly without any runtime overhead:

use awaken::state::{StateKey, MergeStrategy};

struct HitCounter;

impl StateKey for HitCounter {
    const KEY: &'static str = "test.hit_counter";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = u64;
    type Update = u64;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn apply_increments_counter() {
        let mut value = u64::default();
        HitCounter::apply(&mut value, 5);
        assert_eq!(value, 5);
        HitCounter::apply(&mut value, 3);
        assert_eq!(value, 8);
    }

    #[test]
    fn apply_zero_is_identity() {
        let mut value = u64::default();
        HitCounter::apply(&mut value, 0);
        assert_eq!(value, 0);
    }
}

For keys with complex value types, test edge cases like empty collections or merge conflicts:

#[test]
fn apply_merge_replaces_entries() {
    let mut log = AuditLog { entries: vec!["old".into()] };
    AuditLogKey::apply(&mut log, AuditLog { entries: vec!["new".into()] });
    // Exclusive merge replaces the entire value
    assert_eq!(log.entries, vec!["new"]);
}

4. Integration testing with a mock LLM

Build a full agent runtime with a scripted LlmExecutor that returns canned responses. This is the primary pattern used in the awaken-runtime integration tests.

use std::sync::{Arc, Mutex};
use async_trait::async_trait;
use serde_json::json;

use awaken::contract::content::ContentBlock;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::executor::{InferenceExecutionError, InferenceRequest, LlmExecutor};
use awaken::contract::identity::{RunIdentity, RunOrigin};
use awaken::contract::inference::{StopReason, StreamResult};
use awaken::contract::message::{Message, ToolCall};
use awaken::contract::tool::{
    Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};
use awaken::loop_runner::{AgentLoopParams, LoopStatePlugin, build_agent_env, run_agent_loop};
use awaken::registry::{AgentResolver, ResolvedAgent};
use awaken::state::StateStore;
use awaken::phase::PhaseRuntime;
use awaken::RuntimeError;

// -- Scripted LLM executor --

struct ScriptedLlm {
    responses: Mutex<Vec<StreamResult>>,
}

impl ScriptedLlm {
    fn new(responses: Vec<StreamResult>) -> Self {
        Self {
            responses: Mutex::new(responses),
        }
    }
}

#[async_trait]
impl LlmExecutor for ScriptedLlm {
    async fn execute(
        &self,
        _request: InferenceRequest,
    ) -> Result<StreamResult, InferenceExecutionError> {
        let mut responses = self.responses.lock().unwrap();
        if responses.is_empty() {
            // Fallback: end the conversation
            Ok(StreamResult {
                content: vec![ContentBlock::text("Done.")],
                tool_calls: vec![],
                usage: None,
                stop_reason: Some(StopReason::EndTurn),
                has_incomplete_tool_calls: false,
            })
        } else {
            Ok(responses.remove(0))
        }
    }

    fn name(&self) -> &str {
        "scripted"
    }
}

// -- Resolver --

struct FixedResolver {
    agent: ResolvedAgent,
}

impl AgentResolver for FixedResolver {
    fn resolve(&self, _agent_id: &str) -> Result<ResolvedAgent, RuntimeError> {
        let mut agent = self.agent.clone();
        agent.env = build_agent_env(&[], &agent)?;
        Ok(agent)
    }
}

// -- Helpers --

fn tool_step(calls: Vec<ToolCall>) -> StreamResult {
    StreamResult {
        content: vec![],
        tool_calls: calls,
        usage: None,
        stop_reason: Some(StopReason::ToolUse),
        has_incomplete_tool_calls: false,
    }
}

fn text_step(text: &str) -> StreamResult {
    StreamResult {
        content: vec![ContentBlock::text(text)],
        tool_calls: vec![],
        usage: None,
        stop_reason: Some(StopReason::EndTurn),
        has_incomplete_tool_calls: false,
    }
}

fn test_identity() -> RunIdentity {
    RunIdentity::new(
        "thread-test".into(),
        None,
        "run-test".into(),
        None,
        "agent".into(),
        RunOrigin::User,
    )
}

// -- Test --

#[tokio::test]
async fn tool_call_flow_end_to_end() {
    // Script: LLM calls get_weather, then responds with text
    let llm = Arc::new(ScriptedLlm::new(vec![
        tool_step(vec![ToolCall::new("c1", "get_weather", json!({"city": "Tokyo"}))]),
        text_step("The weather in Tokyo is sunny."),
    ]));

    let agent = ResolvedAgent::new("test", "model", "You are helpful.", llm)
        .with_tool(Arc::new(GetWeatherTool));

    let store = StateStore::new();
    let runtime = PhaseRuntime::new(store.clone()).unwrap();
    store.install_plugin(LoopStatePlugin).unwrap();

    let resolver = FixedResolver { agent };
    let sink = Arc::new(VecEventSink::new());

    let result = run_agent_loop(AgentLoopParams {
        resolver: &resolver,
        agent_id: "test",
        runtime: &runtime,
        sink: sink.clone(),
        checkpoint_store: None,
        messages: vec![Message::user("What's the weather?")],
        run_identity: test_identity(),
        cancellation_token: None,
        decision_rx: None,
        overrides: None,
        frontend_tools: Vec::new(),
    })
    .await
    .unwrap();

    assert_eq!(result.response, "The weather in Tokyo is sunny.");
    assert_eq!(result.steps, 2); // tool step + text step
}

For simpler cases, use the built-in MockLlmExecutor, which returns text-only responses:

use awaken::engine::MockLlmExecutor;

let llm = Arc::new(MockLlmExecutor::new().with_responses(vec!["Hello!".into()]));
let agent = ResolvedAgent::new("test", "model", "system prompt", llm);

5. Testing event streams

Use VecEventSink to capture all events emitted during a run and assert on their sequence and content.

use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::lifecycle::TerminationReason;

#[tokio::test]
async fn events_follow_expected_lifecycle() {
    // ... set up runtime and run agent (see section 4) ...

    let events = sink.take();

    // Verify ordering: RunStart -> StepStart -> ... -> StepEnd -> RunFinish
    assert!(matches!(events.first(), Some(AgentEvent::RunStart { .. })));
    assert!(matches!(events.last(), Some(AgentEvent::RunFinish { .. })));

    // Count specific event types
    let step_starts = events.iter().filter(|e| matches!(e, AgentEvent::StepStart { .. })).count();
    let step_ends = events.iter().filter(|e| matches!(e, AgentEvent::StepEnd)).count();
    assert_eq!(step_starts, step_ends, "every StepStart needs a StepEnd");

    // Verify termination reason
    if let Some(AgentEvent::RunFinish { termination, .. }) = events.last() {
        assert_eq!(*termination, TerminationReason::NaturalEnd);
    }

    // Check that tool call events appear in the correct order
    let has_tool_start = events.iter().any(|e| matches!(e, AgentEvent::ToolCallStart { .. }));
    let has_tool_done = events.iter().any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
    if has_tool_start {
        assert!(has_tool_done, "ToolCallStart without ToolCallDone");
        let start_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallStart { .. })).unwrap();
        let done_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallDone { .. })).unwrap();
        assert!(start_idx < done_idx);
    }
}

A reusable helper for event type extraction (used in the runtime test suite):

fn event_type(e: &AgentEvent) -> &'static str {
    match e {
        AgentEvent::RunStart { .. } => "run_start",
        AgentEvent::RunFinish { .. } => "run_finish",
        AgentEvent::StepStart { .. } => "step_start",
        AgentEvent::StepEnd => "step_end",
        AgentEvent::TextDelta { .. } => "text_delta",
        AgentEvent::ToolCallStart { .. } => "tool_call_start",
        AgentEvent::ToolCallDone { .. } => "tool_call_done",
        AgentEvent::InferenceComplete { .. } => "inference_complete",
        AgentEvent::StateSnapshot { .. } => "state_snapshot",
        _ => "other",
    }
}

let types: Vec<&str> = events.iter().map(event_type).collect();
assert_eq!(types[0], "run_start");
assert_eq!(*types.last().unwrap(), "run_finish");

6. Testing with a real LLM (live tests)

For end-to-end validation against a real provider, use the GenaiExecutor with environment variables for credentials. Mark these tests with #[ignore] so they only run when explicitly requested.

use awaken::engine::GenaiExecutor;

#[tokio::test]
#[ignore] // Run with: cargo test -- --ignored
async fn live_llm_responds() {
    // Requires: OPENAI_API_KEY or (LLM_BASE_URL + LLM_API_KEY)
    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o-mini".into());
    let llm = Arc::new(GenaiExecutor::new());

    let agent = ResolvedAgent::new(
        "live-test",
        &model,
        "You are a test assistant. Answer in one word.",
        llm,
    );

    // ... set up resolver, store, runtime, sink as in section 4 ...

    let result = run_agent_loop(AgentLoopParams {
        resolver: &resolver,
        agent_id: "live-test",
        runtime: &runtime,
        sink: sink.clone(),
        checkpoint_store: None,
        messages: vec![Message::user("What is 2+2? Answer in one word.")],
        run_identity: test_identity(),
        cancellation_token: None,
        decision_rx: None,
        overrides: None,
        frontend_tools: Vec::new(),
    })
    .await
    .unwrap();

    assert!(!result.response.is_empty());
}

Run live tests with:

# OpenAI-compatible provider
OPENAI_API_KEY=<your-key> LLM_MODEL=gpt-4o-mini cargo test -- --ignored

# Custom endpoint (e.g. BigModel)
LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/ \
  LLM_API_KEY=<key> \
  LLM_MODEL=GLM-4.7-Flash \
  cargo test -- --ignored

See examples/live_test.rs and examples/tool_call_live.rs for complete working examples with console output.

Key Files

  • crates/awaken-contract/src/contract/tool.rs – Tool trait, ToolCallContext::test_default(), ToolResult, ToolOutput
  • crates/awaken-contract/src/contract/event_sink.rs – VecEventSink
  • crates/awaken-runtime/src/engine/mock.rs – MockLlmExecutor
  • crates/awaken-runtime/src/state/mod.rs – StateStore, StateCommand
  • crates/awaken-runtime/src/loop_runner/mod.rs – run_agent_loop, AgentLoopParams, AgentRunResult
  • crates/awaken-runtime/tests/ – integration test suite (event lifecycle, tool side effects)

Use Shared State

Use this when agents need to share persistent state across thread boundaries, agent types, or delegation trees. Shared state lives in the ProfileStore and is addressed by a typed namespace (ProfileKey) and a key (&str), giving you fine-grained control over who sees what.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • A ProfileStore backend configured on the runtime (e.g. file store or Postgres)

Concepts

Shared state has two dimensions:

| Dimension | Type | Purpose |
|---|---|---|
| Namespace | ProfileKey | Defines what is stored — a compile-time binding between a static string key (KEY) and a typed Value. Each key is registered once per plugin via register_profile_key. |
| Key | &str (or StateScope helper) | Defines which instance — a runtime string that partitions storage. Different keys isolate or share data between agents and threads. |

Together, (ProfileKey::KEY, key: &str) uniquely identifies a shared state entry in the profile store.
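The two-dimensional addressing can be pictured as a map keyed by the (namespace, key) pair. This is a standalone model of the storage shape, not the ProfileStore implementation; all names here are illustrative:

```rust
use std::collections::HashMap;

// Illustrative model: a profile store addressed by (ProfileKey::KEY, key).
struct ProfileModel {
    entries: HashMap<(String, String), String>, // value shown as a JSON-ish string
}

impl ProfileModel {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }
    fn write(&mut self, namespace: &str, key: &str, value: &str) {
        self.entries
            .insert((namespace.to_string(), key.to_string()), value.to_string());
    }
    fn read(&self, namespace: &str, key: &str) -> Option<&str> {
        self.entries
            .get(&(namespace.to_string(), key.to_string()))
            .map(|s| s.as_str())
    }
}

fn main() {
    let mut store = ProfileModel::new();
    // Same namespace, different keys: isolated entries.
    store.write("team_context", "parent:thread-1", r#"{"goal":"ship"}"#);
    store.write("team_context", "parent:thread-2", r#"{"goal":"test"}"#);
    assert_eq!(store.read("team_context", "parent:thread-1"), Some(r#"{"goal":"ship"}"#));
    // A mismatched key string reads nothing, which is why mismatched keys
    // silently fall back to Value::default() in the real store.
    assert_eq!(store.read("team_context", "parent:thread-3"), None);
}
```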

Steps

1. Define a shared state key

Create a struct that implements ProfileKey. The KEY constant is the namespace; the Value type is what gets serialized.

use serde::{Deserialize, Serialize};
use awaken_contract::ProfileKey;

#[derive(Clone, Default, Serialize, Deserialize)]
pub struct TeamContext {
    pub goal: String,
    pub constraints: Vec<String>,
}

pub struct TeamContextKey;

impl ProfileKey for TeamContextKey {
    const KEY: &'static str = "team_context";
    type Value = TeamContext;
}

2. Register in a plugin

Inside your plugin’s register method, call register_profile_key on the registrar.

use awaken_contract::StateError;
use awaken_runtime::plugins::registry::PluginRegistrar;

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_profile_key::<TeamContextKey>()?;
    Ok(())
}

3. Read and write in a hook

In any phase hook, obtain ProfileAccess from the context and use read / write with a key string. StateScope is a convenience builder for common key patterns — call .as_str() to get the key.

use awaken_contract::StateScope;

async fn execute(&self, ctx: &mut PhaseContext) -> Result<(), anyhow::Error> {
    let profile = ctx.profile().expect("ProfileStore not configured");
    let identity = ctx.snapshot().run_identity();

    // Build a scope key from the current agent's parent thread
    let scope = match &identity.parent_thread_id {
        Some(pid) => StateScope::parent_thread(pid),
        None => StateScope::global(),
    };

    // Read (returns TeamContext::default() if missing)
    let mut team: TeamContext = profile.read::<TeamContextKey>(scope.as_str()).await?;

    // Mutate and write back
    team.goal = "Ship the feature".into();
    profile.write::<TeamContextKey>(scope.as_str(), &team).await?;

    Ok(())
}

4. Choose the right scope

StateScope has several constructors. Pick the one that matches your sharing pattern:

| Scenario | Scope | Example |
|---|---|---|
| All agents across all threads | StateScope::global() | Org-wide configuration |
| All agents spawned from the same parent thread | StateScope::parent_thread(id) | A delegation tree sharing context |
| All instances of the same agent type | StateScope::agent_type(name) | Planner agents sharing learned heuristics |
| Single thread only | StateScope::thread(id) | Thread-local scratchpad |
| Custom partition | StateScope::new("custom-key") | Any application-defined grouping |

You can also pass any raw &str directly — StateScope is just an optional convenience.

When to use shared state

| Mechanism | Lifetime | Scope | Best for |
|---|---|---|---|
| StateKey | Single run (in-memory snapshot) | One agent thread | Transient per-run state (counters, flags, accumulated context) |
| ProfileKey with agent/system key | Persistent (profile store) | Per-agent or system | Per-agent or per-user settings that don’t cross boundaries |
| ProfileKey with StateScope key | Persistent (profile store) | Any StateScope string | Cross-agent, cross-thread persistent state |

Use ProfileKey with a StateScope key when state must survive across runs and be visible to agents in different threads or of different types.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| profile key not registered: <ns> | Key not registered in any plugin | Call r.register_profile_key::<YourKey>() in the plugin’s register method |
| Always reads Value::default() | Writing and reading use different key strings | Verify both sides construct the same StateScope or use the same &str key |
| Data leaks between scopes | Using StateScope::global() when a narrower scope is needed | Switch to parent_thread, agent_type, or thread scope |

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-contract/src/contract/shared_state.rs | StateScope type |
| crates/awaken-contract/src/contract/profile_store.rs | ProfileKey trait, ProfileOwner enum |
| crates/awaken-runtime/src/profile/mod.rs | ProfileAccess with read, write, delete methods |
| crates/awaken-runtime/src/plugins/registry.rs | PluginRegistrar::register_profile_key registration |

Overview

The awaken crate is the public facade for the Awaken agent framework. It re-exports types from the internal awaken-contract and awaken-runtime crates so that downstream code only needs a single dependency.

Module re-exports

| Facade path | Source crate | Contents |
|---|---|---|
| awaken::contract | awaken-contract | Tool trait, events, messages, suspension, lifecycle |
| awaken::model | awaken-contract | Phase, EffectSpec, ScheduledActionSpec, JsonValue |
| awaken::registry_spec | awaken-contract | AgentSpec, ModelSpec, ProviderSpec, McpServerSpec, PluginConfigKey |
| awaken::state | awaken-contract + awaken-runtime | StateKey, StateMap, Snapshot, StateStore, MutationBatch |
| awaken::agent | awaken-runtime | Agent configuration and state |
| awaken::builder | awaken-runtime | AgentRuntimeBuilder, BuildError |
| awaken::context | awaken-runtime | PhaseContext |
| awaken::engine | awaken-runtime | LLM engine abstraction |
| awaken::execution | awaken-runtime | ExecutionEnv |
| awaken::extensions | awaken-runtime | Built-in extension infrastructure |
| awaken::loop_runner | awaken-runtime | Agent loop runner |
| awaken::phase | awaken-runtime | PhaseRuntime, PhaseHook |
| awaken::plugins | awaken-runtime | Plugin, PluginDescriptor, PluginRegistrar |
| awaken::policies | awaken-runtime | Context window and retry policies |
| awaken::registry | awaken-runtime | AgentResolver, ResolvedAgent |
| awaken::runtime | awaken-runtime | AgentRuntime |
| awaken::stores | awaken-stores | File and Postgres store implementations |

Feature-gated modules

| Facade path | Feature flag | Source crate |
|---|---|---|
| awaken::ext_permission | permission | awaken-ext-permission |
| awaken::ext_observability | observability | awaken-ext-observability |
| awaken::ext_mcp | mcp | awaken-ext-mcp |
| awaken::ext_skills | skills | awaken-ext-skills |
| awaken::ext_generative_ui | generative-ui | awaken-ext-generative-ui |
| awaken::ext_reminder | reminder | awaken-ext-reminder |
| awaken::server | server | awaken-server |

Root-level re-exports

The following types are re-exported at the crate root for convenience:

From awaken-contract: AgentSpec, EffectSpec, FailedScheduledActions, JsonValue, KeyScope, MergeStrategy, PendingScheduledActions, PersistedState, Phase, PluginConfigKey, ScheduledActionSpec, Snapshot, StateError, StateKey, StateKeyOptions, StateMap, TypedEffect, UnknownKeyPolicy

From awaken-runtime: AgentResolver, AgentRuntime, AgentRuntimeBuilder, BuildError, CancellationToken, CommitEvent, CommitHook, DEFAULT_MAX_PHASE_ROUNDS, ExecutionEnv, MutationBatch, PhaseContext, PhaseHook, PhaseRuntime, Plugin, PluginDescriptor, PluginRegistrar, ResolvedAgent, RunRequest, RuntimeError, StateCommand, StateStore, TypedEffectHandler, TypedScheduledActionHandler

Feature flags

| Flag | Default | Description |
|---|---|---|
| permission | yes | Tool-level permission gating (HITL) |
| observability | yes | Tracing and metrics integration |
| mcp | yes | MCP (Model Context Protocol) tool bridge |
| skills | yes | Skills subsystem for reusable agent capabilities |
| reminder | yes | Reminder extension for injecting context messages |
| server | yes | HTTP server with SSE streaming and protocol adapters |
| generative-ui | yes | Generative UI component streaming |
| full | yes | Enables all of the above |

Tool Trait

The Tool trait is the primary extension point for giving agents capabilities. Tools are stateless functions that receive JSON arguments and a read-only context, and return a ToolOutput.

Trait definition

#[async_trait]
pub trait Tool: Send + Sync {
    /// Return the descriptor for this tool.
    fn descriptor(&self) -> ToolDescriptor;

    /// Validate arguments before execution. Default: accept all.
    fn validate_args(&self, _args: &Value) -> Result<(), ToolError> {
        Ok(())
    }

    /// Execute the tool with the given arguments and context.
    async fn execute(
        &self,
        args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError>;
}

Crate path: awaken::contract::tool::Tool

ToolDescriptor

Describes a tool’s identity and parameter schema. Registered with the runtime and sent to the LLM as available functions.

pub struct ToolDescriptor {
    pub id: String,
    pub name: String,
    pub description: String,
    /// JSON Schema for parameters.
    pub parameters: Value,
    pub category: Option<String>,
}

Builder methods:

ToolDescriptor::new(id, name, description) -> Self
    .with_parameters(schema: Value) -> Self
    .with_category(category: impl Into<String>) -> Self

ToolResult

Carried inside the ToolOutput returned by Tool::execute. Reports the execution outcome back to the agent loop.

pub struct ToolResult {
    pub tool_name: String,
    pub status: ToolStatus,
    pub data: Value,
    pub message: Option<String>,
    pub suspension: Option<Box<SuspendTicket>>,
}

ToolStatus

pub enum ToolStatus {
    Success,  // Execution succeeded
    Pending,  // Suspended, waiting for external resume
    Error,    // Execution failed; content sent back to LLM
}

Constructors

| Method | Status | Use case |
|---|---|---|
| ToolResult::success(name, data) | Success | Normal completion |
| ToolResult::success_with_message(name, data, msg) | Success | Completion with description |
| ToolResult::error(name, message) | Error | Recoverable failure |
| ToolResult::error_with_code(name, code, message) | Error | Structured error with code |
| ToolResult::suspended(name, message) | Pending | HITL suspension |
| ToolResult::suspended_with(name, message, ticket) | Pending | Suspension with ticket |

Predicates

  • is_success() -> bool
  • is_pending() -> bool
  • is_error() -> bool
  • to_json() -> Value

ToolError

Errors returned from validate_args or execute. Unlike a ToolResult with ToolStatus::Error (whose content is sent back to the LLM), a ToolError aborts the tool call.

pub enum ToolError {
    InvalidArguments(String),
    ExecutionFailed(String),
    Denied(String),
    NotFound(String),
    Internal(String),
}

ToolCallContext

Read-only context provided to a tool during execution.

pub struct ToolCallContext {
    pub call_id: String,
    pub tool_name: String,
    pub run_identity: RunIdentity,
    pub agent_spec: Arc<AgentSpec>,
    pub snapshot: Snapshot,
    pub activity_sink: Option<Arc<dyn EventSink>>,
}

Methods

/// Read a typed state key from the snapshot.
fn state<K: StateKey>(&self) -> Option<&K::Value>

/// Report an activity snapshot for this tool call.
async fn report_activity(&self, activity_type: &str, content: &str)

/// Report an incremental activity delta.
async fn report_activity_delta(&self, activity_type: &str, patch: Value)

/// Report structured tool call progress.
async fn report_progress(
    &self,
    status: ProgressStatus,
    message: Option<&str>,
    progress: Option<f64>,
)

Examples

Minimal tool

use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use serde_json::{Value, json};

struct Greet;

#[async_trait]
impl Tool for Greet {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "greet", "Greet a user by name")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "name": { "type": "string" }
                },
                "required": ["name"]
            }))
    }

    async fn execute(
        &self,
        args: Value,
        _ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let name = args["name"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("name required".into()))?;
        Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
    }
}

Reading state from context

use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::state::StateKey;
use serde::{Serialize, Deserialize};
use serde_json::{Value, json};

// Assume a state key is defined elsewhere:
// struct UserPreferences;
// impl StateKey for UserPreferences { ... }

struct GetPreferences;

#[async_trait]
impl Tool for GetPreferences {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("get_prefs", "get_preferences", "Get user preferences")
    }

    async fn execute(
        &self,
        _args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        // Read typed state through the snapshot
        // let prefs = ctx.state::<UserPreferences>().cloned().unwrap_or_default();
        // Ok(ToolResult::success("get_prefs", serde_json::to_value(&prefs).unwrap()).into())
        Ok(ToolResult::success("get_prefs", json!({})).into())
    }
}

Tool execution hooks

Every tool call passes through plugin hooks before and after execution. This allows plugins to intercept, observe, or modify tool call behavior without changing the tool itself.

Full lifecycle

LLM selects tool
  -> validate_args()
  -> BeforeToolExecute phase (plugins run hooks)
     Plugins may schedule ToolInterceptAction:
       Block    -> run terminates with reason
       Suspend  -> run pauses (HITL), tool not executed
       SetResult -> tool skipped, pre-built result used
  -> execute()                (only if not intercepted)
  -> AfterToolExecute phase   (plugins run hooks)
     Plugins observe the ToolResult and may modify state

BeforeToolExecute

Runs once per tool call, after argument validation. Plugins receive a PhaseContext containing the tool name, call ID, and validated arguments. Hooks return a StateCommand that may schedule ToolInterceptAction to block, suspend, or short-circuit the call.

When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.
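The priority rule can be modeled as picking the highest-ranked intercept from however many were scheduled. This is a standalone sketch; the real ToolInterceptPayload variants carry reasons, tickets, and results:

```rust
// Hypothetical model of intercept priority: Block > Suspend > SetResult.
// Variants are declared in ascending priority order so Ord matches the rule.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Intercept {
    SetResult, // lowest priority
    Suspend,
    Block,     // highest priority
}

// With the variants ordered by priority, the winning intercept is simply the max.
fn winning_intercept(scheduled: &[Intercept]) -> Option<Intercept> {
    scheduled.iter().copied().max()
}

fn main() {
    let scheduled = [Intercept::SetResult, Intercept::Block, Intercept::Suspend];
    assert_eq!(winning_intercept(&scheduled), Some(Intercept::Block));
    // No intercepts scheduled: the tool call proceeds normally.
    assert_eq!(winning_intercept(&[]), None);
}
```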

AfterToolExecute

Runs after execute() completes (or after an intercept produces a result). Plugins receive the PhaseContext with the ToolResult attached. Hooks can update state, emit events, or schedule actions for subsequent phases.

ToolCallStatus transitions

Each tool call tracks a ToolCallStatus through its lifecycle:

New -> Running -> Succeeded
                  Failed
                  Suspended -> Resuming -> Running -> ...
                  Cancelled

Terminal states (Succeeded, Failed, Cancelled) cannot transition further.
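The transition rules above can be sketched as a small validity check. This models the documented lifecycle only; the real ToolCallStatus type lives in awaken-contract:

```rust
// Hypothetical model of the documented ToolCallStatus lifecycle.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolCallStatus {
    New,
    Running,
    Succeeded,
    Failed,
    Suspended,
    Resuming,
    Cancelled,
}

fn is_terminal(s: ToolCallStatus) -> bool {
    matches!(
        s,
        ToolCallStatus::Succeeded | ToolCallStatus::Failed | ToolCallStatus::Cancelled
    )
}

// A transition is valid only along the documented edges.
fn can_transition(from: ToolCallStatus, to: ToolCallStatus) -> bool {
    use ToolCallStatus::*;
    match (from, to) {
        (New, Running) => true,
        (Running, Succeeded) | (Running, Failed) | (Running, Suspended) | (Running, Cancelled) => true,
        (Suspended, Resuming) => true,
        (Resuming, Running) => true,
        _ => false,
    }
}

fn main() {
    assert!(can_transition(ToolCallStatus::New, ToolCallStatus::Running));
    assert!(can_transition(ToolCallStatus::Suspended, ToolCallStatus::Resuming));
    // Terminal states never transition further.
    assert!(!can_transition(ToolCallStatus::Succeeded, ToolCallStatus::Running));
    assert!(is_terminal(ToolCallStatus::Cancelled));
}
```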

See Plugin Internals for intercept priority details and the full phase convergence loop.

Scheduled Actions

Scheduled actions are the primary mechanism for plugins and tools to request side-effects during a phase execution cycle. Any hook, tool, or external module can schedule an action via StateCommand::schedule_action::<A>(payload). The runtime dispatches the action to its registered handler during the EXECUTE stage of the target phase.

How it works

Hook / Tool                    Runtime
    |                            |
    |-- StateCommand ----------->|  (contains scheduled_actions)
    |   schedule_action::<A>(p)  |
    |                            |-- commit state updates
    |                            |-- dispatch to handler(A, p)
    |                            |      |
    |                            |      |-- handler returns StateCommand
    |                            |      |   (may schedule more actions)
    |                            |<-----'
    |                            |-- commit handler results

Scheduling from a hook

use awaken_runtime::agent::state::ExcludeTool;

async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
    let mut cmd = StateCommand::new();
    cmd.schedule_action::<ExcludeTool>("dangerous_tool".into())?;
    Ok(cmd)
}

Scheduling from a tool

use awaken_runtime::agent::state::AddContextMessage;
use awaken_contract::contract::context_message::ContextMessage;

async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
    let mut cmd = StateCommand::new();
    cmd.schedule_action::<AddContextMessage>(
        ContextMessage::system("my_tool.hint", "Remember to check the docs."),
    )?;
    Ok(ToolOutput::with_command(
        ToolResult::success("my_tool", json!({"ok": true})),
        cmd,
    ))
}

Core Actions (awaken-runtime)

These are always available. Registered by the internal LoopActionHandlersPlugin.

AddContextMessage

Key: runtime.add_context_message
Phase: BeforeInference
Payload: ContextMessage
Import: awaken_runtime::agent::state::AddContextMessage

Injects a context message into the LLM conversation for the current step. Messages can be persistent (survive across steps), ephemeral (one-shot), or throttled (cooldown-based).

Used by: skills plugin (skill catalog), reminder plugin (rule-based hints), deferred-tools plugin (deferred tool list), custom hooks.

cmd.schedule_action::<AddContextMessage>(
    ContextMessage::system_persistent("my_plugin.info", "Always verify inputs."),
)?;

SetInferenceOverride

Key: runtime.set_inference_override
Phase: BeforeInference
Payload: InferenceOverride
Import: awaken_runtime::agent::state::SetInferenceOverride

Overrides inference parameters (model, temperature, max_tokens, top_p) for the current step only. Multiple overrides are merged; last-writer-wins per field.

cmd.schedule_action::<SetInferenceOverride>(InferenceOverride {
    temperature: Some(0.0),  // force deterministic
    ..Default::default()
})?;
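Last-writer-wins per field means a later override only replaces the fields it actually sets. A standalone sketch of that merge (the field names mirror the documented InferenceOverride, but the struct here is illustrative):

```rust
// Illustrative stand-in for InferenceOverride: every field is optional.
#[derive(Clone, Default, Debug, PartialEq)]
struct Override {
    model: Option<String>,
    temperature: Option<f64>,
    max_tokens: Option<u32>,
    top_p: Option<f64>,
}

// Merge per field: a later Some(_) wins, a later None leaves the earlier value.
fn merge(base: Override, later: Override) -> Override {
    Override {
        model: later.model.or(base.model),
        temperature: later.temperature.or(base.temperature),
        max_tokens: later.max_tokens.or(base.max_tokens),
        top_p: later.top_p.or(base.top_p),
    }
}

fn main() {
    let a = Override { temperature: Some(0.7), max_tokens: Some(1024), ..Default::default() };
    let b = Override { temperature: Some(0.0), ..Default::default() };
    let merged = merge(a, b);
    assert_eq!(merged.temperature, Some(0.0)); // later writer wins
    assert_eq!(merged.max_tokens, Some(1024)); // untouched field survives
}
```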

ExcludeTool

Key: runtime.exclude_tool
Phase: BeforeInference
Payload: String (tool ID)
Import: awaken_runtime::agent::state::ExcludeTool

Removes a tool from the set offered to the LLM for the current step. Multiple exclusions are additive.

Used by: permission plugin (unconditionally denied tools), deferred-tools plugin (deferred tools replaced by ToolSearch).

cmd.schedule_action::<ExcludeTool>("rm".into())?;

IncludeOnlyTools

Key: runtime.include_only_tools
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_runtime::agent::state::IncludeOnlyTools

Restricts the tool set to only the listed IDs. Multiple IncludeOnlyTools actions are unioned. Combined with ExcludeTool, exclusions are applied after the include-only filter.

cmd.schedule_action::<IncludeOnlyTools>(vec!["search".into(), "calculator".into()])?;
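The combination rule (include-only lists unioned, exclusions applied afterwards) can be sketched with plain sets. The names here are illustrative, not the runtime's internals:

```rust
use std::collections::HashSet;

// Model: union all IncludeOnlyTools lists, then apply ExcludeTool entries on top.
fn effective_tools(
    all_tools: &[&str],
    include_only: &[Vec<&str>], // each entry is one IncludeOnlyTools action
    excluded: &[&str],          // accumulated ExcludeTool actions
) -> Vec<String> {
    let allowed: Option<HashSet<&str>> = if include_only.is_empty() {
        None // no include-only filter scheduled: all tools pass
    } else {
        Some(include_only.iter().flatten().copied().collect())
    };
    let excluded: HashSet<&str> = excluded.iter().copied().collect();
    all_tools
        .iter()
        .filter(|t| allowed.as_ref().map_or(true, |a| a.contains(*t)))
        .filter(|t| !excluded.contains(*t))
        .map(|t| t.to_string())
        .collect()
}

fn main() {
    let all = ["search", "calculator", "rm", "browser"];
    // Two include-only actions are unioned; "rm" is then excluded on top.
    let tools = effective_tools(&all, &[vec!["search", "rm"], vec!["calculator"]], &["rm"]);
    assert_eq!(tools, vec!["search", "calculator"]);
}
```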

ToolInterceptAction

Key: tool_intercept
Phase: BeforeToolExecute
Payload: ToolInterceptPayload
Import: awaken_contract::contract::tool_intercept::ToolInterceptAction

Intercepts a tool call before execution. Three outcomes:

| Variant | Effect |
|---|---|
| Block { reason } | Tool execution blocked, run terminates with the reason. |
| Suspend(SuspendTicket) | Tool execution deferred; run pauses awaiting external decision (HITL). |
| SetResult(ToolResult) | Tool execution short-circuited with a pre-built result. |

When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.

Used by: permission plugin (ask/deny policies).

use awaken_contract::contract::tool_intercept::{ToolInterceptAction, ToolInterceptPayload};

cmd.schedule_action::<ToolInterceptAction>(
    ToolInterceptPayload::Block { reason: "Tool is disabled".into() },
)?;

Deferred-Tools Actions (awaken-ext-deferred-tools)

Available when the deferred-tools plugin is active.

DeferToolAction

Key: deferred_tools.defer
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_ext_deferred_tools::state::DeferToolAction

Moves tools from eager to deferred mode. Deferred tools are excluded from the LLM tool set and made available via ToolSearch instead, reducing prompt token usage.

The handler updates DeferralState, setting each tool’s mode to Deferred.

cmd.schedule_action::<DeferToolAction>(vec!["rarely_used_tool".into()])?;

PromoteToolAction

Key: deferred_tools.promote
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_ext_deferred_tools::state::PromoteToolAction

Moves tools from deferred to eager mode. Promoted tools are included in the LLM tool set for subsequent steps.

The handler updates DeferralState, setting each tool’s mode to Eager.

Typically triggered automatically when ToolSearch returns results, but can be scheduled manually by any plugin or tool.

cmd.schedule_action::<PromoteToolAction>(vec!["needed_tool".into()])?;

Plugin Action Usage Matrix

Which plugins schedule which actions:

| Plugin | AddContext | SetOverride | Exclude | IncludeOnly | Intercept | Defer | Promote |
|---|---|---|---|---|---|---|---|
| permission | | | X | | X | | |
| skills | X | | | | | | |
| reminder | X | | | | | | |
| deferred-tools | X | | X | | | X | X |
| observability | | | | | | | |
| mcp | | | | | | | |
| generative-ui | | | | | | | |

Defining Custom Actions

Plugins can define their own actions by implementing ScheduledActionSpec and registering a handler via PluginRegistrar::register_scheduled_action.

use awaken_contract::model::{Phase, ScheduledActionSpec};

pub struct MyCustomAction;

impl ScheduledActionSpec for MyCustomAction {
    const KEY: &'static str = "my_plugin.custom_action";
    const PHASE: Phase = Phase::BeforeInference;
    type Payload = MyPayload;
}

Then in your plugin’s register():

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_scheduled_action::<MyCustomAction, _>(MyHandler)?;
    Ok(())
}

Other plugins and tools can then schedule your action:

cmd.schedule_action::<MyCustomAction>(my_payload)?;

Convergence and cascading

Scheduled actions execute within the phase convergence loop. After each round of action dispatch, the runtime checks whether new actions were produced. If so, the loop repeats to process them.

How the loop works

Phase EXECUTE stage:
  round 1: dispatch queued actions -> handlers return StateCommands
           commit state, collect newly scheduled actions
  round 2: dispatch new actions    -> handlers may schedule more
           ...
  round N: no new actions          -> phase converges, loop exits

An action handler can schedule new actions for the same phase, which causes another round. This enables cascading behaviors (e.g., a handler adds a context message, which triggers a filter action from another plugin).

Limits

The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If actions are still being produced after 16 rounds, the runtime returns a StateError::PhaseRunLoopExceeded error with the phase name and round count. This prevents infinite loops from misconfigured or recursive handlers.
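The bounded convergence loop can be modeled in a few lines. This is a self-contained sketch, not the runtime's code; `run_rounds`, `LoopError`, and the closure-based handler are illustrative stand-ins for the action-dispatch machinery:

```rust
// Hypothetical sketch of a bounded convergence loop: each round drains the
// queue, handlers may enqueue follow-up actions, and exceeding the round
// limit is an error (mirroring StateError::PhaseRunLoopExceeded).

#[derive(Debug, PartialEq)]
pub enum LoopError {
    RoundsExceeded { rounds: usize },
}

/// Dispatch actions round by round. `handle` returns follow-up actions.
pub fn run_rounds<A>(
    mut queue: Vec<A>,
    max_rounds: usize,
    mut handle: impl FnMut(A) -> Vec<A>,
) -> Result<usize, LoopError> {
    let mut rounds = 0;
    while !queue.is_empty() {
        if rounds == max_rounds {
            return Err(LoopError::RoundsExceeded { rounds });
        }
        rounds += 1;
        // Drain this round's queue; collect newly scheduled actions.
        let mut next = Vec::new();
        for action in queue.drain(..) {
            next.extend(handle(action));
        }
        queue = next;
    }
    Ok(rounds)
}

fn main() {
    // A cascade of depth 3 converges in 3 rounds.
    let rounds = run_rounds(vec![3u32], 16, |n| if n > 1 { vec![n - 1] } else { vec![] });
    println!("{rounds:?}");
}
```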

Failed actions

When an action handler returns an error, the action is not retried. Instead, it is recorded in the FailedScheduledActions state key, which holds a list of FailedScheduledAction entries (action key, payload, and error message). Plugins or tests can inspect this key to detect handler failures.

let failed = store.read::<FailedScheduledActions>().unwrap_or_default();
assert!(failed.is_empty(), "expected no failed actions");

See Plugin Internals for the full convergence loop description and phase execution model.

Effects

Effects are typed, fire-and-forget side-effect events. Unlike scheduled actions (which execute within a phase convergence loop and can cascade), effects are dispatched after commit and are terminal – handlers cannot produce new StateCommands, actions, or effects.

Typical use cases: audit logging, external webhook calls, metric emission, notification delivery.

EffectSpec trait

Every effect type implements EffectSpec:

pub trait EffectSpec: 'static + Send + Sync {
    /// Unique string identifier for this effect kind.
    const KEY: &'static str;

    /// The payload carried by the effect.
    /// Must be serializable so the runtime can store it as JSON internally.
    type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}

Crate path: awaken::model::EffectSpec (re-exported from awaken-contract)

KEY must be globally unique across all registered effects. Convention: "<plugin>.<effect_name>", e.g. "audit.record".

Emitting effects

Call StateCommand::emit::<E>(payload) from any hook or action handler:

use awaken::{StateCommand, StateError};

async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
    let mut cmd = StateCommand::new();
    cmd.emit::<AuditEffect>(AuditPayload {
        action: "user_login".into(),
        actor: "agent-1".into(),
    })?;
    Ok(cmd)
}

Effects are collected in StateCommand::effects and dispatched only after the command’s state mutations are committed. Tools can also emit effects by including them in the StateCommand returned alongside a ToolResult.

TypedEffect wrapper

TypedEffect is the runtime’s type-erased envelope for effects:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct TypedEffect {
    pub key: String,
    pub payload: JsonValue,
}

Two methods bridge between the typed and erased worlds:

  • TypedEffect::from_spec::<E>(payload) – serializes a typed payload into a TypedEffect. Called internally by StateCommand::emit.
  • TypedEffect::decode::<E>() – deserializes the JSON payload back into the concrete E::Payload type.

You rarely need to use TypedEffect directly; StateCommand::emit and the handler trait handle serialization transparently.

Registering effect handlers

Effect handlers implement TypedEffectHandler<E>:

#[async_trait]
pub trait TypedEffectHandler<E>: Send + Sync + 'static
where
    E: EffectSpec,
{
    async fn handle_typed(
        &self,
        payload: E::Payload,
        snapshot: &Snapshot,
    ) -> Result<(), String>;
}

Key points:

  • The handler receives the post-commit Snapshot, so it sees the state that includes the mutations from the command that emitted the effect.
  • The return type is Result<(), String> – not Result<(), StateError>. Handlers report errors as plain strings; the runtime logs them but does not propagate them.

Register a handler in your plugin’s register() method:

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
    Ok(())
}

Duplicate registrations (same E::KEY) produce StateError::EffectHandlerAlreadyRegistered.

Dispatch lifecycle

  1. Collect – A hook, action handler, or tool calls cmd.emit::<E>(payload). The TypedEffect is appended to StateCommand::effects.

  2. Validate – When submit_command processes the command, every effect key is checked against the registered handlers. If any key has no registered handler, the command is rejected with StateError::UnknownEffectHandler before any state is committed. This is a fail-fast guarantee.

  3. Commit – State mutations (MutationBatch) are committed to the store.

  4. Dispatch – After a successful commit, each effect is dispatched to its handler via handle_typed(payload, snapshot). The snapshot reflects post-commit state.

  5. Error handling – Handler failures are logged and counted in EffectDispatchReport but do not roll back the commit or block subsequent effects. The runtime continues dispatching remaining effects.

Hook / Tool                         Runtime
    |                                 |
    |-- StateCommand (with effects) ->|
    |                                 |-- validate all effect keys
    |                                 |   (fail-fast if unknown)
    |                                 |-- commit state mutations
    |                                 |-- dispatch effects sequentially
    |                                 |      handler(payload, snapshot)
    |                                 |-- return SubmitCommandReport
    |<--------------------------------|
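The fail-fast ordering can be sketched in miniature. Everything here (`Command`, `submit`, the string-keyed store) is hypothetical; the sketch only illustrates that validation rejects the whole command before any mutation lands:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical model of the dispatch lifecycle: validate every effect key
// first (fail fast), commit mutations, then dispatch effects sequentially.
// Names (`Command`, `submit`) are illustrative, not the Awaken API.

pub struct Command {
    pub mutations: Vec<(String, String)>,
    pub effect_keys: Vec<String>,
}

pub fn submit(
    store: &mut HashMap<String, String>,
    handlers: &HashSet<String>,
    cmd: Command,
) -> Result<usize, String> {
    // 1. Validate: reject before any state is committed.
    for key in &cmd.effect_keys {
        if !handlers.contains(key) {
            return Err(format!("unknown effect handler: {key}"));
        }
    }
    // 2. Commit all mutations.
    for (k, v) in cmd.mutations {
        store.insert(k, v);
    }
    // 3. Dispatch: here we just report how many effects were delivered.
    Ok(cmd.effect_keys.len())
}

fn main() {
    let mut store = HashMap::new();
    let handlers: HashSet<String> = ["audit.record".to_string()].into();
    let cmd = Command {
        mutations: vec![("k".into(), "v".into())],
        effect_keys: vec!["audit.record".into()],
    };
    println!("{:?}", submit(&mut store, &handlers, cmd));
}
```

Note that a rejected command leaves the store untouched, matching the fail-fast guarantee described in step 2 above.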

Worked example

Define an effect:

use awaken::EffectSpec;
use serde::{Deserialize, Serialize};

/// Payload for audit log entries.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditPayload {
    pub action: String,
    pub actor: String,
}

/// Effect spec for audit logging.
pub struct AuditEffect;

impl EffectSpec for AuditEffect {
    const KEY: &'static str = "audit.record";
    type Payload = AuditPayload;
}

Emit it from a phase hook:

use async_trait::async_trait;
use awaken::{PhaseContext, PhaseHook, StateCommand, StateError};

pub struct AuditHook;

#[async_trait]
impl PhaseHook for AuditHook {
    async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
        let mut cmd = StateCommand::new();
        cmd.emit::<AuditEffect>(AuditPayload {
            action: "phase_entered".into(),
            actor: "system".into(),
        })?;
        Ok(cmd)
    }
}

Handle the effect:

use async_trait::async_trait;
use awaken::{Snapshot, TypedEffectHandler};

pub struct AuditEffectHandler;

#[async_trait]
impl TypedEffectHandler<AuditEffect> for AuditEffectHandler {
    async fn handle_typed(
        &self,
        payload: AuditPayload,
        _snapshot: &Snapshot,
    ) -> Result<(), String> {
        tracing::info!(
            action = %payload.action,
            actor = %payload.actor,
            "audit effect dispatched"
        );
        // In production: write to external audit store, send webhook, etc.
        Ok(())
    }
}

Wire it all together in a plugin:

use awaken::{Plugin, PluginDescriptor, PluginRegistrar, StateError};
use awaken::model::Phase;

pub struct AuditPlugin;

impl Plugin for AuditPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor::new("audit", "Audit logging via effects")
    }

    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
        r.register_phase_hook("audit", Phase::RunStart, AuditHook)?;
        Ok(())
    }
}

Effects vs Scheduled Actions

| | Effects | Scheduled Actions |
|---|---|---|
| Timing | Post-commit | Within phase convergence loop |
| Can cascade | No | Yes (handlers return StateCommand) |
| Can produce StateCommand | No | Yes |
| Failure handling | Logged, non-blocking | Error propagated to caller |
| State visibility | Post-commit snapshot | Pre-commit context |
| Use case | External I/O, logging, metrics | Internal control flow, state manipulation |

Choose effects when you need to trigger external side-effects that should not influence the agent’s state convergence. Choose scheduled actions when the handler needs to mutate state or schedule further work.

Events

The agent loop emits AgentEvent values as it executes. Events are streamed to clients via SSE and consumed by protocol encoders.

AgentEvent

All variants are tagged with event_type in their JSON serialization (#[serde(tag = "event_type", rename_all = "snake_case")]).

pub enum AgentEvent {
    RunStart {
        thread_id: String,
        run_id: String,
        parent_run_id: Option<String>,    // omitted when None
    },

    RunFinish {
        thread_id: String,
        run_id: String,
        result: Option<Value>,            // omitted when None
        termination: TerminationReason,
    },

    TextDelta { delta: String },

    ReasoningDelta { delta: String },

    ReasoningEncryptedValue { encrypted_value: String },

    ToolCallStart { id: String, name: String },

    ToolCallDelta { id: String, args_delta: String },

    ToolCallReady {
        id: String,
        name: String,
        arguments: Value,
    },

    ToolCallDone {
        id: String,
        message_id: String,
        result: ToolResult,
        outcome: ToolCallOutcome,
    },

    ToolCallStreamDelta {
        id: String,
        name: String,
        delta: String,
    },

    ToolCallResumed { target_id: String, result: Value },

    MessagesSnapshot { messages: Vec<Value> },

    ActivitySnapshot {
        message_id: String,
        activity_type: String,
        content: Value,
        replace: Option<bool>,            // omitted when None
    },

    ActivityDelta {
        message_id: String,
        activity_type: String,
        patch: Vec<Value>,
    },

    StepStart { message_id: String },

    StepEnd,

    InferenceComplete {
        model: String,
        usage: Option<TokenUsage>,        // omitted when None
        duration_ms: u64,
    },

    StateSnapshot { snapshot: Value },

    StateDelta { delta: Vec<Value> },

    Error {
        message: String,
        code: Option<String>,             // omitted when None
    },
}

Crate path: awaken::contract::event::AgentEvent
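Given the serde attributes above, a TextDelta event would serialize along these lines (illustrative values):

```json
{ "event_type": "text_delta", "delta": "Hello" }
```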

Helper

impl AgentEvent {
    /// Extract the response text from a RunFinish result value.
    pub fn extract_response(result: &Option<Value>) -> String
}

StreamEvent

Wire-format envelope that wraps an AgentEvent with sequencing metadata. Sent over SSE as JSON.

pub struct StreamEvent {
    /// Monotonically increasing sequence number within a run.
    pub seq: u64,
    /// ISO 8601 timestamp.
    pub timestamp: String,
    /// The wrapped agent event (flattened via #[serde(flatten)]).
    pub event: AgentEvent,
}

Constructor

fn new(seq: u64, timestamp: impl Into<String>, event: AgentEvent) -> Self

RunInput

Input to start or resume a run.

#[serde(tag = "type", rename_all = "snake_case")]
pub enum RunInput {
    /// A new user message to process.
    UserMessage { text: String },
    /// Resume a suspended run with a decision.
    ResumeDecision {
        tool_call_id: String,
        action: ResumeDecisionAction,
        payload: Value,       // omitted when null
    },
}

RunOutput

Type alias for the event stream returned by a run:

pub type RunOutput = futures::stream::BoxStream<'static, AgentEvent>;

TerminationReason

Why a run terminated. Serialized as { "type": "...", "value": ... }.

pub struct StoppedReason { pub code: String, pub detail: Option<String> }

pub enum TerminationReason {
    NaturalEnd,
    BehaviorRequested,
    Stopped(StoppedReason),
    Cancelled,
    Blocked(String),
    Suspended,
    Error(String),
}

ToolCallOutcome

pub enum ToolCallOutcome {
    Succeeded,
    Failed,
    Suspended,
}

TokenUsage

pub struct TokenUsage {
    pub prompt_tokens: Option<i32>,
    pub completion_tokens: Option<i32>,
    pub total_tokens: Option<i32>,
    pub cache_read_tokens: Option<i32>,
    pub cache_creation_tokens: Option<i32>,
    pub thinking_tokens: Option<i32>,
}

All fields are omitted from JSON when None. TokenUsage::default() produces all None values.

HTTP API

The awaken-server crate (feature flag server) exposes an HTTP API via Axum. Most responses are JSON. Streaming endpoints use Server-Sent Events (SSE).

This page mirrors the current route tree in crates/awaken-server/src/routes.rs and crates/awaken-server/src/config_routes.rs.

Health and metrics

| Method | Path | Description |
|---|---|---|
| GET | /health | Readiness probe. Checks store connectivity and returns 200 or 503 |
| GET | /health/live | Liveness probe. Always returns 200 OK |
| GET | /metrics | Prometheus scrape endpoint |

Threads

| Method | Path | Description |
|---|---|---|
| GET | /v1/threads | List thread IDs |
| POST | /v1/threads | Create a thread. Body: { "title": "..." } |
| GET | /v1/threads/summaries | List thread summaries |
| GET | /v1/threads/:id | Get a thread by ID |
| PATCH | /v1/threads/:id | Update thread metadata |
| DELETE | /v1/threads/:id | Delete a thread |
| POST | /v1/threads/:id/cancel | Cancel a specific queued or running job addressed by this thread ID. Returns cancel_requested |
| POST | /v1/threads/:id/decision | Submit a HITL decision for a waiting run on this thread |
| POST | /v1/threads/:id/interrupt | Interrupt the thread: bumps the thread generation, supersedes all pending queued jobs, and cancels the active run. Returns interrupt_requested with superseded_jobs count. Unlike /cancel, this performs a clean-slate interrupt via mailbox.interrupt() |
| PATCH | /v1/threads/:id/metadata | Alias for thread metadata updates |
| GET | /v1/threads/:id/messages | List thread messages |
| POST | /v1/threads/:id/messages | Submit messages as a background run on this thread |
| POST | /v1/threads/:id/mailbox | Push a message payload to the thread mailbox |
| GET | /v1/threads/:id/mailbox | List mailbox jobs for the thread |
| GET | /v1/threads/:id/runs | List runs for the thread |
| GET | /v1/threads/:id/runs/latest | Get the latest run for the thread |

Runs

| Method | Path | Description |
|---|---|---|
| GET | /v1/runs | List runs |
| POST | /v1/runs | Start a run and stream events over SSE |
| GET | /v1/runs/:id | Get a run record |
| POST | /v1/runs/:id/inputs | Submit follow-up input messages as a background run on the same thread |
| POST | /v1/runs/:id/cancel | Cancel a run by run ID |
| POST | /v1/runs/:id/decision | Submit a HITL decision by run ID |

Config and capabilities

These endpoints are exposed by config_routes(). Read and schema routes require AppState to be constructed with a config store. Mutation routes additionally require a config runtime manager so writes can validate and publish a new registry snapshot. Without the required config wiring, these routes return 400 with the message "config management API not enabled".

| Method | Path | Description |
|---|---|---|
| GET | /v1/capabilities | List registered agents, tools, plugins, models, providers, and config namespaces |
| GET | /v1/config/:namespace | List entries in a config namespace |
| POST | /v1/config/:namespace | Create an entry; the body must contain "id" |
| GET | /v1/config/:namespace/:id | Get one config entry |
| PUT | /v1/config/:namespace/:id | Replace a config entry |
| DELETE | /v1/config/:namespace/:id | Delete a config entry |
| GET | /v1/config/:namespace/$schema | Return the JSON Schema for a namespace |
| GET | /v1/agents | Convenience alias for /v1/config/agents |
| GET | /v1/agents/:id | Convenience alias for /v1/config/agents/:id |

Current built-in namespaces:

  • agents
  • models
  • providers
  • mcp-servers

AI SDK v6 routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/ai-sdk/chat | Start a chat run and stream protocol-encoded events |
| POST | /v1/ai-sdk/threads/:thread_id/runs | Start a thread-scoped AI SDK run |
| POST | /v1/ai-sdk/agents/:agent_id/runs | Start an agent-scoped AI SDK run |
| GET | /v1/ai-sdk/chat/:thread_id/stream | Resume an SSE stream by thread ID |
| GET | /v1/ai-sdk/threads/:thread_id/stream | Alias for stream resume by thread ID |
| GET | /v1/ai-sdk/threads/:thread_id/messages | List thread messages |
| POST | /v1/ai-sdk/threads/:thread_id/cancel | Cancel the active or queued run on a thread |
| POST | /v1/ai-sdk/threads/:thread_id/interrupt | Interrupt a thread (bump generation, supersede pending jobs, cancel active run) |

AG-UI routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/ag-ui/run | Start an AG-UI run and stream AG-UI events |
| POST | /v1/ag-ui/threads/:thread_id/runs | Start a thread-scoped AG-UI run |
| POST | /v1/ag-ui/agents/:agent_id/runs | Start an agent-scoped AG-UI run |
| POST | /v1/ag-ui/threads/:thread_id/interrupt | Interrupt a thread |
| GET | /v1/ag-ui/threads/:id/messages | List thread messages |

A2A routes

| Method | Path | Description |
|---|---|---|
| GET | /.well-known/agent-card.json | Get the public/default agent card |
| POST | /v1/a2a/message:send | Send a message to the public/default A2A agent |
| POST | /v1/a2a/message:stream | Streaming send over SSE |
| GET | /v1/a2a/tasks | List A2A tasks |
| GET | /v1/a2a/tasks/:task_id | Get task status |
| POST | /v1/a2a/tasks/:task_id:cancel | Cancel a task |
| POST | /v1/a2a/tasks/:task_id:subscribe | Subscribe to task updates over SSE |
| POST | /v1/a2a/tasks/:task_id/pushNotificationConfigs | Create a push notification config |
| GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs | List push notification configs |
| GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Get a push notification config |
| DELETE | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Delete a push notification config |
| GET | /v1/a2a/extendedAgentCard | Get the extended agent card; returns 501 unless enabled |
| POST | /v1/a2a/:tenant/message:send | Send a message to a tenant-scoped agent |
| POST | /v1/a2a/:tenant/message:stream | Tenant-scoped streaming send |
| GET | /v1/a2a/:tenant/tasks | List tasks for a tenant-scoped agent |
| GET | /v1/a2a/:tenant/tasks/:task_id | Get tenant-scoped task status |
| POST | /v1/a2a/:tenant/tasks/:task_id:cancel | Cancel a tenant-scoped task |
| POST | /v1/a2a/:tenant/tasks/:task_id:subscribe | Subscribe to tenant-scoped task updates |
| GET | /v1/a2a/:tenant/extendedAgentCard | Get the tenant-scoped extended agent card |

MCP HTTP routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/mcp | MCP JSON-RPC request/response endpoint |
| GET | /v1/mcp | Reserved for MCP server-initiated SSE; currently returns 405 |

Common query parameters

  • offset — number of items to skip
  • limit — maximum items to return, clamped to 1..=200
  • cursor — message-history pagination cursor; when provided it takes precedence over offset, and history responses return next_cursor
  • status — run filter: running, waiting, or done
  • visibility — message filter: omit for external-only, set to all to include internal messages
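The clamping and precedence rules above are easy to sketch. The helper names here are hypothetical; only the 1..=200 clamp and the cursor-over-offset precedence come from this page:

```rust
// Illustrative sketch of the documented pagination rules: `limit` is
// clamped to 1..=200, and `cursor` takes precedence over `offset`.

pub fn effective_limit(requested: Option<usize>, default: usize) -> usize {
    requested.unwrap_or(default).clamp(1, 200)
}

pub fn page_query(offset: usize, limit: usize, cursor: Option<&str>) -> String {
    match cursor {
        // Cursor pagination wins when a cursor is supplied.
        Some(c) => format!("?cursor={c}&limit={limit}"),
        None => format!("?offset={offset}&limit={limit}"),
    }
}

fn main() {
    // A request for limit=500 is clamped down to 200.
    println!("{}", page_query(40, effective_limit(Some(500), 50), None));
}
```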

Error format

Most route groups return:

{ "error": "human-readable message" }

MCP routes return JSON-RPC error objects instead of the generic shape above.

Config

AgentSpec

The serializable agent definition. Can be loaded from JSON/YAML or constructed programmatically via builder methods.

pub struct AgentSpec {
    pub id: String,
    pub model: String,
    pub system_prompt: String,
    pub max_rounds: usize,                          // default: 16
    pub max_continuation_retries: usize,            // default: 2
    pub context_policy: Option<ContextWindowPolicy>,
    pub plugin_ids: Vec<String>,
    pub active_hook_filter: HashSet<String>,
    pub allowed_tools: Option<Vec<String>>,
    pub excluded_tools: Option<Vec<String>>,
    pub endpoint: Option<RemoteEndpoint>,
    pub delegates: Vec<String>,
    pub sections: HashMap<String, Value>,
    pub registry: Option<String>,
}

Crate path: awaken::registry_spec::AgentSpec (re-exported at awaken::AgentSpec)

Builder methods

AgentSpec::new(id) -> Self
    .with_model(model) -> Self
    .with_system_prompt(prompt) -> Self
    .with_max_rounds(n) -> Self
    .with_hook_filter(plugin_id) -> Self
    .with_config::<K>(config) -> Result<Self, StateError>
    .with_delegate(agent_id) -> Self
    .with_endpoint(endpoint) -> Self
    .with_section(key, value: Value) -> Self

Typed config access

/// Read a typed plugin config section. Returns default if missing.
fn config<K: PluginConfigKey>(&self) -> Result<K::Config, StateError>

/// Set a typed plugin config section.
fn set_config<K: PluginConfigKey>(&mut self, config: K::Config) -> Result<(), StateError>

ContextWindowPolicy

Controls context window management and auto-compaction.

pub struct ContextWindowPolicy {
    pub max_context_tokens: usize,          // default: 200_000
    pub max_output_tokens: usize,           // default: 16_384
    pub min_recent_messages: usize,         // default: 10
    pub enable_prompt_cache: bool,          // default: true
    pub autocompact_threshold: Option<usize>,  // default: None
    pub compaction_mode: ContextCompactionMode, // default: KeepRecentRawSuffix
    pub compaction_raw_suffix_messages: usize,  // default: 2
}
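As an illustration of how autocompact_threshold plausibly gates compaction. The trigger condition here is an assumption about the policy's semantics, not the engine's actual code:

```rust
// Illustrative stand-in for ContextWindowPolicy (not awaken's type).
// Assumption: compaction triggers only when a threshold is configured
// and the current context token count exceeds it.

pub struct Policy {
    pub max_context_tokens: usize,
    pub autocompact_threshold: Option<usize>,
}

pub fn should_compact(policy: &Policy, used_tokens: usize) -> bool {
    match policy.autocompact_threshold {
        Some(threshold) => used_tokens > threshold,
        // No threshold configured: never auto-compact.
        None => false,
    }
}

fn main() {
    let policy = Policy { max_context_tokens: 200_000, autocompact_threshold: Some(150_000) };
    println!("{}", should_compact(&policy, 180_000));
}
```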

ContextCompactionMode

pub enum ContextCompactionMode {
    KeepRecentRawSuffix,       // Keep N recent messages raw, compact the rest
    CompactToSafeFrontier,     // Compact everything up to the safe frontier
}

InferenceOverride

Per-inference parameter override. All fields are Option; None means “use agent-level default”. Multiple plugins can emit overrides; fields merge with last-wins semantics.

pub struct InferenceOverride {
    pub model: Option<String>,
    pub fallback_models: Option<Vec<String>>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
    pub top_p: Option<f64>,
    pub reasoning_effort: Option<ReasoningEffort>,
}

Methods

fn is_empty(&self) -> bool
fn merge(&mut self, other: InferenceOverride)
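The field-wise last-wins merge can be modeled with a simplified stand-in struct (illustrative, not awaken's definition):

```rust
// Self-contained model of last-wins merge semantics: any field set in
// `other` replaces the corresponding field in `self`; a `None` field in
// `other` leaves the existing value alone. (Simplified stand-in for
// awaken's InferenceOverride, for illustration only.)

#[derive(Debug, Default, Clone, PartialEq)]
pub struct Override {
    pub model: Option<String>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
}

impl Override {
    pub fn is_empty(&self) -> bool {
        self.model.is_none() && self.temperature.is_none() && self.max_tokens.is_none()
    }

    pub fn merge(&mut self, other: Override) {
        if other.model.is_some() { self.model = other.model; }
        if other.temperature.is_some() { self.temperature = other.temperature; }
        if other.max_tokens.is_some() { self.max_tokens = other.max_tokens; }
    }
}

fn main() {
    let mut base = Override { model: Some("a".into()), temperature: Some(0.2), ..Default::default() };
    // A later override of temperature wins; the untouched model survives.
    base.merge(Override { temperature: Some(0.7), ..Default::default() });
    println!("{base:?}");
}
```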

ReasoningEffort

pub enum ReasoningEffort {
    None,
    Low,
    Medium,
    High,
    Max,
    Budget(u32),
}

PluginConfigKey trait

Binds a string key to a typed configuration struct at compile time.

pub trait PluginConfigKey: 'static + Send + Sync {
    const KEY: &'static str;
    type Config: Default + Clone + Serialize + DeserializeOwned
        + schemars::JsonSchema + Send + Sync + 'static;
}

Implementations register typed sections in AgentSpec::sections. Plugins read their configuration via agent_spec.config::<MyConfigKey>().

RemoteEndpoint

Configuration for agents running on external backends. Today Awaken ships the "a2a" backend; backend-specific settings live under options.

pub struct RemoteEndpoint {
    pub backend: String,
    pub base_url: String,
    pub auth: Option<RemoteAuth>,
    pub target: Option<String>,
    pub timeout_ms: u64,               // default: 300_000
    pub options: BTreeMap<String, Value>,
}

pub struct RemoteAuth {
    pub r#type: String,
    // backend-specific auth fields, e.g. { "token": "..." } for bearer
}

ServerConfig

HTTP server configuration. Used when the server feature is enabled.

pub struct ServerConfig {
    pub address: String,                   // default: "0.0.0.0:3000"
    pub sse_buffer_size: usize,            // default: 64
    pub replay_buffer_capacity: usize,     // default: 1024
    pub shutdown: ShutdownConfig,
    pub max_concurrent_requests: usize,    // default: 100
    pub a2a_extended_card_bearer_token: Option<String>,
}

pub struct ShutdownConfig {
    pub timeout_secs: u64,                 // default: 30
}

Crate path: awaken_server::app::ServerConfig

| Field | Type | Default | Description |
|---|---|---|---|
| address | String | "0.0.0.0:3000" | Socket address the server binds to |
| sse_buffer_size | usize | 64 | Maximum SSE channel buffer size per connection |
| replay_buffer_capacity | usize | 1024 | Maximum SSE frames buffered per run for reconnection replay |
| max_concurrent_requests | usize | 100 | Maximum in-flight requests; excess requests receive 503 |
| a2a_extended_card_bearer_token | Option<String> | None | Enables authenticated GET /v1/a2a/extendedAgentCard when set |
| shutdown.timeout_secs | u64 | 30 | Seconds to wait for in-flight requests to drain before force-exiting |

MailboxConfig

Configuration for the persistent run queue (mailbox). Controls lease timing, sweep/GC intervals, and retry behavior for failed jobs.

pub struct MailboxConfig {
    pub lease_ms: u64,                          // default: 30_000
    pub suspended_lease_ms: u64,                // default: 600_000
    pub lease_renewal_interval: Duration,       // default: 10s
    pub sweep_interval: Duration,               // default: 30s
    pub gc_interval: Duration,                  // default: 60s
    pub gc_ttl: Duration,                       // default: 24h
    pub default_max_attempts: u32,              // default: 5
    pub default_retry_delay_ms: u64,            // default: 250
    pub max_retry_delay_ms: u64,                // default: 30_000
}

Crate path: awaken_server::mailbox::MailboxConfig

| Field | Type | Default | Description |
|---|---|---|---|
| lease_ms | u64 | 30_000 | Lease duration in milliseconds for active runs |
| suspended_lease_ms | u64 | 600_000 | Lease duration in milliseconds for suspended runs awaiting human input |
| lease_renewal_interval | Duration | 10s | How often the worker renews its lease on a running job |
| sweep_interval | Duration | 30s | How often to scan for expired leases and reclaim orphaned jobs |
| gc_interval | Duration | 60s | How often to run garbage collection for terminal (completed/failed) jobs |
| gc_ttl | Duration | 24h | How long terminal jobs are retained before purging |
| default_max_attempts | u32 | 5 | Maximum delivery attempts before a job is dead-lettered |
| default_retry_delay_ms | u64 | 250 | Base retry delay in milliseconds between attempts |
| max_retry_delay_ms | u64 | 30_000 | Maximum retry delay in milliseconds for exponential backoff |

LlmRetryPolicy

Policy for retrying failed LLM inference calls with exponential backoff and optional model fallback. Can be set per-agent via the "retry" section in AgentSpec.

pub struct LlmRetryPolicy {
    pub max_retries: u32,              // default: 2
    pub fallback_models: Vec<String>,  // default: []
    pub backoff_base_ms: u64,          // default: 500
}

Crate path: awaken_runtime::engine::retry::LlmRetryPolicy

| Field | Type | Default | Description |
|---|---|---|---|
| max_retries | u32 | 2 | Maximum retry attempts after the initial call (0 = no retry) |
| fallback_models | Vec<String> | [] | Model names to try in order after the primary model exhausts retries |
| backoff_base_ms | u64 | 500 | Base delay in milliseconds for exponential backoff; actual delay = min(base * 2^attempt, 8000ms). Set to 0 to disable backoff |
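The documented formula is easy to check numerically. `backoff_delay_ms` is an illustrative helper, with the 8000ms cap taken from the description above:

```rust
// The documented backoff: delay = min(base * 2^attempt, 8000) ms,
// with a base of 0 disabling backoff entirely.

pub fn backoff_delay_ms(base_ms: u64, attempt: u32) -> u64 {
    if base_ms == 0 {
        return 0;
    }
    base_ms.saturating_mul(2u64.saturating_pow(attempt)).min(8_000)
}

fn main() {
    // With the default base of 500ms: 500, 1000, 2000, 4000, 8000 (capped).
    for attempt in 0..5 {
        println!("attempt {attempt}: {} ms", backoff_delay_ms(500, attempt));
    }
}
```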

AgentSpec integration

Register via the RetryConfigKey plugin config key ("retry" section):

use awaken_runtime::engine::retry::RetryConfigKey;

let spec = AgentSpec::new("my-agent")
    .with_config::<RetryConfigKey>(LlmRetryPolicy {
        max_retries: 3,
        fallback_models: vec!["claude-sonnet-4-20250514".into()],
        backoff_base_ms: 1000,
    })?;

CircuitBreakerConfig

Per-model circuit breaker configuration. Prevents cascading failures by short-circuiting requests to models that have experienced repeated consecutive failures. After a cooldown the circuit transitions to half-open, allowing limited probe requests before fully closing on success.

pub struct CircuitBreakerConfig {
    pub failure_threshold: u32,    // default: 5
    pub cooldown: Duration,        // default: 30s
    pub half_open_max: u32,        // default: 1
}

Crate path: awaken_runtime::engine::circuit_breaker::CircuitBreakerConfig

| Field | Type | Default | Description |
|---|---|---|---|
| failure_threshold | u32 | 5 | Consecutive failures before the circuit opens and rejects requests |
| cooldown | Duration | 30s | How long the circuit stays open before transitioning to half-open |
| half_open_max | u32 | 1 | Maximum probe requests allowed in the half-open state before the circuit reopens on failure or closes on success |
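A minimal sketch of the described state machine, assuming the transitions named above (the time-based Open -> HalfOpen cooldown transition is omitted, and all names are illustrative, not awaken's API):

```rust
// Sketch of the closed -> open -> half-open cycle: the circuit opens after
// `failure_threshold` consecutive failures; a half-open probe closes it on
// success and reopens it on failure.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Circuit {
    Closed { consecutive_failures: u32 },
    Open,
    HalfOpen,
}

pub fn on_failure(state: Circuit, failure_threshold: u32) -> Circuit {
    match state {
        Circuit::Closed { consecutive_failures } => {
            let n = consecutive_failures + 1;
            if n >= failure_threshold {
                Circuit::Open
            } else {
                Circuit::Closed { consecutive_failures: n }
            }
        }
        // A failed half-open probe reopens the circuit.
        Circuit::HalfOpen | Circuit::Open => Circuit::Open,
    }
}

pub fn on_success(state: Circuit) -> Circuit {
    match state {
        // A successful half-open probe fully closes the circuit.
        Circuit::HalfOpen => Circuit::Closed { consecutive_failures: 0 },
        Circuit::Open => Circuit::Open,
        // Any success resets the consecutive-failure count.
        Circuit::Closed { .. } => Circuit::Closed { consecutive_failures: 0 },
    }
}

fn main() {
    let mut state = Circuit::Closed { consecutive_failures: 0 };
    for _ in 0..5 {
        state = on_failure(state, 5);
    }
    println!("{state:?}");
}
```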

Feature flags and their effects

| Flag | Runtime behavior |
|---|---|
| permission | Registers the permission plugin; tools can be gated with HITL approval |
| observability | Registers the observability plugin; emits traces and metrics |
| mcp | Enables MCP tool bridge; tools from MCP servers are auto-registered |
| skills | Enables the skills subsystem for reusable agent capabilities |
| server | Builds the HTTP server with SSE streaming and protocol adapters |
| generative-ui | Enables generative UI component streaming to frontends |

Custom plugin configuration

Plugins declare typed configuration sections using the PluginConfigKey trait, which binds a string key to a Rust struct at compile time:

pub trait PluginConfigKey: 'static + Send + Sync {
    const KEY: &'static str;               // section name in AgentSpec.sections
    type Config: Default + Clone + Serialize + DeserializeOwned
        + schemars::JsonSchema + Send + Sync + 'static;
}

Declaring schemas for validation

Plugins override config_schemas() to return JSON Schemas generated from their config structs. The resolve pipeline (Stage 2) validates every AgentSpec.sections entry against these schemas before any hook runs.

fn config_schemas(&self) -> Vec<ConfigSchema> {
    vec![ConfigSchema {
        key: RateLimitConfigKey::KEY,
        json_schema: schemars::schema_for!(RateLimitConfig),
    }]
}

Reading config at runtime

Plugins read their typed config via agent_spec.config::<K>(). If the section is absent, the Default impl is returned.

let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;

Worked example

use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use awaken::PluginConfigKey;

#[derive(Debug, Clone, Default, Serialize, Deserialize, JsonSchema)]
pub struct RateLimitConfig {
    pub max_calls_per_step: u32,   // default: 0 (unlimited)
    pub cooldown_ms: u64,          // default: 0
}

pub struct RateLimitConfigKey;

impl PluginConfigKey for RateLimitConfigKey {
    const KEY: &'static str = "rate_limit";
    type Config = RateLimitConfig;
}

// In plugin register():
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_phase_hook("rate_limit", Phase::BeforeToolExecute, RateLimitHook)?;
    Ok(())
}

fn config_schemas(&self) -> Vec<ConfigSchema> {
    vec![ConfigSchema {
        key: RateLimitConfigKey::KEY,
        json_schema: schemars::schema_for!(RateLimitConfig),
    }]
}

// In a hook:
let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;
if cfg.max_calls_per_step > 0 { /* enforce limit */ }

Validation behavior

  • Section present but invalid: resolve fails with a schema validation error.
  • Section present but unclaimed: a warning is logged (possible typo or removed plugin).
  • Section absent: allowed; the plugin receives Config::default().

Errors

All error types use thiserror derives and implement std::error::Error + Display.

StateError

Errors from state management operations. Defined in awaken-contract.

pub enum StateError {
    RevisionConflict { expected: u64, actual: u64 },
    MutationBaseRevisionMismatch { left: u64, right: u64 },
    PluginAlreadyInstalled { name: String },
    PluginNotInstalled { type_name: &'static str },
    KeyAlreadyRegistered { key: String },
    UnknownKey { key: String },
    KeyDecode { key: String, message: String },
    KeyEncode { key: String, message: String },
    HandlerAlreadyRegistered { key: String },
    EffectHandlerAlreadyRegistered { key: String },
    PhaseRunLoopExceeded { phase: Phase, max_rounds: usize },
    UnknownScheduledActionHandler { key: String },
    UnknownEffectHandler { key: String },
    ParallelMergeConflict { key: String },
    ToolAlreadyRegistered { tool_id: String },
    Cancelled,
}

Crate path: awaken::StateError

StateError implements Clone and PartialEq.

ToolError

Errors returned from Tool::validate_args or Tool::execute. A ToolError aborts the tool call entirely (as opposed to ToolResult::error, which sends the failure back to the LLM).

pub enum ToolError {
    InvalidArguments(String),
    ExecutionFailed(String),
    Denied(String),
    NotFound(String),
    Internal(String),
}

Crate path: awaken::contract::tool::ToolError

BuildError

Errors from AgentRuntimeBuilder::build().

use awaken::StateError;

pub enum BuildError {
    State(StateError),
    AgentRegistryConflict(String),
    ToolRegistryConflict(String),
    ModelRegistryConflict(String),
    ProviderRegistryConflict(String),
    PluginRegistryConflict(String),
    ValidationFailed(String),
    DiscoveryFailed(DiscoveryError),     // requires feature "a2a"
}

Crate path: awaken::BuildError

BuildError converts from StateError via From.

RuntimeError

Errors from agent runtime operations (resolving agents, starting runs).

use awaken::StateError;

pub enum RuntimeError {
    State(StateError),
    ThreadAlreadyRunning { thread_id: String },
    AgentNotFound { agent_id: String },
    ResolveFailed { message: String },
}

Crate path: awaken::RuntimeError

RuntimeError converts from StateError via From. Implements Clone and PartialEq.

InferenceExecutionError

Errors from the LLM execution layer.

pub enum InferenceExecutionError {
    Provider(String),
    RateLimited(String),
    Timeout(String),
    Cancelled,
}

Crate path: awaken::contract::executor::InferenceExecutionError

StorageError

Errors returned by ThreadStore, RunStore, and ThreadRunStore operations.

pub enum StorageError {
    NotFound(String),
    AlreadyExists(String),
    VersionConflict { expected: u64, actual: u64 },
    Io(String),
    Serialization(String),
}

Crate path: awaken::contract::storage::StorageError

ResolveError

Errors from the agent resolution pipeline (resolving AgentSpec to a runnable ResolvedAgent).

use awaken::StateError;

pub enum ResolveError {
    AgentNotFound(String),
    ModelNotFound(String),
    ProviderNotFound(String),
    PluginNotFound(String),
    InvalidPluginConfig { plugin: String, key: String, message: String },
    RemoteAgentNotDirectlyRunnable(String),
    ToolIdConflict { tool_id: String, source_a: String, source_b: String },
    EnvBuild(StateError),
}

Crate path: awaken::registry::resolve::ResolveError

UnknownKeyPolicy

Controls behavior when encountering an unknown state key during deserialization.

pub enum UnknownKeyPolicy {
    Error,
    Skip,
}

Crate path: awaken::UnknownKeyPolicy
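A minimal sketch of how a decoder might honor the policy (the function name and key set are illustrative, not Awaken's actual decode path):

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum UnknownKeyPolicy {
    Error,
    Skip,
}

/// Filter serialized entries against the registered key set,
/// erroring or skipping unknown keys per the policy.
pub fn decode_keys(
    entries: &[(&str, &str)],
    registered: &[&str],
    policy: UnknownKeyPolicy,
) -> Result<Vec<String>, String> {
    let mut decoded = Vec::new();
    for (key, raw) in entries {
        if registered.contains(key) {
            decoded.push(format!("{key}={raw}"));
        } else if policy == UnknownKeyPolicy::Error {
            return Err(format!("unknown key: {key}"));
        }
        // UnknownKeyPolicy::Skip: silently drop the entry
    }
    Ok(decoded)
}
```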

AI SDK v6 Protocol

The AI SDK v6 adapter translates Awaken’s internal AgentEvent stream into the Vercel AI SDK v6 UI Message Stream format. This allows any AI SDK-compatible frontend (useChat, useAssistant) to consume agent output without custom parsing.

Endpoint

POST /v1/ai-sdk/chat

Request Body

{
  "messages": [{ "role": "user", "content": "Hello" }],
  "threadId": "optional-thread-id",
  "agentId": "optional-agent-id"
}

Field | Type | Required | Description
------|------|----------|------------
messages | AiSdkMessage[] | yes | Chat messages. Content may be a string or an array of content parts.
threadId | string | no | Existing thread to continue. Omit to create a new thread.
agentId | string | no | Target agent. Uses the default agent when omitted.

Response

SSE stream (text/event-stream). Each line is a JSON-encoded UIStreamEvent.

Auxiliary Routes

Route | Method | Description
------|--------|------------
/v1/ai-sdk/streams/:run_id | GET | Reconnect to an active run’s SSE stream.
/v1/ai-sdk/runs/:run_id/stream | GET | Alias for stream reconnect.
/v1/ai-sdk/threads/:id/messages | GET | Retrieve thread message history.

Event Mapping

The AiSdkEncoder is a stateful transcoder that converts AgentEvent variants into UIStreamEvent variants. It tracks open text blocks and reasoning blocks across tool-call boundaries.

AgentEvent | UIStreamEvent(s)
-----------|-----------------
RunStart | MessageStart + Data("run-info", ...)
TextDelta | TextStart (if block not open) + TextDelta
ReasoningDelta | ReasoningStart (if block not open) + ReasoningDelta
ReasoningEncryptedValue | ReasoningStart (if not open) + ReasoningDelta
ToolCallStart | Close open text/reasoning blocks, then ToolCallStart
ToolCallDelta | ToolCallDelta
ToolCallDone | ToolCallEnd
StepStart | (no direct mapping)
StepEnd | (no direct mapping)
InferenceComplete | Data("inference-complete", ...)
MessagesSnapshot | Data("messages-snapshot", ...)
StateSnapshot | Data("state-snapshot", ...)
StateDelta | Data("state-delta", ...)
ActivitySnapshot | Data("activity-snapshot", ...)
ActivityDelta | Data("activity-delta", ...)
RunFinish | Close open blocks, Data("finish", ...), Finish

UIStreamEvent Types

The wire format uses the type field as a discriminant, serialized in kebab-case:

  • start – message lifecycle start, carries optional messageId and messageMetadata
  • text-start, text-delta, text-end – text block lifecycle with content ID
  • reasoning-start, reasoning-delta, reasoning-end – reasoning block lifecycle
  • tool-call-start, tool-call-delta, tool-call-end – tool call lifecycle
  • data – arbitrary named data events (state snapshots, activity, inference metadata)
  • finish – terminal event with finish reason and usage summary

Text Block Lifecycle

The encoder automatically manages text block open/close boundaries:

  1. First TextDelta opens a text block (TextStart).
  2. Subsequent deltas append to the open block.
  3. When a ToolCallStart arrives, the encoder closes any open text or reasoning block before emitting tool events.
  4. After tool execution completes, new text deltas open a fresh block with an incremented ID.

This ensures the frontend receives well-formed block boundaries even though the runtime emits flat delta events.
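The open/close bookkeeping can be sketched as a small state machine; the struct below is illustrative, not the real AiSdkEncoder:

```rust
#[derive(Debug, PartialEq)]
pub enum Out {
    TextStart(u32),
    TextDelta(u32, String),
    TextEnd(u32),
    ToolCallStart,
}

#[derive(Default)]
pub struct BlockTracker {
    open: Option<u32>,
    next_id: u32,
}

impl BlockTracker {
    /// A flat text delta opens a block on first use.
    pub fn on_text_delta(&mut self, text: &str) -> Vec<Out> {
        let mut out = Vec::new();
        let id = match self.open {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.open = Some(id);
                out.push(Out::TextStart(id));
                id
            }
        };
        out.push(Out::TextDelta(id, text.to_string()));
        out
    }

    /// A tool call closes any open block before the tool event.
    pub fn on_tool_call_start(&mut self) -> Vec<Out> {
        let mut out = Vec::new();
        if let Some(id) = self.open.take() {
            out.push(Out::TextEnd(id));
        }
        out.push(Out::ToolCallStart);
        out
    }
}
```

After a tool call closes block 0, the next delta opens block 1, matching step 4 above.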

AG-UI Protocol

The AG-UI adapter translates Awaken’s AgentEvent stream into the AG-UI (CopilotKit) event format. This enables CopilotKit frontends to drive Awaken agents with no custom adapter code.

Endpoint

POST /v1/ag-ui/run

Request Body

{
  "threadId": "optional-thread-id",
  "agentId": "optional-agent-id",
  "messages": [{ "role": "user", "content": "Hello" }],
  "context": {}
}

Field | Type | Required | Description
------|------|----------|------------
messages | AgUiMessage[] | yes | Chat messages with role and content strings.
threadId | string | no | Existing thread. Omit to create a new thread.
agentId | string | no | Target agent. Uses the default agent when omitted.
context | object | no | CopilotKit context forwarding (reserved).

Response

SSE stream (text/event-stream). Each frame is a JSON-encoded AG-UI Event.

Auxiliary Routes

Route | Method | Description
------|--------|------------
/v1/ag-ui/threads/:id/messages | GET | Retrieve thread message history.

Event Mapping

The AgUiEncoder is a stateful transcoder that manages text message and step lifecycles.

AgentEvent | AG-UI Event(s)
-----------|---------------
RunStart | RUN_STARTED
TextDelta | TEXT_MESSAGE_START (if not open) + TEXT_MESSAGE_CONTENT
ReasoningDelta | REASONING_MESSAGE_START (if not open) + REASONING_MESSAGE_CONTENT
ToolCallStart | Close text/reasoning, STEP_STARTED, TOOL_CALL_START
ToolCallDelta | TOOL_CALL_ARGS
ToolCallDone | TOOL_CALL_END, STEP_FINISHED
StateSnapshot | STATE_SNAPSHOT
StateDelta | STATE_DELTA
RunFinish (success) | Close text/reasoning, RUN_FINISHED
RunFinish (error) | Close text/reasoning, RUN_ERROR

AG-UI Event Types

Events use an uppercase type discriminant:

  • RUN_STARTED / RUN_FINISHED / RUN_ERROR – run lifecycle
  • TEXT_MESSAGE_START / TEXT_MESSAGE_CONTENT / TEXT_MESSAGE_END – assistant text
  • REASONING_MESSAGE_START / REASONING_MESSAGE_CONTENT / REASONING_MESSAGE_END – reasoning trace
  • STEP_STARTED / STEP_FINISHED – step boundaries (wrapping tool calls)
  • TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END – tool call lifecycle
  • STATE_SNAPSHOT / STATE_DELTA – shared state synchronization
  • MESSAGES_SNAPSHOT – full thread message history

All events carry a BaseEvent with optional timestamp and rawEvent fields.

Roles

AG-UI messages use lowercase role strings: system, user, assistant, tool.

Text Message Lifecycle

  1. First TextDelta emits TEXT_MESSAGE_START followed by TEXT_MESSAGE_CONTENT.
  2. Subsequent deltas emit only TEXT_MESSAGE_CONTENT.
  3. A ToolCallStart or RunFinish closes the open message with TEXT_MESSAGE_END.

Reasoning messages follow the same pattern with REASONING_MESSAGE_* events.

A2A Protocol

The Agent-to-Agent (A2A) adapter implements the A2A protocol for remote agent discovery, task delegation, and inter-agent communication.

Feature gate: server

Endpoints

Route | Method | Description
------|--------|------------
/.well-known/agent-card.json | GET | Discovery endpoint for the public/default agent card.
/v1/a2a/message:send | POST | Send a message to the public/default A2A agent. Returns a task wrapper.
/v1/a2a/message:stream | POST | Streaming send over SSE.
/v1/a2a/tasks | GET | List A2A tasks.
/v1/a2a/tasks/:task_id | GET | Poll task status by ID.
/v1/a2a/tasks/:task_id:cancel | POST | Cancel a running task.
/v1/a2a/tasks/:task_id:subscribe | POST | Subscribe to task updates over SSE.
/v1/a2a/tasks/:task_id/pushNotificationConfigs | POST | Create a push notification config.
/v1/a2a/tasks/:task_id/pushNotificationConfigs | GET | List push notification configs.
/v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | GET / DELETE | Read or delete a push notification config.
/v1/a2a/extendedAgentCard | GET | Extended agent card. Returns 501 unless capabilities.extendedAgentCard=true.

Tenant-scoped variants mirror the same interface under /v1/a2a/:tenant/..., for example /v1/a2a/research/message:send and /v1/a2a/research/tasks/:task_id.

Agent Card

The discovery endpoint returns an AgentCard describing the exposed interface and capabilities:

{
  "name": "My Agent",
  "description": "A helpful assistant",
  "supportedInterfaces": [
    {
      "url": "https://example.com/v1/a2a",
      "protocolBinding": "HTTP+JSON",
      "protocolVersion": "1.0"
    }
  ],
  "version": "1.0.0",
  "capabilities": {
    "streaming": true,
    "pushNotifications": true,
    "stateTransitionHistory": false,
    "extendedAgentCard": false
  },
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["text/plain"],
  "skills": [
    {
      "id": "general",
      "name": "General Q&A",
      "description": "Answer general questions",
      "tags": ["qa"],
      "inputModes": ["text/plain"],
      "outputModes": ["text/plain"]
    }
  ]
}

Agent cards are derived from registered AgentSpec entries. The top-level legacy url/id fields are not emitted.

Message Send

{
  "message": {
    "taskId": "optional-client-provided-id",
    "contextId": "optional-client-provided-id",
    "messageId": "msg-123",
    "role": "ROLE_USER",
    "parts": [{ "text": "Summarize this document" }]
  },
  "configuration": {
    "returnImmediately": true
  }
}

The server maps A2A tasks to Awaken thread/mailbox execution. The response uses the v1 task wrapper shape:

{
  "task": {
    "id": "optional-client-provided-id",
    "contextId": "optional-client-provided-id",
    "status": {
      "state": "TASK_STATE_SUBMITTED"
    }
  }
}

If returnImmediately is omitted or false, the adapter waits for a terminal/interrupted task state before responding.

Task Status

GET /v1/a2a/tasks/:task_id returns a Task resource:

{
  "id": "abc-123",
  "contextId": "abc-123",
  "status": {
    "state": "TASK_STATE_COMPLETED",
    "message": {
      "messageId": "msg-response",
      "role": "ROLE_AGENT",
      "parts": [{ "text": "..." }]
    }
  },
  "history": []
}

Task states follow the v1 enum names such as TASK_STATE_SUBMITTED, TASK_STATE_WORKING, TASK_STATE_COMPLETED, TASK_STATE_FAILED, and TASK_STATE_CANCELED.

Optional capability defaults

Awaken currently enables these A2A capabilities by default:

  • streaming = true
  • pushNotifications = true

extendedAgentCard remains opt-in and is enabled only when ServerConfig.a2a_extended_card_bearer_token is configured. When disabled, the extended card endpoints return spec-shaped unsupported errors.

Remote Agent Delegation

Awaken agents can delegate to remote A2A agents via AgentTool::remote(). The A2aBackend sends a message:send request to the remote endpoint, reads the returned task.id, then polls /tasks/:task_id for completion. From the LLM’s perspective, this is a regular tool call — the A2A transport is transparent.
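The poll-until-terminal step can be sketched as follows; `fetch_state` stands in for the GET /tasks/:task_id call and is an assumption, not the real A2aBackend API:

```rust
// Terminal states follow the v1 enum names used by the A2A adapter.
pub fn is_terminal(state: &str) -> bool {
    matches!(
        state,
        "TASK_STATE_COMPLETED" | "TASK_STATE_FAILED" | "TASK_STATE_CANCELED"
    )
}

/// Poll until a terminal state or the attempt budget is exhausted.
pub fn poll_until_terminal(
    mut fetch_state: impl FnMut() -> String,
    max_attempts: usize,
) -> Result<String, String> {
    for _ in 0..max_attempts {
        let state = fetch_state();
        if is_terminal(&state) {
            return Ok(state);
        }
        // Real code would sleep for poll_interval_ms between attempts.
    }
    Err("timed out waiting for terminal task state".into())
}
```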

Configuration for remote agents is declared in AgentSpec. RemoteEndpoint is generic, and A2A uses backend: "a2a":

{
  "id": "remote-researcher",
  "endpoint": {
    "backend": "a2a",
    "base_url": "https://remote-agent.example.com/v1/a2a",
    "auth": { "type": "bearer", "token": "..." },
    "target": "researcher",
    "timeout_ms": 300000,
    "options": {
      "poll_interval_ms": 1000
    }
  }
}

Agents with an endpoint field are resolved as remote backend agents. Today the built-in delegate backend is A2A. Agents without endpoint run locally.

Cancellation

A cooperative cancellation token used to interrupt agent runs, streaming loops, and long-running operations.

CancellationToken

A cloneable handle backed by a shared AtomicBool and a tokio::sync::Notify. All clones share the same cancellation state – cancelling any clone cancels all of them.

use awaken::CancellationToken;

let token = CancellationToken::new();

Crate path: awaken_contract::cancellation::CancellationToken

Methods

/// Create a new uncancelled token.
pub fn new() -> Self

/// Signal cancellation. Wakes all async waiters immediately.
pub fn cancel(&self)

/// Check if cancellation has been requested (synchronous poll).
pub fn is_cancelled(&self) -> bool

/// Wait until cancellation is signalled. Resolves immediately if already cancelled.
pub async fn cancelled(&self)

Traits

  • Clone – clones share the same underlying state
  • Default – creates an uncancelled token (equivalent to new())

Synchronous polling

Use is_cancelled() to check cancellation from synchronous code or tight loops:

let token = CancellationToken::new();

while !token.is_cancelled() {
    // do work
}

Async waiting with tokio::select!

Use cancelled() in a tokio::select! branch to interrupt async operations without polling:

let token = CancellationToken::new();

tokio::select! {
    result = some_async_work() => {
        // work completed before cancellation
    }
    _ = token.cancelled() => {
        // cancellation was signalled
    }
}

The cancelled() future resolves immediately if the token is already cancelled, so there is no race between checking and waiting.

Cooperative semantics

Cancellation is cooperative. Calling cancel() sets a flag and wakes async waiters, but does not abort any running task. Code must check is_cancelled() or select! on cancelled() to observe and respond to cancellation.

Key properties:

  • Idempotent – calling cancel() multiple times is safe and has no additional effect.
  • Shared – all clones observe the same cancellation state. Cancelling from any clone is visible to all others.
  • Ordering – uses SeqCst ordering on the atomic flag, so cancellation is immediately visible across threads.
  • Immediate wake – cancel() calls Notify::notify_waiters(), waking all tasks blocked on cancelled() without waiting for the next poll.
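The shared-clone and idempotence properties can be sketched with std types alone (the real token additionally wraps a tokio::sync::Notify for async wakeups):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Illustrative sketch, not the real CancellationToken:
// all clones share one Arc'd flag.
#[derive(Clone, Default)]
pub struct TokenSketch {
    cancelled: Arc<AtomicBool>,
}

impl TokenSketch {
    pub fn cancel(&self) {
        // SeqCst matches the ordering guarantee described above.
        self.cancelled.store(true, Ordering::SeqCst);
    }

    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}
```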

Runtime usage

The runtime passes a CancellationToken into each agent run. It is used to:

  • Interrupt streaming inference mid-response when the caller requests cancellation.
  • Stop the agent loop between inference cycles.
  • Propagate cancellation from external transports (HTTP, SSE) into the runtime.

A typical streaming loop checks cancellation alongside each chunk:

let token = CancellationToken::new();
let clone = token.clone();

tokio::select! {
    _ = async {
        while let Some(chunk) = stream.next().await {
            // process chunk
        }
    } => {}
    _ = clone.cancelled() => {
        // stop processing, clean up
    }
}

Auto-cancellation on new message

When a new message is submitted to a thread that already has an active run (e.g. a run suspended while waiting for tool approval), the runtime automatically cancels the old run before starting the new one.

This prevents the ThreadAlreadyRunning error and avoids infinite retry loops. The sequence is:

  1. Mailbox::submit() detects an active run on the thread.
  2. Calls AgentRuntime::cancel_and_wait_by_thread() which signals the CancellationToken and waits (up to 5 seconds) for the RunSlotGuard to drop and free the thread slot.
  3. The old run emits RunFinish with TerminationReason::Cancelled.
  4. Before the new run starts inference, strip_unpaired_tool_calls() removes any orphaned tool calls from the message history that were left by the cancelled run (assistant messages with tool_calls that have no matching Tool role response).

This ensures clean handoff between runs without leaving dangling state that would confuse the LLM.
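Step 4 can be sketched over a simplified message shape (the Msg enum here is illustrative, not Awaken's message type):

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Msg {
    User(String),
    AssistantToolCall { call_id: String },
    ToolResponse { call_id: String },
}

/// Drop assistant tool calls that have no matching Tool-role response.
pub fn strip_unpaired_tool_calls(history: Vec<Msg>) -> Vec<Msg> {
    // Collect the call IDs that received a response.
    let answered: Vec<String> = history
        .iter()
        .filter_map(|m| match m {
            Msg::ToolResponse { call_id } => Some(call_id.clone()),
            _ => None,
        })
        .collect();
    // Keep everything except orphaned tool calls.
    history
        .into_iter()
        .filter(|m| match m {
            Msg::AssistantToolCall { call_id } => answered.contains(call_id),
            _ => true,
        })
        .collect()
}
```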

Key Files

  • crates/awaken-runtime/src/cancellation.rs – CancellationToken implementation
  • crates/awaken-runtime/src/runtime/agent_runtime/active_registry.rs – run tracking with completion notification
  • crates/awaken-runtime/src/runtime/agent_runtime/runner.rs – strip_unpaired_tool_calls()

Tool Execution Modes

ToolExecutionMode controls how the runtime executes tool calls that the LLM requests in a single inference step.

Crate path: awaken::contract::executor::ToolExecutionMode

ToolExecutionMode

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub enum ToolExecutionMode {
    #[default]
    Sequential,
    ParallelBatchApproval,
    ParallelStreaming,
}

The default is Sequential.

Modes

Sequential

Executes tool calls one at a time in the order the LLM returned them.

  • The state snapshot is refreshed between calls (requires_incremental_state returns true), so each tool sees the effects of the previous one.
  • Stops at the first suspension. If tool call 2 of 4 suspends, tool calls 3 and 4 are not executed. Failures do not stop execution.
  • Simplest mode. Use when tool calls have data dependencies or when ordering matters.
// SequentialToolExecutor runs calls in order, stopping at first suspension.
let executor = SequentialToolExecutor;
let results = executor.execute(&tools, &calls, &ctx).await?;

ParallelBatchApproval

Executes all tool calls concurrently. All tools see the same frozen state snapshot.

  • Suspension decisions are replayed using DecisionReplayPolicy::BatchAllSuspended: the runtime waits until every suspended call has a decision before replaying any of them.
  • Enforces parallel patch conflict checks (requires_conflict_check returns true).
  • Does not stop on suspension or failure; all calls run to completion.
  • Use when tool calls are independent and you want an all-or-nothing approval gate for HITL workflows.
let executor = ParallelToolExecutor::batch_approval();
// decision_replay_policy() == DecisionReplayPolicy::BatchAllSuspended

ParallelStreaming

Executes all tool calls concurrently with the same frozen state snapshot.

  • Suspension decisions are replayed using DecisionReplayPolicy::Immediate: each decision is replayed as soon as it arrives, without waiting for the others.
  • Enforces parallel patch conflict checks.
  • Does not stop on suspension or failure; all calls run to completion.
  • Use when tool calls are independent and you want the fastest end-to-end completion without batching approval.
let executor = ParallelToolExecutor::streaming();
// decision_replay_policy() == DecisionReplayPolicy::Immediate

Comparison

Behavior | Sequential | ParallelBatchApproval | ParallelStreaming
---------|------------|-----------------------|------------------
Execution order | One at a time | All concurrently | All concurrently
State freshness | Refreshed between calls | Frozen snapshot | Frozen snapshot
Stops on suspension | Yes (first suspension) | No | No
Stops on failure | No | No | No
Decision replay | N/A | Batch (all at once) | Immediate (one by one)
Conflict checks | No | Yes | Yes

Executor trait

Both SequentialToolExecutor and ParallelToolExecutor implement the ToolExecutor trait:

#[async_trait]
pub trait ToolExecutor: Send + Sync {
    /// Execute tool calls and return results.
    async fn execute(
        &self,
        tools: &HashMap<String, Arc<dyn Tool>>,
        calls: &[ToolCall],
        base_ctx: &ToolCallContext,
    ) -> Result<Vec<ToolExecutionResult>, ToolExecutorError>;

    /// Strategy name for logging.
    fn name(&self) -> &'static str;

    /// Whether the executor needs state refreshed between individual tool calls.
    fn requires_incremental_state(&self) -> bool { false }
}

The name() values are "sequential", "parallel_batch_approval", and "parallel_streaming".

DecisionReplayPolicy

Controls when resume decisions for suspended tool calls are replayed into the execution pipeline. Only relevant for parallel modes.

pub enum DecisionReplayPolicy {
    /// Replay each resolved suspended call as soon as its decision arrives.
    Immediate,
    /// Replay only when all currently suspended calls have decisions.
    BatchAllSuspended,
}
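The gating difference between the two policies can be sketched as a pure function over call IDs (illustrative only; the enum is restated locally so the sketch is self-contained):

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum DecisionReplayPolicy {
    Immediate,
    BatchAllSuspended,
}

/// Given suspended call IDs and the IDs that already have decisions,
/// which calls may be replayed now?
pub fn ready_to_replay<'a>(
    policy: DecisionReplayPolicy,
    suspended: &'a [&'a str],
    decided: &'a [&'a str],
) -> Vec<&'a str> {
    match policy {
        // Replay each decided call as soon as its decision arrives.
        DecisionReplayPolicy::Immediate => suspended
            .iter()
            .copied()
            .filter(|id| decided.contains(id))
            .collect(),
        // Hold everything until every suspended call has a decision.
        DecisionReplayPolicy::BatchAllSuspended => {
            if suspended.iter().all(|id| decided.contains(id)) {
                suspended.to_vec()
            } else {
                Vec::new()
            }
        }
    }
}
```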

Key Files

  • crates/awaken-contract/src/contract/executor.rs – ToolExecutionMode enum
  • crates/awaken-runtime/src/execution/executor.rs – SequentialToolExecutor, ParallelToolExecutor, ToolExecutor trait

Architecture

Awaken is organized around one runtime core plus three surrounding surfaces: contract types, server/storage adapters, and optional extensions. The important distinction is not just crate boundaries, but where decisions are made.

Application assembly
  register tools / models / providers / plugins / AgentSpec
        |
        v
AgentRuntime
  resolve AgentSpec -> ResolvedAgent
  build ExecutionEnv from plugins
  run the phase loop
  expose cancel / decision control for active runs
        |
        v
Server + storage surfaces
  HTTP routes, mailbox, SSE replay, protocol adapters,
  thread/run persistence, profile storage

Contract layer – awaken-contract defines the shared types used everywhere: AgentSpec, ModelSpec, ProviderSpec, Tool, AgentEvent, transport traits, and the typed state model. This is the vocabulary that the rest of the system speaks.

Runtime core – awaken-runtime is the orchestration layer. It resolves agent IDs to fully wired configurations (ResolvedAgent), builds an ExecutionEnv from plugins, manages active runs, and delegates execution to the loop runner plus phase engine.

Server and persistence surfaces – awaken-server turns the runtime into HTTP and SSE endpoints, mailbox-backed background execution, and protocol adapters. awaken-stores provides concrete persistence backends for threads and runs. awaken-ext-* crates extend the runtime at phase and tool boundaries without changing the core loop.

Request Sequence

The following diagram shows a representative request flowing through the system:

sequenceDiagram
    participant Client
    participant Server
    participant Runtime
    participant LLM
    participant Tool

    Client->>Server: POST /v1/ai-sdk/chat
    Server->>Runtime: RunRequest (agent_id, thread_id, messages)
    Runtime->>Runtime: Resolve agent (AgentSpec -> ResolvedAgent)
    Runtime->>Runtime: Load thread history
    Runtime->>Runtime: RunStart phase
    loop Step loop
        Runtime->>Runtime: StepStart phase
        Runtime->>Runtime: BeforeInference phase
        Runtime->>LLM: Inference request (messages + tools)
        LLM-->>Runtime: Response (text + tool_calls)
        Runtime->>Runtime: AfterInference phase
        opt Tool calls present
            Runtime->>Runtime: BeforeToolExecute phase
            Runtime->>Tool: execute(args, ctx)
            Tool-->>Runtime: ToolResult
            Runtime->>Runtime: AfterToolExecute phase
        end
        Runtime->>Runtime: StepEnd phase (checkpoint)
    end
    Runtime->>Runtime: RunEnd phase
    Runtime-->>Server: AgentEvent stream
    Server-->>Client: SSE events (protocol-specific encoding)

Phase-Driven Execution Loop

Every run proceeds through a fixed sequence of phases. Plugins register hooks that run at each phase boundary, giving them control over inference parameters, tool execution, state mutations, and termination logic.

RunStart -> [StepStart -> BeforeInference -> AfterInference
             -> BeforeToolExecute -> AfterToolExecute -> StepEnd]* -> RunEnd

The step loop repeats until one of these conditions fires:

  • The LLM returns a response with no tool calls (NaturalEnd).
  • A plugin or stop condition requests termination (Stopped, BehaviorRequested).
  • A tool call suspends waiting for external input (Suspended).
  • The run is cancelled externally (Cancelled).
  • An error occurs (Error).

At each phase boundary, the loop checks the cancellation token and the run lifecycle state before proceeding.
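The stop conditions above can be sketched as a pure decision function; the check ordering here is an assumption for illustration, not the runtime's exact precedence:

```rust
// Variant names follow the termination conditions listed above.
#[derive(Debug, PartialEq)]
pub enum TerminationReason {
    NaturalEnd,
    Stopped,
    Suspended,
    Cancelled,
}

/// Decide whether the step loop should terminate, and why.
pub fn check_termination(
    cancelled: bool,
    suspended: bool,
    stop_requested: bool,
    has_tool_calls: bool,
) -> Option<TerminationReason> {
    if cancelled {
        Some(TerminationReason::Cancelled)
    } else if suspended {
        Some(TerminationReason::Suspended)
    } else if stop_requested {
        Some(TerminationReason::Stopped)
    } else if !has_tool_calls {
        // A response with no tool calls ends the run naturally.
        Some(TerminationReason::NaturalEnd)
    } else {
        None // keep looping
    }
}
```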

Repository Map

awaken
├─ awaken-contract
│  ├─ registry specs
│  ├─ tool / executor / event / transport contracts
│  └─ state model
├─ awaken-runtime
│  ├─ builder + registries + resolve pipeline
│  ├─ AgentRuntime control plane
│  ├─ loop_runner + phase engine
│  ├─ execution / context / policies / profile
│  └─ runtime extensions (handoff, local A2A, background)
├─ awaken-server
│  ├─ routes + config API + mailbox + services
│  ├─ protocols: ai_sdk_v6 / ag_ui / a2a / mcp / acp-stdio
│  └─ transport: SSE relay / replay buffer / transcoder
├─ awaken-stores
└─ awaken-ext-*

Design Intent

Three principles guide the architecture:

Snapshot isolation – Phase hooks never see partially applied state. They read from an immutable snapshot and write to a MutationBatch. The batch is applied atomically after all hooks for a phase have converged. This eliminates data races between concurrent hooks and makes hook execution order irrelevant for correctness.
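A minimal sketch of this discipline, with a plain HashMap standing in for the typed state store (types are illustrative, not the real MutationBatch):

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct MutationBatchSketch {
    writes: Vec<(String, i64)>,
}

impl MutationBatchSketch {
    pub fn set(&mut self, key: &str, value: i64) {
        self.writes.push((key.to_string(), value));
    }
}

/// Run hooks against an immutable snapshot, then apply all writes atomically.
pub fn run_phase(
    state: &mut HashMap<String, i64>,
    hooks: &[fn(&HashMap<String, i64>, &mut MutationBatchSketch)],
) {
    let snapshot = state.clone(); // hooks never see partial writes
    let mut batch = MutationBatchSketch::default();
    for hook in hooks {
        hook(&snapshot, &mut batch);
    }
    for (key, value) in batch.writes {
        state.insert(key, value);
    }
}
```

Note that the second hook below reads the snapshot, not the first hook's pending write, which is what makes hook order irrelevant.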

Append-style persistence – Thread messages are append-only. State is checkpointed at step boundaries. This makes it possible to replay a run from any checkpoint and produces a deterministic audit trail.

Transport independence – The runtime emits AgentEvent values through an EventSink trait. Protocol adapters (AiSdkEncoder, AgUiEncoder) transcode these events into wire formats. The runtime has no knowledge of HTTP, SSE, or any specific protocol. Adding a new protocol means implementing a new encoder – the runtime does not change.

Agent Resolution

When runtime.run(request) is called, the agent_id in the request must be resolved into a fully wired ResolvedAgent – a struct that holds live references to an LLM executor, tools, plugins, and an execution environment. This resolution happens on every resolve() call; nothing is cached or shared between runs. This page describes the three-stage resolution pipeline and the builder that feeds it.

Pipeline Overview

Resolution is a pure function: (RegistrySet, agent_id) -> ResolvedAgent. It proceeds through three sequential stages:

flowchart LR
    subgraph Stage1["Stage 1: Lookup"]
        A1[Fetch AgentSpec] --> A2[Resolve ModelSpec]
        A2 --> A3[Get LlmExecutor]
        A3 --> A4{Retry config?}
        A4 -- yes --> A5[Wrap in RetryingExecutor]
        A4 -- no --> A6[Use executor directly]
    end

    subgraph Stage2["Stage 2: Plugin Pipeline"]
        B1[Resolve plugin IDs] --> B2[Inject default plugins]
        B2 --> B3{context_policy set?}
        B3 -- yes --> B4[Add CompactionPlugin + ContextTransformPlugin]
        B3 -- no --> B5[Skip]
        B4 --> B6[Validate config sections]
        B5 --> B6
        B6 --> B7[Build ExecutionEnv]
    end

    subgraph Stage3["Stage 3: Tool Pipeline"]
        C1[Collect global tools] --> C2[Merge delegate agent tools]
        C2 --> C3[Merge plugin-registered tools]
        C3 --> C4[Check for ID conflicts]
        C4 --> C5[Apply allowed/excluded filters]
    end

    Stage1 --> Stage2 --> Stage3

Any failure at any stage produces a ResolveError and aborts. The pipeline never returns a partial result.

Stage 1: Lookup

The first stage fetches the raw data from registries:

  1. AgentSpec – looked up from AgentSpecRegistry by agent_id. If the spec has an endpoint field (remote backend agent), resolution fails with RemoteAgentNotDirectlyRunnable – remote agents can only be used as delegates, not run directly.

  2. ModelSpec – the spec’s model field (a string ID like "gpt-4") is resolved through ModelRegistry into a ModelSpec, which maps it to a provider ID and an actual API model name (for example, provider "openai", model "gpt-4o").

  3. LlmExecutor – the provider ID from the model entry is resolved through ProviderRegistry to get a live LlmExecutor instance.

  4. Retry decoration – if the agent spec contains a RetryConfigKey section with max_retries > 0 or non-empty fallback_models, the executor is wrapped in a RetryingExecutor decorator.

Stage 2: Plugin Pipeline

The second stage assembles the plugin chain and builds the execution environment.

Plugin resolution

Plugins listed in AgentSpec.plugin_ids are resolved by ID from PluginSource. A missing plugin produces ResolveError::PluginNotFound.

Default plugin injection

After resolving user-declared plugins, the pipeline injects runtime-required default plugins. These are always present regardless of agent configuration:

  • LoopActionHandlersPlugin – registers the core action handlers that the runtime loop uses to process tool calls, emit events, and manage step transitions. Without this plugin, the loop cannot function.

  • MaxRoundsPlugin – enforces the max_rounds stop condition configured on the agent spec. Injected with the spec’s max_rounds value. Prevents runaway loops.

Conditional plugins

These plugins are added only when AgentSpec.context_policy is set:

  • CompactionPlugin – manages context window compaction (summarization of old messages when the context grows too large). Created with the CompactionConfigKey section from the spec, falling back to defaults if absent.

  • ContextTransformPlugin – applies context window policy transforms (token counting, truncation, prompt caching) before each inference request. Created with the context_policy value.

Building ExecutionEnv

After the plugin list is finalized, ExecutionEnv::from_plugins() calls each plugin’s register() method with a PluginRegistrar. Plugins use the registrar to declare:

  • Phase hooks (per-phase callbacks)
  • Scheduled action handlers
  • Effect handlers
  • Request transforms
  • State key registrations
  • Tools

The result is an ExecutionEnv – see ExecutionEnv below.

Config validation

Plugins can declare config_schemas(), returning a list of ConfigSchema entries. Each entry associates a section key with a JSON Schema. During resolution, every declared schema is validated against the corresponding entry in AgentSpec.sections:

  • Section present – validated against the JSON Schema. Failure produces ResolveError::InvalidPluginConfig.
  • Section absent – allowed. Plugins are expected to use sensible defaults.
  • Section present but unclaimed – no plugin declared a schema for it. The pipeline logs a warning (possible typo in configuration).

Stage 3: Tool Pipeline

The third stage collects tools from all sources and produces the final tool set.

Tool sources

Tools are merged in this order:

  1. Global tools – all tools registered in ToolRegistry via the builder (e.g., builder.with_tool("search", search_tool)).

  2. Delegate agent tools – for each agent ID in AgentSpec.delegates, the pipeline creates an AgentTool. If the delegate has an endpoint (remote), the pipeline selects the configured remote backend. Today the built-in remote delegate backend is A2A; local delegates still use a resolver-backed local tool. Delegate tools require the a2a feature flag; without it, delegates are skipped with a logged warning.

  3. Plugin-registered tools – tools declared by plugins during register(), stored in ExecutionEnv.tools.

Conflict detection

If a plugin-registered tool has the same ID as a global tool, resolution fails with ResolveError::ToolIdConflict. This is intentional – silent overwriting would be a source of hard-to-debug issues.

Filtering

After merging, the spec’s allowed_tools and excluded_tools fields are applied:

  • allowed_tools = None – all tools are kept.
  • allowed_tools = Some(list) – only tools whose ID appears in the list are kept. Everything else is dropped.
  • excluded_tools – any tool whose ID appears in this list is removed, even if it was in the allow list.
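The filtering rules above reduce to a plain set operation over tool IDs. A minimal sketch (parameter names mirror the spec fields; this is not the runtime's actual code):

```rust
/// Apply allowed_tools / excluded_tools to a merged tool ID list.
/// The exclusion list wins over the allow list.
fn filter_tools(
    tools: Vec<String>,
    allowed: Option<Vec<String>>,
    excluded: Vec<String>,
) -> Vec<String> {
    tools
        .into_iter()
        // allowed = None keeps everything; Some(list) keeps only listed IDs.
        .filter(|t| allowed.as_ref().map_or(true, |a| a.contains(t)))
        // Exclusion applies even to allow-listed tools.
        .filter(|t| !excluded.contains(t))
        .collect()
}

fn main() {
    let tools = vec!["search".to_string(), "write".to_string(), "delete".to_string()];
    let kept = filter_tools(
        tools,
        Some(vec!["search".to_string(), "delete".to_string()]),
        vec!["delete".to_string()],
    );
    // "delete" was allow-listed but excluded; "write" was never allowed.
    assert_eq!(kept, vec!["search".to_string()]);
}
```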

ExecutionEnv

ExecutionEnv is the per-resolve product of the plugin pipeline. It is not global or shared – each resolve() call builds a fresh one. Its contents:

| Field | Type | Purpose |
|---|---|---|
| phase_hooks | HashMap<Phase, Vec<TaggedPhaseHook>> | Hooks invoked at each phase boundary |
| scheduled_action_handlers | HashMap<String, ScheduledActionHandlerArc> | Named handlers for scheduled/deferred actions |
| effect_handlers | HashMap<String, EffectHandlerArc> | Named handlers for side effects |
| request_transforms | Vec<TaggedRequestTransform> | Transforms applied to inference requests before the LLM call |
| key_registrations | Vec<KeyRegistration> | State keys to install into the state store at run start |
| tools | HashMap<String, Arc<dyn Tool>> | Plugin-provided tools (merged into the main tool set in Stage 3) |
| plugins | Vec<Arc<dyn Plugin>> | Plugin references for lifecycle hooks (on_activate/on_deactivate) |

Each TaggedPhaseHook and TaggedRequestTransform carries its owning plugin ID for diagnostics and filtering.

AgentRuntimeBuilder

The builder (AgentRuntimeBuilder) is the standard way to construct an AgentRuntime. It accumulates five registries:

| Registry | Builder method | Purpose |
|---|---|---|
| MapAgentSpecRegistry | with_agent_spec() / with_agent_specs() | Agent definitions |
| MapToolRegistry | with_tool() | Global tools |
| MapModelRegistry | with_model() | Model ID to provider + model name mappings |
| MapProviderRegistry | with_provider() | LLM executor instances |
| MapPluginSource | with_plugin() | Plugin instances |

Error handling

The builder uses deferred error collection. Each with_* call that detects a conflict (duplicate ID) pushes a BuildError onto an internal error list instead of returning Result. The first collected error surfaces when build() or build_unchecked() is called.
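The deferred-collection pattern looks roughly like this. This is a standalone sketch of the pattern only: BuildError variants, fields, and method bodies are simplified stand-ins for the real builder.

```rust
#[derive(Debug, PartialEq)]
enum BuildError {
    DuplicateToolId(String),
}

#[derive(Default)]
struct BuilderSketch {
    tool_ids: Vec<String>,
    errors: Vec<BuildError>,
}

impl BuilderSketch {
    /// with_* methods never return Result; conflicts are recorded for later.
    fn with_tool(mut self, id: &str) -> Self {
        if self.tool_ids.iter().any(|t| t.as_str() == id) {
            self.errors.push(BuildError::DuplicateToolId(id.to_string()));
        } else {
            self.tool_ids.push(id.to_string());
        }
        self
    }

    /// build() surfaces the first collected error, if any.
    fn build(mut self) -> Result<Vec<String>, BuildError> {
        if self.errors.is_empty() {
            Ok(self.tool_ids)
        } else {
            Err(self.errors.remove(0))
        }
    }
}

fn main() {
    let result = BuilderSketch::default()
        .with_tool("search")
        .with_tool("search") // duplicate -- recorded, not returned here
        .build();
    assert_eq!(result, Err(BuildError::DuplicateToolId("search".to_string())));
}
```

The payoff of the pattern is an uninterrupted fluent chain: callers write a single expression and handle errors once, at build().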

Validation

build() performs a dry-run resolve for every registered agent spec after constructing the runtime. If any agent fails to resolve (missing model, missing provider, missing plugin), the error is collected and returned as BuildError::ValidationFailed. This catches configuration errors at startup rather than at first request.

build_unchecked() skips this validation. Use it only when you need lazy resolution or when agents will be added dynamically after construction.

Remote agents (A2A)

When the a2a feature is enabled, the builder supports with_remote_agents() to register remote A2A endpoints. These are wrapped in a CompositeAgentSpecRegistry that combines local and remote agent sources. Remote agents are discovered asynchronously via build_and_discover().

See Also

State Management

Awaken provides four layers of state management, each designed for a different combination of scope, access pattern, and lifecycle. This page explains when and how to use each layer.

Overview

| Layer | Trait | Scope | Access | Lifecycle | Primary Use Case |
|---|---|---|---|---|---|
| Run State | StateKey (KeyScope::Run) | Current run only | Sync (snapshot) | Cleared at run start | Transient counters, flags, step state |
| Thread State | StateKey (KeyScope::Thread) | Same thread, cross-run | Sync (snapshot) | Auto-exported/restored across runs | Tool call state, active agent, permissions |
| Shared State | ProfileKey + StateScope | Dynamic (global, parent thread, agent type, custom) | Async (ProfileAccess) | Persistent in ProfileStore | Cross-boundary sharing, global config |
| Profile State | ProfileKey + key: &str | Per-key (agent/system) | Async (ProfileAccess) | Persistent in ProfileStore | User/agent preferences, locale |

Run State

Run state is the most transient layer. It lives entirely in memory, is accessed synchronously through a Snapshot, and is cleared to its default value when a new run begins.

Writes happen through MutationBatch, which collects updates produced by phase hooks. When multiple hooks run in parallel, the runtime uses MergeStrategy to determine whether concurrent writes to the same key can be safely merged (Commutative) or must be serialized (Exclusive). This makes run state the only layer that participates in the transactional merge protocol.

Typical examples include RunLifecycle (tracking the current run phase), PendingWorkKey (counting outstanding async work), and ContextThrottleState (rate-limiting context injection).

When to use

  • Per-inference temporary state that does not need to survive beyond the current run
  • State that must participate in parallel merge (MutationBatch with MergeStrategy)
  • Counters, flags, and step-tracking metadata

Example

struct StepCounter;
impl StateKey for StepCounter {
    const KEY: &'static str = "step_counter";
    type Value = usize;
    type Update = usize;
    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

// Register in plugin
r.register_key::<StepCounter>(StateKeyOptions::default())?;

// Read via snapshot
let count = ctx.snapshot.get::<StepCounter>().copied().unwrap_or(0);

// Write via MutationBatch
cmd.update::<StepCounter>(1);

Thread State

Thread state shares the same access model as run state – sync reads via Snapshot, transactional writes via MutationBatch, merge-safe through MergeStrategy. The difference is lifecycle: thread-scoped keys persist across runs on the same thread.

The runtime handles this transparently. At the end of a run, thread-scoped keys are exported (serialized). When the next run starts on the same thread, they are restored to their previous values instead of being reset to defaults. From the hook author’s perspective, the key simply “remembers” its value between runs.

Typical examples include ToolCallStates (for resuming suspended tool calls across runs) and ActiveAgentKey (for persisting agent handoff state).

When to use

  • State that must survive across runs on the same thread
  • State that needs sync access and transactional merge guarantees within each run
  • State whose lifecycle should be managed automatically by the runtime

Example

r.register_key::<ToolCallStates>(StateKeyOptions {
    scope: KeyScope::Thread,
    persistent: true,
    ..StateKeyOptions::default()
})?;

Shared State

Shared state is a persistent, async layer built on the ProfileStore backend. It is designed for data that must cross thread and agent boundaries – something neither run state nor thread state can do.

Shared state uses ProfileKey to bind a compile-time namespace to a value type, and a key: &str parameter to identify the runtime instance. Together, (ProfileKey::KEY, key) uniquely identifies a shared state entry. Different agents and threads can read and write the same entry if they use the same key string. The same ProfileAccess methods (read, write, delete) serve both shared and profile state — they all take key: &str.

Because shared state bypasses the snapshot/mutation-batch workflow, it does not participate in transactional merge. Concurrent writes follow last-write-wins semantics. Access is async through ProfileAccess, available in PhaseContext.

StateScope – convenience key builder

| Constructor | Key String | Scenario |
|---|---|---|
| StateScope::global() | "global" | All agents share one instance |
| StateScope::parent_thread(id) | "parent_thread::{id}" | Parent and child agents share within a delegation tree |
| StateScope::agent_type(name) | "agent_type::{name}" | All instances of an agent type share |
| StateScope::thread(id) | "thread::{id}" | Thread-local persistent state |
| StateScope::new(s) | "{s}" | Arbitrary grouping (tenant, region, etc.) |

The key is a plain &str – fully extensible without code changes. StateScope is an optional convenience; any raw string works.
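The key-string convention is simple enough to sketch in full. Constructor names follow the table above; the real type lives in awaken-contract and this minimal model only illustrates the string format:

```rust
/// Minimal model of StateScope: a newtype over the raw key string.
struct StateScope(String);

impl StateScope {
    fn global() -> Self { StateScope("global".to_string()) }
    fn parent_thread(id: &str) -> Self { StateScope(format!("parent_thread::{id}")) }
    fn agent_type(name: &str) -> Self { StateScope(format!("agent_type::{name}")) }
    fn thread(id: &str) -> Self { StateScope(format!("thread::{id}")) }
    fn new(s: &str) -> Self { StateScope(s.to_string()) }
    fn as_str(&self) -> &str { &self.0 }
}

fn main() {
    assert_eq!(StateScope::global().as_str(), "global");
    assert_eq!(StateScope::parent_thread("t-42").as_str(), "parent_thread::t-42");
    // Any raw string works too -- e.g. a tenant-scoped grouping.
    assert_eq!(StateScope::new("tenant::acme").as_str(), "tenant::acme");
}
```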

When to use

  • State shared across thread boundaries
  • State shared across agent boundaries
  • Dynamic scoping that cannot be determined at compile time
  • Data that serves as database-like indexed storage

Example

use awaken_contract::ProfileKey;

struct TeamContextKey;
impl ProfileKey for TeamContextKey {
    const KEY: &'static str = "team_context";
    type Value = TeamData;
}

// In a hook -- share context with child agents
let scope = StateScope::parent_thread(ctx.run_identity.parent_thread_id.as_ref().unwrap());
let mut team = access.read::<TeamContextKey>(scope.as_str()).await?;
team.goals.push("new goal".into());
access.write::<TeamContextKey>(scope.as_str(), &team).await?;

Profile State

Profile state is a persistent, async layer for per-entity preferences. Like shared state, it uses the ProfileStore backend and is accessed through ProfileAccess. The difference is the key convention: instead of a StateScope string, profile state typically uses an agent name or "system" as the key.

A ProfileKey binds a static namespace string to a value type. The key parameter identifies which agent or system entity the data belongs to.

When to use

  • Per-agent persistent preferences (locale, display name, custom settings)
  • System-level configuration shared across all agents
  • Data belonging to a specific agent identity rather than a dynamic group

Example

struct Locale;
impl ProfileKey for Locale {
    const KEY: &'static str = "locale";
    type Value = String;
}

let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".into()).await?;

Decision Guide

Need state during a single run?
  +-- Yes, sync + transactional --> Run State (StateKey, KeyScope::Run)
  +-- No, needs to persist
       +-- Same thread only, sync + transactional --> Thread State (StateKey, KeyScope::Thread)
       +-- Cross-boundary, dynamic key --> Shared State (ProfileKey + StateScope)
       +-- Per-agent/user preference --> Profile State (ProfileKey + agent/system key)

Comparison: Shared State vs Thread State

Both persist data across runs. The key differences:

| Aspect | Thread State | Shared State |
|---|---|---|
| Access | Sync (snapshot) | Async (ProfileAccess) |
| Scope | Fixed to current thread | Dynamic (any string) |
| Merge safety | MutationBatch + strategy | Last-write-wins |
| Cross-boundary | No | Yes |
| Lifecycle | Auto export/restore | Always persistent |

Use Thread State when you need sync access and transactional guarantees within a run. Use Shared State when you need cross-boundary sharing or dynamic scoping.

See Also

State and Snapshot Model

Awaken uses a typed state engine with snapshot isolation. This page explains the state primitives, scoping rules, merge strategies, and the mutation lifecycle.

StateKey Trait

Every piece of runtime state is declared as a type implementing StateKey:

pub trait StateKey: 'static + Send + Sync {
    const KEY: &'static str;
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    const SCOPE: KeyScope = KeyScope::Run;

    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
    type Update: Send + 'static;

    fn apply(value: &mut Self::Value, update: Self::Update);
}

A StateKey is a zero-sized type that associates a string key, a value type, an update type, and a merge function. The apply method defines how an update transforms the current value. Serialization is handled through encode/decode with JSON as the interchange format.

Plugins register their state keys through PluginRegistrar::register_key::<K>() during the Plugin::register callback.

KeyScope

pub enum KeyScope {
    Run,     // Cleared at run start (default)
    Thread,  // Persists across runs on the same thread
}

Run-scoped keys are reset to their default value when a new run begins. Use this for per-run counters, step state, and transient execution metadata.

Thread-scoped keys survive across runs on the same thread. Use this for conversation memory, accumulated context, and persistent agent preferences.

MergeStrategy

pub enum MergeStrategy {
    Exclusive,    // Conflict on concurrent writes (default)
    Commutative,  // Order-independent updates, safe to merge
}

When multiple hooks run in the same phase and produce MutationBatch values that touch the same key:

  • Exclusive – the batches cannot be merged. The runtime detects the conflict and falls back to sequential execution. This is the safe default for keys where write order matters.

  • Commutative – the update operations can be applied in any order and produce the same result. The runtime concatenates the operations from both batches. Use this for append-only logs, counters, and set unions.
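The commutative guarantee can be checked directly: for a counter-style key, applying the same updates in any order converges to the same value. A standalone sketch of the apply contract (not the runtime's merge code):

```rust
/// apply() for a counter key: updates are additions, so order is irrelevant.
fn apply(value: &mut i64, update: i64) {
    *value += update;
}

/// Apply a batch of updates in sequence, as the runtime does after
/// concatenating ops from commutative MutationBatches.
fn apply_all(mut value: i64, updates: &[i64]) -> i64 {
    for &u in updates {
        apply(&mut value, u);
    }
    value
}

fn main() {
    // Two hooks produced batches [2, 3] and [5] in the same phase.
    let order_a = apply_all(0, &[2, 3, 5]);
    let order_b = apply_all(0, &[5, 2, 3]);
    // Commutative: any interleaving converges to the same value.
    assert_eq!(order_a, order_b);
    assert_eq!(order_a, 10);
}
```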

Snapshot

A Snapshot is an immutable view of the state at a point in time. Phase hooks receive a snapshot reference and can read any registered key’s value. They cannot mutate the snapshot directly.

Phase hook receives: &Snapshot
Phase hook writes to: MutationBatch

This separation guarantees that hooks within the same phase see identical state regardless of execution order.

MutationBatch

A MutationBatch collects state updates produced by a single hook invocation:

pub struct MutationBatch {
    base_revision: Option<u64>,
    ops: Vec<Box<dyn MutationOp>>,
    touched_keys: Vec<String>,
}

Each operation in ops is a type-erased KeyPatch<K> that carries the K::Update value. When the batch is applied, each operation calls K::apply(value, update) on the target key in the state map.

The touched_keys list enables conflict detection for Exclusive keys during parallel merge.
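Conflict detection over touched_keys reduces to a set-intersection check. A sketch of the overlap rule (the real batch carries typed ops, but the decision logic is the same shape):

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq)]
enum MergeStrategy { Exclusive, Commutative }

/// Decide whether two batches can be merged, given each touched key's
/// strategy. Returns false (conflict -> sequential fallback) if any
/// overlapping key is Exclusive.
fn can_merge(a: &[(&str, MergeStrategy)], b: &[(&str, MergeStrategy)]) -> bool {
    let b_keys: HashSet<&str> = b.iter().map(|(k, _)| *k).collect();
    a.iter()
        .filter(|(k, _)| b_keys.contains(k))
        .all(|(_, strategy)| *strategy == MergeStrategy::Commutative)
}

fn main() {
    use MergeStrategy::*;
    // Overlap on a commutative counter: ops are concatenated.
    assert!(can_merge(&[("steps", Commutative)], &[("steps", Commutative)]));
    // Overlap on an exclusive key: conflict.
    assert!(!can_merge(&[("active_agent", Exclusive)], &[("active_agent", Exclusive)]));
    // No overlap at all: always mergeable.
    assert!(can_merge(&[("steps", Commutative)], &[("active_agent", Exclusive)]));
}
```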

Mutation Lifecycle

1. Phase starts
2. Runtime takes a Snapshot (immutable clone)
3. Each hook reads from the Snapshot, produces a MutationBatch
4. All hooks complete (phase convergence)
5. Runtime checks MutationBatch key overlap:
   - Exclusive keys overlap -> conflict (sequential fallback)
   - Commutative keys overlap -> ops concatenated
   - No overlap -> batches merged
6. Merged batch applied atomically to the live state
7. New Snapshot taken for the next phase

StateMap

StateMap is the runtime’s typed state container. It uses typedmap::TypedMap internally, keyed by zero-sized TypedKey<K> wrappers that hash and compare by TypeId. The get::<K>() and get_or_insert_default::<K>() methods retrieve the concrete K::Value type directly without downcasting.

State keys registered with persistent: true in StateKeyOptions are serialized during checkpoint and restored on thread load. Non-persistent keys exist only in memory for the duration of the run.

StateStore

StateStore wraps the StateMap and provides:

  • Snapshot creation (cheap Arc clone of the inner map)
  • Batch application with revision tracking
  • Commit hooks (CommitHook) that fire after each successful state mutation
  • StateCommand processing for programmatic state operations

Shared State

For state that must be shared across thread and agent boundaries, awaken provides a separate persistent layer built on the same ProfileStore backend used by profile data.

ProfileKey for shared state

Shared state uses the same ProfileKey trait as profile data. The KEY constant serves as the namespace, and the key: &str parameter passed to ProfileAccess methods identifies the runtime instance:

pub trait ProfileKey: 'static + Send + Sync {
    const KEY: &'static str;
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
}

Unlike StateKey, shared state does not participate in the MutationBatch / snapshot workflow – it is accessed asynchronously through ProfileAccess with a key: &str parameter.

StateScope

pub struct StateScope(String);

An optional convenience builder for common key string patterns. Constructors: global(), parent_thread(id), agent_type(name), thread(id), new(arbitrary). Call .as_str() to get the key string for ProfileAccess methods. Users can also pass any raw &str directly.

State Management Overview

| Layer | Scope | Access | Lifecycle | Use Case |
|---|---|---|---|---|
| Run State | Current run | Sync snapshot | Cleared at run start | Transient flags, counters |
| Thread State | Same thread | Sync snapshot | Export/restore across runs | Tool call state, active agent |
| Shared State | ProfileKey + StateScope | Async | Persistent | Cross-boundary sharing |
| Profile State | ProfileKey + key: &str | Async | Persistent | User/agent preferences |

Choose shared state when you need dynamic scoping or cross-boundary access. Choose StateKey when you need sync access or transactional merge during parallel tool execution.

See Also

Run Lifecycle and Phases

This page describes the state machines that govern run execution and tool call processing, the phase enum, termination conditions, and checkpoint triggers.

RunStatus

A run’s coarse lifecycle is captured by RunStatus:

Running --+--> Waiting --+--> Running (resume)
          |              |
          +--> Done      +--> Done

pub enum RunStatus {
    Running,  // Actively executing (default)
    Waiting,  // Paused, waiting for external decisions
    Done,     // Terminal -- cannot transition further
}

  • Running -> Waiting: a tool call suspends, the run pauses for external input.
  • Waiting -> Running: decisions arrive, the run resumes.
  • Running -> Done or Waiting -> Done: terminal transition on completion, cancellation, or error.
  • Done -> *: not allowed. Terminal state.
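These rules fit in a small predicate (a sketch; the runtime's actual guard may live elsewhere):

```rust
#[derive(Clone, Copy, PartialEq)]
enum RunStatus { Running, Waiting, Done }

/// Validate a RunStatus transition: Running <-> Waiting, either may
/// finish, and Done is terminal.
fn is_valid_transition(from: RunStatus, to: RunStatus) -> bool {
    use RunStatus::*;
    match (from, to) {
        (Running, Waiting) | (Waiting, Running) => true,
        (Running, Done) | (Waiting, Done) => true,
        _ => false,
    }
}

fn main() {
    use RunStatus::*;
    assert!(is_valid_transition(Running, Waiting));
    assert!(is_valid_transition(Waiting, Done));
    // Done is terminal: no way back.
    assert!(!is_valid_transition(Done, Running));
}
```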

ToolCallStatus

Each tool call in a run has its own lifecycle:

New --> Running --+--> Succeeded (terminal)
                  +--> Failed (terminal)
                  +--> Cancelled (terminal)
                  +--> Suspended --> Resuming --+--> Running
                                                +--> Suspended (re-suspend)
                                                +--> Succeeded/Failed/Cancelled

pub enum ToolCallStatus {
    New,        // Created, not yet executing
    Running,    // Currently executing
    Suspended,  // Waiting for external decision
    Resuming,   // Decision received, about to re-execute
    Succeeded,  // Completed successfully (terminal)
    Failed,     // Completed with error (terminal)
    Cancelled,  // Cancelled externally (terminal)
}

Key transitions:

  • Suspended can only move to Resuming or Cancelled – it cannot jump directly to Running or a success/failure state.
  • Resuming has wide transitions: it can re-enter Running, re-suspend, or reach any terminal state.
  • Terminal states (Succeeded, Failed, Cancelled) cannot transition to any non-self state.
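These rules can likewise be written as a transition predicate (a sketch mirroring the bullet points, not runtime code):

```rust
#[derive(Clone, Copy, PartialEq)]
enum ToolCallStatus { New, Running, Suspended, Resuming, Succeeded, Failed, Cancelled }

/// Transition rules from the diagram: Suspended only goes to Resuming or
/// Cancelled; Resuming is wide; terminal states never move.
fn is_valid_transition(from: ToolCallStatus, to: ToolCallStatus) -> bool {
    use ToolCallStatus::*;
    match from {
        New => to == Running,
        Running => matches!(to, Succeeded | Failed | Cancelled | Suspended),
        Suspended => matches!(to, Resuming | Cancelled),
        Resuming => matches!(to, Running | Suspended | Succeeded | Failed | Cancelled),
        // Terminal states never transition.
        Succeeded | Failed | Cancelled => false,
    }
}

fn main() {
    use ToolCallStatus::*;
    assert!(is_valid_transition(Suspended, Resuming));
    // Suspended cannot jump straight back to Running.
    assert!(!is_valid_transition(Suspended, Running));
    assert!(!is_valid_transition(Succeeded, Running));
}
```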

Phase Enum

The Phase enum defines the eight execution phases in order:

pub enum Phase {
    RunStart,
    StepStart,
    BeforeInference,
    AfterInference,
    BeforeToolExecute,
    AfterToolExecute,
    StepEnd,
    RunEnd,
}

RunStart – fires once at the beginning of a run. Plugins initialize run-scoped state.

StepStart – fires at the beginning of each inference round. Step counter increments.

BeforeInference – last chance to modify the inference request (system prompt, tools, parameters). Plugins can skip inference by setting a behavior flag.

AfterInference – fires after the LLM response arrives. Plugins can inspect the response, modify tool call lists, or request termination.

BeforeToolExecute – fires before each tool call batch. Permission checks, interception, and suspension happen here.

AfterToolExecute – fires after tool results are available. Plugins can inspect results and trigger side effects.

StepEnd – fires at the end of each inference round. Checkpoint persistence happens here. Stop conditions (max rounds, token budget, loop detection) are evaluated.

RunEnd – fires once when the run terminates, regardless of reason. Cleanup and final state persistence.

TerminationReason

When a run ends, the TerminationReason records why:

pub enum TerminationReason {
    NaturalEnd,           // LLM returned no tool calls
    BehaviorRequested,    // A plugin requested inference skip
    Stopped(StoppedReason), // A stop condition fired (code + optional detail)
    Cancelled,            // External cancellation signal
    Blocked(String),      // Permission checker blocked the run
    Suspended,            // Waiting for external tool-call resolution
    Error(String),        // Error path
}

TerminationReason::to_run_status() maps each variant to the appropriate RunStatus:

  • Suspended maps to RunStatus::Waiting (the run can resume).
  • All other variants map to RunStatus::Done.
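The mapping is a one-line match. A sketch of to_run_status() (enum payloads are elided where they do not affect the mapping):

```rust
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum RunStatus { Running, Waiting, Done }

#[allow(dead_code)]
enum TerminationReason {
    NaturalEnd,
    BehaviorRequested,
    Stopped,          // StoppedReason payload elided
    Cancelled,
    Blocked(String),
    Suspended,
    Error(String),
}

impl TerminationReason {
    /// Suspended runs can resume -> Waiting; everything else is terminal.
    fn to_run_status(&self) -> RunStatus {
        match self {
            TerminationReason::Suspended => RunStatus::Waiting,
            _ => RunStatus::Done,
        }
    }
}

fn main() {
    assert_eq!(TerminationReason::Suspended.to_run_status(), RunStatus::Waiting);
    assert_eq!(TerminationReason::NaturalEnd.to_run_status(), RunStatus::Done);
    assert_eq!(TerminationReason::Error("boom".into()).to_run_status(), RunStatus::Done);
}
```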

Stop Conditions

Declarative stop conditions are configured per agent via StopConditionSpec:

| Variant | Trigger |
|---|---|
| MaxRounds { rounds } | Step count exceeds limit |
| Timeout { seconds } | Wall-clock time exceeds limit |
| TokenBudget { max_total } | Cumulative token usage exceeds budget |
| ConsecutiveErrors { max } | Sequential tool errors exceed threshold |
| StopOnTool { tool_name } | A specific tool is called |
| ContentMatch { pattern } | LLM output matches a regex pattern |
| LoopDetection { window } | Repeated identical tool calls within a sliding window |

Stop conditions are evaluated at StepEnd. When one fires, the run terminates with TerminationReason::Stopped.
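Evaluation at StepEnd amounts to checking each configured condition against the run's accumulated metrics. A sketch with two representative variants (the field and variant names follow the table above; the metrics struct is a stand-in for whatever the runtime actually tracks):

```rust
/// Two representative stop-condition variants.
enum StopConditionSpec {
    MaxRounds { rounds: u32 },
    TokenBudget { max_total: u64 },
}

/// Metrics the runtime has accumulated by StepEnd.
struct RunMetrics {
    step_count: u32,
    total_tokens: u64,
}

/// Returns a stop code for the first condition that fires, or None.
fn check_stop(conditions: &[StopConditionSpec], m: &RunMetrics) -> Option<&'static str> {
    for c in conditions {
        match c {
            StopConditionSpec::MaxRounds { rounds } if m.step_count > *rounds => {
                return Some("max_rounds");
            }
            StopConditionSpec::TokenBudget { max_total } if m.total_tokens > *max_total => {
                return Some("token_budget");
            }
            _ => {}
        }
    }
    None
}

fn main() {
    let conditions = vec![
        StopConditionSpec::MaxRounds { rounds: 10 },
        StopConditionSpec::TokenBudget { max_total: 50_000 },
    ];
    let m = RunMetrics { step_count: 11, total_tokens: 1_200 };
    assert_eq!(check_stop(&conditions, &m), Some("max_rounds"));
    let m = RunMetrics { step_count: 3, total_tokens: 1_200 };
    assert_eq!(check_stop(&conditions, &m), None);
}
```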

Checkpoint Triggers

State is persisted at StepEnd after each inference round. The checkpoint includes:

  • Thread messages (append-only)
  • Run lifecycle state (RunStatus, step count, termination reason)
  • Persistent state keys (those registered with persistent: true)
  • Tool call states for suspended calls

Checkpoints enable resume from the last completed step after a crash or intentional suspension.

RunStatus Derived from ToolCall States

A run’s status is a projection of all its tool call states. Each tool call has an independent lifecycle; the run status is the aggregate:

fn derive_run_status(calls: &HashMap<String, ToolCallState>) -> RunStatus {
    let mut has_suspended = false;
    for state in calls.values() {
        match state.status {
            // Any Running or Resuming call → Run is still executing
            ToolCallStatus::Running | ToolCallStatus::Resuming => {
                return RunStatus::Running;
            }
            ToolCallStatus::Suspended => {
                has_suspended = true;
            }
            // Succeeded / Failed / Cancelled are terminal — keep checking
            _ => {}
        }
    }
    if has_suspended {
        RunStatus::Waiting   // No executing calls, but some await decisions
    } else {
        RunStatus::Done      // All calls in terminal state → step complete
    }
}

Decision table:

| Any Running/Resuming? | Any Suspended? | Run Status | Meaning |
|---|---|---|---|
| Yes | (any) | Running | Tools are actively executing |
| No | Yes | Waiting | All execution done, awaiting external decisions |
| No | No | Done | All calls terminal → proceed to next step |

Parallel tool call state timeline

When an LLM returns multiple tool calls (e.g. [tool_A, tool_B, tool_C]), their states evolve independently:

Time  tool_A (approval)  tool_B (approval)  tool_C (normal)  Run Status  Note
──────────────────────────────────────────────────────────────────────────────
t0    New                New                New              Running     Step starts
t1    Suspended          New                Running          Running     tool_A intercepted
t2    Suspended          Suspended          Running          Running     tool_B intercepted, tool_C executing
t3    Suspended          Suspended          Succeeded        Waiting     tool_C done, no Running calls
t4    Resuming           Suspended          Succeeded        Running     tool_A decision arrives
t5    Succeeded          Suspended          Succeeded        Waiting     tool_A replay done
t6    Succeeded          Resuming           Succeeded        Running     tool_B decision arrives
t7    Succeeded          Succeeded          Succeeded        Done        All terminal → next step

At every transition the run status is re-derived from the aggregate of all call states. This means a single decision arriving does not end the wait — the run stays in Waiting until all suspended calls are resolved.

Suspension Bridges Run and Tool-Call Layers

Current execution model (serial phases)

Tool execution is split into two serial phases inside execute_tools_with_interception:

Phase 1 — Intercept (serial, per-call):
  for each call:
    BeforeToolExecute hooks → check for intercept actions
    Suspend?  → mark Suspended, set suspended=true, continue
    Block?    → mark Failed, return immediately
    SetResult → mark with provided result, continue
    None      → add to allowed_calls

Phase 2 — Execute (allowed_calls only):
  Sequential mode: one by one, break on first suspension
  Parallel mode:   batch execute, collect all results

After both phases, if suspended == true, the step returns StepOutcome::Suspended. The orchestrator then:

  1. Persists checkpoint (messages, tool call states)
  2. Emits RunFinish(Suspended) to protocol encoders
  3. Enters wait_for_resume_or_cancel loop

wait_for_resume_or_cancel loop

loop {
    let decisions = decision_rx.next().await;  // block until decisions arrive
    emit_decision_events_and_messages(decisions);
    prepare_resume(decisions);       // Suspended → Resuming
    detect_and_replay_resume();      // re-execute Resuming calls
    if !has_suspended_calls() {
        return WaitOutcome::Resumed; // all resolved → exit wait
    }
    // Some calls still Suspended → continue waiting
}

Key properties:

  • The loop handles partial resume: if only tool_A’s decision arrives but tool_B is still suspended, tool_A is replayed immediately and the loop continues waiting for tool_B.
  • Decisions can arrive in batches or one at a time.
  • On WaitOutcome::Resumed, the orchestrator re-enters the step loop for the next LLM inference round.

Resume replay

detect_and_replay_resume scans ToolCallStates for calls with status == Resuming and re-executes them through the standard tool pipeline. The arguments field already reflects the resume mode (set by prepare_resume):

| Resume Mode | Arguments on Replay | Behavior |
|---|---|---|
| ReplayToolCall | Original arguments | Full re-execution |
| UseDecisionAsToolResult | Decision result | FrontendToolPlugin intercepts in BeforeToolExecute, returns SetResult |
| PassDecisionToTool | Decision result | Tool receives decision as arguments |

Already-completed calls (Succeeded, Failed, Cancelled) are skipped.

Limitation: decisions during execution

In the current serial model, decisions that arrive while Phase 2 tools are still executing sit in the channel buffer. They are only consumed when the step finishes and the orchestrator enters wait_for_resume_or_cancel.

This means:

  • tool_A’s approval arrives at t2 (while tool_C is executing)
  • tool_A is not replayed until t3 (after tool_C finishes)
  • The delay equals the remaining execution time of Phase 2 tools

Concurrent Execution Model (future)

The ideal model executes suspended-tool waits and allowed-tool execution in parallel, so a decision for tool_A can trigger immediate replay even while tool_C is still running.

Architecture

Phase 1 — Intercept (same as current)

Phase 2 — Concurrent execution:
  ├─ task: execute(tool_C)
  ├─ task: execute(tool_D)
  ├─ task: wait_decision(tool_A) ──> replay(tool_A)
  ├─ task: wait_decision(tool_B) ──> replay(tool_B)
  └─ barrier: all tasks reach terminal state

Per-call decision routing

The shared decision_rx channel carries batches of decisions for multiple tool calls. A dispatcher task demuxes decisions to per-call notification channels:

struct ToolCallWaiter {
    waiters: HashMap<String, oneshot::Sender<ToolCallResume>>,
}

impl ToolCallWaiter {
    async fn dispatch_loop(&mut self, decision_rx: &mut UnboundedReceiver<DecisionBatch>) {
        while let Some(batch) = decision_rx.recv().await {
            for (call_id, resume) in batch {
                // Route the decision to the task waiting on this call ID.
                if let Some(tx) = self.waiters.remove(&call_id) {
                    let _ = tx.send(resume);
                }
            }
            // All suspended calls have been dispatched -- stop the loop.
            if self.waiters.is_empty() { break; }
        }
    }
}

Each suspended tool call gets a oneshot::Receiver. When its decision arrives, the receiver wakes the task, which runs the replay immediately — concurrently with any still-executing allowed tools.

State transition timing

With the concurrent model, state transitions happen as events occur rather than in batches:

t0: tool_C starts executing          → RunStatus: Running
t1: tool_A decision arrives, replay  → RunStatus: Running (tool_A Resuming)
t2: tool_A replay completes          → RunStatus: Running (tool_C still Running)
t3: tool_C completes                 → RunStatus: Waiting (tool_B still Suspended)
t4: tool_B decision arrives, replay  → RunStatus: Running (tool_B Resuming)
t5: tool_B replay completes          → RunStatus: Done (all terminal)

No artificial delay — each tool call progresses as fast as its external dependency allows.

Protocol Adapter: SSE Reconnection

A backend run may span multiple frontend SSE connections. This is especially relevant for the AI SDK v6 protocol, where each HTTP request corresponds to one “turn” and produces one SSE stream.

Problem

Turn 1 (user message):
  HTTP POST → SSE stream 1 → events flow → tool suspends
  → RunFinish(Suspended) → finish event → SSE stream 1 closes

  The run is still alive in wait_for_resume_or_cancel.
  But the event_tx channel from SSE stream 1 has been dropped.

Turn 2 (tool output / approval):
  HTTP POST → SSE stream 2 → decision delivered to orchestrator
  → orchestrator resumes → emits events
  → events go to... the dropped event_tx? Lost.

Solution: ReconnectableEventSink

Replace the fixed ChannelEventSink with a reconnectable wrapper that allows swapping the underlying channel sender:

struct ReconnectableEventSink {
    inner: Arc<tokio::sync::Mutex<mpsc::UnboundedSender<AgentEvent>>>,
}

impl ReconnectableEventSink {
    fn new(tx: mpsc::UnboundedSender<AgentEvent>) -> Self {
        Self { inner: Arc::new(tokio::sync::Mutex::new(tx)) }
    }

    /// Replace the underlying channel. Called when a new SSE connection
    /// arrives for an existing suspended run.
    async fn reconnect(&self, new_tx: mpsc::UnboundedSender<AgentEvent>) {
        *self.inner.lock().await = new_tx;
    }
}

#[async_trait]
impl EventSink for ReconnectableEventSink {
    async fn emit(&self, event: AgentEvent) {
        let _ = self.inner.lock().await.send(event);
    }
}

Reconnection flow

Turn 1:
  submit() → create (event_tx1, event_rx1)
           → ReconnectableEventSink(event_tx1)
           → spawn_execution (run starts)
  events → event_tx1 → event_rx1 → SSE stream 1
  tool suspends → finish(tool-calls) → SSE stream 1 closes
  event_tx1 still held by ReconnectableEventSink (sends fail silently)
  run alive in wait_for_resume_or_cancel

Turn 2:
  new HTTP request with decisions arrives
  create (event_tx2, event_rx2)
  sink.reconnect(event_tx2)          ← swap channel
  send_decision → decision channel → orchestrator resumes
  events → ReconnectableEventSink → event_tx2 → event_rx2 → SSE stream 2
  run completes → RunFinish(NaturalEnd) → SSE stream 2 closes

No events are lost between SSE connections because:

  • During suspend, the orchestrator is blocked in wait_for_resume_or_cancel and emits no events.
  • reconnect() completes before send_decision(), so the first resume event (RunStart) goes to the new channel.

Protocol-specific behavior

| Protocol | Suspend Signal | Resume Mechanism |
|---|---|---|
| AI SDK v6 | finish(finishReason: "tool-calls") | New HTTP POST → reconnect → send_decision |
| AG-UI | RUN_FINISHED(outcome: "interrupt") | New HTTP POST → reconnect → send_decision |
| CopilotKit | renderAndWaitForResponse UI | Same SSE or new request via AG-UI |

See Also

HITL and Mailbox

This page explains how Awaken handles human-in-the-loop (HITL) interactions through tool call suspension and the mailbox queue that manages run requests.

SuspendTicket

When a tool call needs external approval or input, it produces a SuspendTicket:

pub struct SuspendTicket {
    pub suspension: Suspension,
    pub pending: PendingToolCall,
    pub resume_mode: ToolCallResumeMode,
}

suspension – the external-facing payload describing what input is needed:

pub struct Suspension {
    pub id: String,             // Unique suspension ID
    pub action: String,         // Action identifier (e.g., "confirm", "approve")
    pub message: String,        // Human-readable prompt
    pub parameters: Value,      // Action-specific parameters
    pub response_schema: Option<Value>,  // JSON Schema for expected response
}

pending – the tool call projection emitted to the event stream so the frontend knows which call is waiting:

pub struct PendingToolCall {
    pub id: String,        // Tool call ID
    pub name: String,      // Tool name
    pub arguments: Value,  // Original arguments
}

resume_mode – how the agent loop should handle the decision when it arrives.

ToolCallResumeMode

pub enum ToolCallResumeMode {
    ReplayToolCall,          // Re-execute the original tool call
    UseDecisionAsToolResult, // Use the decision payload as the tool result directly
    PassDecisionToTool,      // Pass the decision payload into the tool as new arguments
}

ReplayToolCall is the default. The original tool call is re-executed after the decision arrives. Use this when the decision unlocks execution (e.g., permission granted, now run the tool).

UseDecisionAsToolResult skips re-execution entirely. The external decision payload becomes the tool result. Use this when a human provides the answer directly (e.g., “the correct value is X”).

PassDecisionToTool re-executes the tool but injects the decision payload into the arguments. Use this when the decision modifies how the tool should run (e.g., “use this alternative path instead”).
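
The three modes can be sketched as a dispatch in the agent loop. This is an illustrative stand-in, not the runtime's actual code: payloads are simplified to strings (the real types carry serde_json::Value), and `resume_plan` is a hypothetical helper name.

```rust
// Simplified stand-in for illustration; real payloads are serde_json::Value.
#[derive(Clone, Copy)]
pub enum ToolCallResumeMode {
    ReplayToolCall,
    UseDecisionAsToolResult,
    PassDecisionToTool,
}

/// Decide what to do with a pending call once a decision arrives.
/// Returns (arguments-or-result payload, whether the tool is re-executed).
pub fn resume_plan(mode: ToolCallResumeMode, original_args: &str, decision: &str) -> (String, bool) {
    match mode {
        // Re-run the call with its original arguments (default; e.g. permission granted).
        ToolCallResumeMode::ReplayToolCall => (original_args.to_string(), true),
        // The decision payload *is* the tool result; skip execution.
        ToolCallResumeMode::UseDecisionAsToolResult => (decision.to_string(), false),
        // Re-run the call, but with the decision payload as the new arguments.
        ToolCallResumeMode::PassDecisionToTool => (decision.to_string(), true),
    }
}

fn main() {
    let (payload, rerun) = resume_plan(ToolCallResumeMode::ReplayToolCall, "{\"path\":\"/tmp\"}", "approved");
    println!("payload = {payload}, re-execute = {rerun}");
}
```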

ResumeDecisionAction

pub enum ResumeDecisionAction {
    Resume,  // Proceed with the tool call
    Cancel,  // Cancel the tool call
}

Each decision carries an action. Resume continues execution according to the ToolCallResumeMode. Cancel transitions the tool call to ToolCallStatus::Cancelled.

ToolCallResume

The full resume payload:

pub struct ToolCallResume {
    pub decision_id: String,         // Idempotency key
    pub action: ResumeDecisionAction,
    pub result: Value,               // Decision payload
    pub reason: Option<String>,      // Human-readable reason
    pub updated_at: u64,             // Unix millis timestamp
}

Permission Plugin and Ask-Mode

The awaken-ext-permission plugin uses suspension to implement ask-mode approvals:

  1. A tool call matches a permission rule with behavior: ask.
  2. The permission checker creates a SuspendTicket with ToolCallResumeMode::ReplayToolCall.
  3. The suspension payload describes the tool and its arguments.
  4. The tool call transitions to ToolCallStatus::Suspended.
  5. The run transitions to RunStatus::Waiting.
  6. A frontend presents the approval prompt to the user.
  7. The user submits a ToolCallResume with ResumeDecisionAction::Resume or Cancel.
  8. On Resume, the tool call replays and executes normally.
  9. On Cancel, the tool call is cancelled and the run may continue with remaining calls.

Mailbox Architecture

The mailbox provides a persistent queue for all run requests. Every run – streaming, background, A2A, internal – enters the system as a MailboxJob.

MailboxJob

pub struct MailboxJob {
    // Identity
    pub job_id: String,          // UUID v7
    pub mailbox_id: String,      // Thread ID (routing anchor)

    // Request payload
    pub agent_id: String,
    pub messages: Vec<Message>,
    pub origin: MailboxJobOrigin,
    pub sender_id: Option<String>,
    pub parent_run_id: Option<String>,
    pub request_extras: Option<Value>,

    // Queue semantics
    pub priority: u8,            // 0 = highest, 255 = lowest, default 128
    pub dedupe_key: Option<String>,
    pub generation: u64,

    // Lifecycle
    pub status: MailboxJobStatus,
    pub available_at: u64,
    pub attempt_count: u32,
    pub max_attempts: u32,
    pub last_error: Option<String>,

    // Lease
    pub claim_token: Option<String>,
    pub claimed_by: Option<String>,
    pub lease_until: Option<u64>,

    // Timestamps
    pub created_at: u64,
    pub updated_at: u64,
}

MailboxJobStatus

Queued --claim--> Claimed --ack--> Accepted (terminal)
  |                  |
  |               nack(retry) --> Queued (attempt_count++, available_at = retry_at)
  |                  |
  |               nack(permanent) --> DeadLetter (terminal)
  |
  |-- cancel --> Cancelled (terminal)
  +-- interrupt(generation bump) --> Superseded (terminal)

pub enum MailboxJobStatus {
    Queued,      // Waiting to be claimed
    Claimed,     // Claimed by a consumer, executing
    Accepted,    // Successfully processed (terminal)
    Cancelled,   // Cancelled by caller (terminal)
    Superseded,  // Replaced by a newer generation (terminal)
    DeadLetter,  // Permanently failed (terminal)
}
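
The lifecycle diagram above can be expressed as a transition table. This is a sketch with a hypothetical `transition` helper; the real store enforces these transitions atomically with claim-token checks.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MailboxJobStatus { Queued, Claimed, Accepted, Cancelled, Superseded, DeadLetter }

#[derive(Debug, Clone, Copy)]
pub enum MailboxOp { Claim, Ack, NackRetry, NackPermanent, Cancel, Interrupt }

/// Valid transitions from the lifecycle diagram; anything else is rejected.
pub fn transition(from: MailboxJobStatus, op: MailboxOp) -> Option<MailboxJobStatus> {
    use MailboxJobStatus::*;
    use MailboxOp::*;
    match (from, op) {
        (Queued, Claim) => Some(Claimed),
        (Claimed, Ack) => Some(Accepted),
        (Claimed, NackRetry) => Some(Queued),        // attempt_count++, available_at = retry_at
        (Claimed, NackPermanent) => Some(DeadLetter),
        (Queued, Cancel) => Some(Cancelled),
        (Queued, Interrupt) => Some(Superseded),     // generation bump supersedes queued work
        _ => None,                                   // terminal states accept no further ops
    }
}

fn main() {
    println!("{:?}", transition(MailboxJobStatus::Queued, MailboxOp::Claim));
}
```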

MailboxJobOrigin

pub enum MailboxJobOrigin {
    User,      // HTTP API, SDK
    A2A,       // Agent-to-Agent protocol
    Internal,  // Child run notification, handoff
}

MailboxStore Trait

MailboxStore defines the persistent queue interface:

  • enqueue – persist a job, auto-assign generation, reject duplicate dedupe_key
  • claim – atomically claim up to N Queued jobs for a mailbox (lease-based)
  • claim_job – claim a specific job by ID (for inline streaming)
  • ack – mark a job as Accepted (validates claim token)
  • nack – return a job to Queued for retry, or DeadLetter if max attempts exceeded
  • cancel – cancel a Queued job
  • extend_lease – heartbeat to extend an active claim
  • interrupt – atomically bump generation, supersede stale Queued jobs, return the active Claimed job for cancellation

Implementations must guarantee: durable enqueue, atomic claim (exactly one winner), claim token validation on ack/nack, and atomic interrupt with generation bump.

MailboxInterrupt

When a new high-priority request arrives for a thread that already has queued or running work:

pub struct MailboxInterrupt {
    pub new_generation: u64,       // Generation after bump
    pub active_job: Option<MailboxJob>,  // Currently running job to cancel
    pub superseded_count: usize,   // Queued jobs superseded
}

The caller cancels the active_job’s runtime run if present, ensuring the new request takes priority.
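
A toy, in-memory version of the interrupt semantics illustrates the three outputs. The `Job`/`Status` types here are simplified stand-ins for `MailboxJob`, and a real implementation would perform this atomically in the store.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Status { Queued, Claimed, Superseded }

#[derive(Debug, Clone)]
struct Job { job_id: String, status: Status }

struct Interrupt { new_generation: u64, active_job: Option<String>, superseded_count: usize }

/// Toy interrupt for a single mailbox: bump the generation, supersede stale
/// Queued jobs, and hand back the Claimed job so the caller can cancel its run.
fn interrupt(jobs: &mut Vec<Job>, current_generation: u64) -> Interrupt {
    let new_generation = current_generation + 1;
    let mut superseded_count = 0;
    let mut active_job = None;
    for job in jobs.iter_mut() {
        match job.status {
            Status::Queued => { job.status = Status::Superseded; superseded_count += 1; }
            Status::Claimed => active_job = Some(job.job_id.clone()),
            _ => {}
        }
    }
    Interrupt { new_generation, active_job, superseded_count }
}

fn main() {
    let mut jobs = vec![Job { job_id: "j1".into(), status: Status::Claimed }];
    let out = interrupt(&mut jobs, 7);
    println!("gen {} active {:?} superseded {}", out.new_generation, out.active_job, out.superseded_count);
}
```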

See Also

Multi-Agent Patterns

Awaken supports multiple patterns for composing agents. This page describes delegation, remote agents, sub-agent execution, and handoff.

Agent Delegation via AgentSpec.delegates

An agent can declare sub-agents it is allowed to delegate to:

{
  "id": "orchestrator",
  "model": "gpt-4o",
  "system_prompt": "You coordinate tasks across specialized agents.",
  "delegates": ["researcher", "writer", "reviewer"]
}

Each ID in delegates must be a registered agent in the AgentSpecRegistry. During resolution, the runtime creates an AgentTool for each delegate. From the LLM’s perspective, each sub-agent appears as a regular tool named agent_run_{delegate_id}.

When the LLM calls a delegate tool, the AgentTool dispatches to the appropriate backend:

  • Local agents (no endpoint field) use LocalBackend, which resolves and executes the sub-agent inline within the same runtime.
  • Remote agents (with endpoint field) use A2aBackend, which sends an A2A message:send request and polls the resulting task for completion.

Remote Agents via A2A

Remote agents are declared with an endpoint in AgentSpec:

{
  "id": "remote-analyst",
  "model": "unused-for-remote",
  "system_prompt": "",
  "endpoint": {
    "backend": "a2a",
    "base_url": "https://analyst.example.com/v1/a2a",
    "auth": { "type": "bearer", "token": "token-abc" },
    "target": "analyst",
    "timeout_ms": 300000,
    "options": {
      "poll_interval_ms": 1000
    }
  }
}

The A2aBackend handles the A2A protocol lifecycle:

  1. Sends a message:send request with the user message.
  2. Receives a task wrapper, extracts task.id, and polls /tasks/:task_id at the configured interval.
  3. Returns the completed response as a DelegateRunResult.
  4. The result is formatted as a ToolResult and returned to the parent agent’s LLM context.

If the remote agent times out or fails, the DelegateRunStatus reflects the failure and the parent agent receives an error tool result.

Sub-Agent Patterns

Sequential Delegation

The orchestrator calls sub-agents one at a time, using each result to decide the next step:

Orchestrator -> researcher (tool call) -> result
             -> writer (tool call, using researcher output) -> result
             -> reviewer (tool call, using writer output) -> result

Each delegation is a tool call within the orchestrator’s step loop. The orchestrator sees tool results and decides whether to delegate further or respond directly.

Parallel Delegation

When the LLM emits multiple tool calls in a single inference response, sub-agent delegations can execute concurrently. The runtime processes tool calls in parallel by default – if the LLM calls both agent_run_researcher and agent_run_writer in the same response, both execute simultaneously.

Nested Delegation

Sub-agents can themselves have delegates, creating hierarchies:

orchestrator
  -> team_lead (delegates: [dev_a, dev_b])
       -> dev_a
       -> dev_b

Each level resolves independently through the AgentResolver. There is no hard depth limit, but each level adds latency and token cost.

Agent Handoff

Handoff transfers control from one agent to another mid-run without stopping the loop. The mechanism:

  1. A plugin (or the handoff extension) writes a new agent ID to the ActiveAgentKey state key.
  2. At the next step boundary, the loop runner detects the changed key.
  3. The loop re-resolves the agent from the AgentResolver – new config, new model, new tools, new system prompt.
  4. Execution continues in the same run with the new agent’s configuration.

Handoff is a re-resolve, not a loop restart. Thread history is preserved. The new agent sees all prior messages and can continue the conversation seamlessly.
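
The step-boundary check can be sketched as follows. `Agent`, `maybe_handoff`, and the HashMap-based resolver are simplified stand-ins for `ResolvedAgent` and the `AgentResolver` trait.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Agent { id: String, system_prompt: String }

/// At each step boundary, compare the ActiveAgentKey value in state against
/// the agent currently running; on change, re-resolve (new config, new model,
/// new tools, new prompt) and continue in the same run.
fn maybe_handoff<'a>(
    current: &'a Agent,
    active_agent_key: &str,
    resolver: &'a HashMap<String, Agent>,
) -> &'a Agent {
    if current.id != active_agent_key {
        // Re-resolve; fall back to the current agent if the ID is unknown.
        resolver.get(active_agent_key).unwrap_or(current)
    } else {
        current
    }
}

fn main() {
    let mut resolver = HashMap::new();
    resolver.insert("triage".to_string(), Agent { id: "triage".into(), system_prompt: "Route issues.".into() });
    resolver.insert("billing".to_string(), Agent { id: "billing".into(), system_prompt: "Handle invoices.".into() });
    let current = resolver.get("triage").unwrap().clone();
    println!("next agent: {}", maybe_handoff(&current, "billing", &resolver).id);
}
```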

Handoff vs Delegation

| Aspect            | Delegation                                       | Handoff                                 |
|-------------------|--------------------------------------------------|-----------------------------------------|
| Control flow      | Parent calls sub-agent as tool, gets result back | Control transfers entirely to new agent |
| Thread continuity | Sub-agent may use a separate thread context      | Same thread, same message history       |
| Return path       | Result flows back to parent LLM                  | No return – new agent owns the run      |
| Use case          | Task decomposition, specialized subtasks         | Role switching, escalation, routing     |

AgentBackend Trait

Both local and remote delegation use the AgentBackend trait:

pub trait AgentBackend: Send + Sync {
    async fn execute(
        &self,
        agent_id: &str,
        messages: Vec<Message>,
        event_sink: Arc<dyn EventSink>,
        parent_run_id: Option<String>,
        parent_tool_call_id: Option<String>,
    ) -> Result<DelegateRunResult, AgentBackendError>;
}

DelegateRunResult carries the agent ID, terminal status, optional response text, and step count. DelegateRunStatus variants: Completed, Failed(String), Cancelled, Timeout.

This trait is the extension point for custom delegation backends beyond local and A2A.

See Also

Tool and Plugin Boundary

Awaken separates user-facing capabilities (tools) from system-level lifecycle logic (plugins). This page explains the design boundary, when to use each, and how they interact.

Tools

A tool is a capability exposed to the LLM. The LLM decides when to call it based on the tool’s descriptor.

pub trait Tool: Send + Sync {
    fn descriptor(&self) -> ToolDescriptor;
    fn validate_args(&self, args: &Value) -> Result<(), ToolError> { Ok(()) }
    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError>;
}

Key properties:

  • Registered by ID – each tool has a unique string identifier from ToolDescriptor.
  • LLM-visible – the descriptor (name, description, JSON Schema for parameters) is included in inference requests.
  • Executed in tool rounds – tools run during the BeforeToolExecute / AfterToolExecute phase window.
  • Isolated – a tool receives only its arguments and a ToolCallContext. It cannot directly access other tools, the plugin system, or the phase runtime.
  • User-defined – application code creates tools for domain-specific actions (file operations, API calls, database queries).

Plugins

A plugin is a system-level extension that hooks into the execution lifecycle. Plugins do not appear in the LLM’s tool list.

pub trait Plugin: Send + Sync + 'static {
    fn descriptor(&self) -> PluginDescriptor;
    fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError>;
    fn config_schemas(&self) -> Vec<ConfigSchema> { vec![] }
}

Key properties:

  • Registered by ID – each plugin has a unique identifier from PluginDescriptor.
  • LLM-invisible – the LLM does not know plugins exist.
  • Run in phases – plugin hooks fire at phase boundaries (RunStart, BeforeInference, AfterToolExecute, etc.).
  • System-wide access – hooks receive PhaseContext with access to the state snapshot, event sink, and run metadata.
  • Declarative registration – plugins declare their capabilities through PluginRegistrar.

PluginRegistrar

When Plugin::register is called, the plugin uses PluginRegistrar to declare everything it needs:

| Registration Method           | Purpose                                                |
|-------------------------------|--------------------------------------------------------|
| register_key::<K>()           | Declare a StateKey for the plugin’s state              |
| register_phase_hook()         | Add a hook that runs at a specific phase               |
| register_tool()               | Inject a tool into the runtime (plugin-provided tools) |
| register_effect_handler()     | Handle named effects emitted by hooks                  |
| register_scheduled_action()   | Handle named actions scheduled by hooks                |
| register_request_transform()  | Transform inference requests before they reach the LLM |

Plugins can register tools. This is how delegation tools (AgentTool) and MCP tools enter the runtime – they are tools owned by plugins, not directly registered by user code. The LLM sees these tools identically to user-registered tools.

When to Use a Tool

Use a tool when:

  • The LLM should be able to invoke the capability by name.
  • The capability performs a domain-specific action (search, compute, file I/O, API call).
  • The capability takes structured input and returns a result the LLM can reason about.
  • The logic does not need access to runtime internals (state, phases, other tools).

When to Use a Plugin

Use a plugin when:

  • The logic should run automatically at phase boundaries without LLM involvement.
  • The logic needs to modify inference requests (system prompt injection, tool filtering).
  • The logic needs to inspect or transform tool results before the LLM sees them.
  • The logic manages cross-cutting concerns (permissions, observability, reminders, state synchronization).
  • The logic needs to register state keys, effect handlers, or request transforms.

Interaction Between Tools and Plugins

Plugins can influence tool execution without the tool knowing:

  • Permission plugin – intercepts tool calls at BeforeToolExecute, blocks or suspends them based on policy rules. The tool itself has no permission logic.
  • Observability plugin – wraps tool execution with OpenTelemetry spans. The tool does not emit traces.
  • Reminder plugin – injects context messages after specific tools execute. The tool does not know about reminders.
  • Interception pipeline – plugins can modify tool arguments, replace tool results, or skip execution entirely through the tool interception mechanism.

This separation means tools remain simple and focused on their domain logic. Cross-cutting concerns are handled uniformly by plugins, applied consistently across all tools without per-tool opt-in.

Plugin-Provided Tools vs User Tools

| Aspect        | User Tool                          | Plugin Tool                                    |
|---------------|------------------------------------|------------------------------------------------|
| Registration  | AgentRuntimeBuilder::with_tool()   | PluginRegistrar::register_tool()               |
| Lifecycle     | Exists for the runtime’s lifetime  | Exists while the plugin is active              |
| Configuration | Direct construction                | Derived from plugin config or agent spec       |
| Examples      | Custom business logic tools        | AgentTool (delegation), MCP tools, skill tools |

Both appear identically to the LLM. The distinction is purely about ownership and lifecycle management.

See Also

Plugin System Internals

This page covers the internal mechanics of the plugin system: how plugins are registered and activated, how hooks execute and resolve conflicts, how the phase convergence loop works, and how request transforms, effects, inference overrides, and tool intercepts behave at runtime.

For the high-level boundary between tools and plugins, see Tool and Plugin Boundary. For the phase lifecycle, see Run Lifecycle and Phases.

Plugin Registration vs Activation

When a plugin is loaded, its register() method is called with a PluginRegistrar. The plugin declares two categories of components:

Structural components are always available regardless of activation state:

  • State keys (register_key::<K>())
  • Scheduled action handlers (register_scheduled_action::<A, H>())
  • Effect handlers (register_effect::<E, H>())

Behavioral components are only active when the plugin passes the activation filter:

  • Phase hooks (register_phase_hook())
  • Tools (register_tool())
  • Request transforms (register_request_transform())

Activation is controlled by AgentSpec.active_hook_filter:

| active_hook_filter value | Behavior                                                           |
|--------------------------|--------------------------------------------------------------------|
| Empty (default)          | All plugins’ behavioral components are active                      |
| Non-empty set            | Only plugins whose ID is in the set contribute behavioral components |

This separation enables infrastructure plugins (state management, action handlers, effect handlers) to exist without impacting execution flow. A logging plugin that only registers an effect handler, for example, never needs to appear in active_hook_filter – its handler fires whenever any plugin emits the corresponding effect.

The filtering happens in PhaseRuntime::filter_hooks():

fn filter_hooks<'a>(env: &'a ExecutionEnv, ctx: &PhaseContext) -> Vec<&'a TaggedPhaseHook> {
    let hooks = env.hooks_for_phase(ctx.phase);
    let active_hook_filter = &ctx.agent_spec.active_hook_filter;
    hooks
        .iter()
        .filter(|tagged| {
            active_hook_filter.is_empty()
                || active_hook_filter.contains(&tagged.plugin_id)
        })
        .collect()
}

Hook Ordering and Conflict Resolution

Multiple plugins can register hooks for the same Phase. The engine runs them through a two-stage process implemented in gather_and_commit_hooks() (crates/awaken-runtime/src/phase/engine.rs).

Fast path: parallel execution, single commit

All hooks for a phase run in parallel against a frozen snapshot. Each hook receives the same snapshot and produces a StateCommand. If no hook writes to a key with MergeStrategy::Exclusive that another hook also writes to, all commands are merged and committed in a single batch.

flowchart LR
    S[Frozen Snapshot] --> H1[Hook A]
    S --> H2[Hook B]
    S --> H3[Hook C]
    H1 --> M[Merge All]
    H2 --> M
    H3 --> M
    M --> C[Single Commit]

Conflict fallback: partition and serial retry

If two or more hooks write to the same Exclusive key, the engine detects the conflict and falls back:

  1. Partition – Walk commands in registration order. Greedily add each command to a “compatible batch” if its Exclusive keys do not overlap with keys already in the batch. Otherwise, defer the hook.
  2. Commit the batch – The compatible batch is merged and committed.
  3. Serial re-execution – Deferred hooks are re-run one at a time, each against a fresh snapshot that includes the results of all prior commits.

flowchart TD
    PAR[Run all hooks in parallel] --> CHECK{Exclusive key overlap?}
    CHECK -- No --> FAST[Merge all, commit once]
    CHECK -- Yes --> PART[Partition: batch + deferred]
    PART --> CBATCH[Commit compatible batch]
    CBATCH --> SERIAL[Re-run deferred hooks serially]
    SERIAL --> DONE[All hooks committed]

Because hooks are pure functions (frozen snapshot in, StateCommand out, no side effects), re-execution on conflict is always safe. The deferred hooks see the updated state from the batch commit, so they produce correct results even when the original parallel execution would have conflicted.
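
The partition step can be sketched as a greedy pass over commands in registration order. `Command` here is reduced to just the set of Exclusive keys each hook writes; the real StateCommand also carries mutations, actions, and effects.

```rust
use std::collections::HashSet;

/// A hook's command, reduced to the set of Exclusive keys it writes.
struct Command { hook: &'static str, exclusive_keys: HashSet<&'static str> }

/// Greedy partition in registration order: a command joins the compatible
/// batch only if its Exclusive keys don't overlap keys already batched;
/// otherwise its hook is deferred for serial re-execution.
fn partition(commands: &[Command]) -> (Vec<&'static str>, Vec<&'static str>) {
    let mut batched_keys: HashSet<&'static str> = HashSet::new();
    let (mut batch, mut deferred) = (Vec::new(), Vec::new());
    for cmd in commands {
        if cmd.exclusive_keys.is_disjoint(&batched_keys) {
            batched_keys.extend(&cmd.exclusive_keys);
            batch.push(cmd.hook);
        } else {
            deferred.push(cmd.hook);
        }
    }
    (batch, deferred)
}

fn main() {
    let cmds = vec![
        Command { hook: "permission", exclusive_keys: HashSet::from(["pending_tool"]) },
        Command { hook: "observability", exclusive_keys: HashSet::new() },
        Command { hook: "reminder", exclusive_keys: HashSet::from(["pending_tool"]) },
    ];
    let (batch, deferred) = partition(&cmds);
    println!("batch = {batch:?}, deferred = {deferred:?}");
}
```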

See State and Snapshot Model for details on MergeStrategy and snapshot isolation.

Phase Convergence Loop

Each phase runs a GATHER then EXECUTE loop that converges when no new work remains.

GATHER stage

Run all active hooks in parallel (with conflict resolution as described above). Hooks produce StateCommand values that may contain:

  • State mutations (key updates)
  • Scheduled actions (to be processed in the EXECUTE stage)
  • Effects (dispatched immediately after commit)

EXECUTE stage

Process pending scheduled actions whose Phase matches the current phase and whose action key has a registered handler. Each action handler runs against a fresh snapshot and produces its own StateCommand, which may schedule new actions for the same phase.

If new matching actions appear after processing, the loop repeats:

flowchart TD
    START[Phase Entry] --> GATHER[GATHER: run hooks, commit]
    GATHER --> EXEC[EXECUTE: process matching actions]
    EXEC --> CHECK{New actions for this phase?}
    CHECK -- Yes --> EXEC
    CHECK -- No --> DONE[Phase complete]

The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If the action count does not converge within this limit, the engine returns StateError::PhaseRunLoopExceeded. This prevents infinite reactive chains while allowing legitimate multi-step action cascades.

This convergence design enables reactive patterns: a permission check action can schedule a suspend action in the same BeforeToolExecute phase, and both are processed before the phase completes.

Request Transform Hooks

Plugins can register InferenceRequestTransform implementations via registrar.register_request_transform(). Transforms modify the InferenceRequest before it reaches the LLM executor.

Use cases:

  • System prompt injection – append context, instructions, or reminders to the system message
  • Tool list modification – filter, reorder, or augment the tool descriptors sent to the LLM
  • Parameter overrides – adjust temperature, max tokens, or other inference parameters

Transforms run in registration order and are composable: each transform receives the request as modified by the previous transform.

Only active plugins’ transforms are applied. If active_hook_filter is non-empty, transforms from plugins not in the filter are skipped.
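
The composition rule can be sketched as a fold over boxed closures. `InferenceRequest` here is a two-field stand-in for the real request type, and `apply_transforms` is a hypothetical helper.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct InferenceRequest {
    pub system_prompt: String,
    pub tools: Vec<String>,
}

pub type Transform = Box<dyn Fn(InferenceRequest) -> InferenceRequest>;

/// Transforms run in registration order; each receives the request as
/// modified by the previous one.
pub fn apply_transforms(mut req: InferenceRequest, transforms: &[Transform]) -> InferenceRequest {
    for t in transforms {
        req = t(req);
    }
    req
}

fn main() {
    let transforms: Vec<Transform> = vec![
        // System prompt injection.
        Box::new(|mut r: InferenceRequest| { r.system_prompt.push_str(" Always cite sources."); r }),
        // Tool list filtering.
        Box::new(|mut r: InferenceRequest| { r.tools.retain(|t| t != "shell_exec"); r }),
    ];
    let req = InferenceRequest {
        system_prompt: "You are helpful.".into(),
        tools: vec!["search".into(), "shell_exec".into()],
    };
    println!("{:?}", apply_transforms(req, &transforms));
}
```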

Effect Handlers

Effects are typed events defined via the EffectSpec trait:

pub trait EffectSpec: 'static + Send + Sync {
    const KEY: &'static str;
    type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}

Hooks and action handlers emit effects by calling command.emit::<E>(payload) on their StateCommand. Unlike scheduled actions (which execute within a specific phase’s convergence loop), effects are dispatched after the command is committed to the store.

Plugins register effect handlers via registrar.register_effect::<E, H>(handler). When an effect is dispatched, the engine calls the handler with the effect payload and the current snapshot.

Key properties:

  • Fire-and-forget – handler failures are logged but do not block execution or roll back the commit.
  • Post-commit – effects see the state after the command that emitted them has been applied.
  • Validated at submit time – if a command emits an effect with no registered handler, submit_command returns StateError::UnknownEffectHandler immediately, preventing silent drops.

Use cases: audit logging, metric emission, cross-plugin notification, external system synchronization.
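
The submit-time validation can be sketched with a handler registry keyed by `EffectSpec::KEY`. This is a simplified stand-in: payloads are plain strings rather than serde types, and `EffectRegistry` is a hypothetical name for the engine's dispatch table.

```rust
use std::collections::HashMap;

/// Mirror of the EffectSpec pattern, with the payload simplified to &str.
pub trait EffectSpec {
    const KEY: &'static str;
}

pub struct AuditLogged;
impl EffectSpec for AuditLogged {
    const KEY: &'static str = "audit.logged";
}

pub fn print_audit(payload: &str) {
    println!("audit: {payload}");
}

pub struct EffectRegistry {
    pub handlers: HashMap<&'static str, fn(&str)>,
}

impl EffectRegistry {
    /// Submit-time validation mirroring StateError::UnknownEffectHandler:
    /// an effect with no registered handler is rejected immediately.
    /// A found handler runs fire-and-forget after the commit.
    pub fn submit(&self, effect_key: &str, payload: &str) -> Result<(), String> {
        match self.handlers.get(effect_key) {
            Some(handler) => {
                handler(payload);
                Ok(())
            }
            None => Err(format!("unknown effect handler: {effect_key}")),
        }
    }
}

fn main() {
    let mut handlers: HashMap<&'static str, fn(&str)> = HashMap::new();
    handlers.insert(AuditLogged::KEY, print_audit);
    let registry = EffectRegistry { handlers };
    registry.submit(AuditLogged::KEY, "tool=read_file").unwrap();
}
```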

InferenceOverride Merging

Multiple plugins can influence inference parameters by scheduling SetInferenceOverride actions. The InferenceOverride struct uses last-wins-per-field merge semantics:

pub struct InferenceOverride {
    pub model: Option<String>,
    pub fallback_models: Option<Vec<String>>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
    pub top_p: Option<f64>,
    pub reasoning_effort: Option<ReasoningEffort>,
}

When two overrides are merged, each field independently takes the last non-None value:

pub fn merge(&mut self, other: InferenceOverride) {
    if other.temperature.is_some() {
        self.temperature = other.temperature;
    }
    if other.max_tokens.is_some() {
        self.max_tokens = other.max_tokens;
    }
    // ... same for all fields
}

This allows plugins to override specific parameters without affecting others. A cost-control plugin can set max_tokens while a quality plugin sets temperature, and neither interferes with the other. If both set the same field, the last merge wins.
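
The cost-control/quality example works out like this (using a field subset of the struct above):

```rust
#[derive(Debug, Default, Clone, PartialEq)]
pub struct InferenceOverride {
    pub model: Option<String>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
}

impl InferenceOverride {
    /// Last-wins-per-field: each field independently takes the later non-None value.
    pub fn merge(&mut self, other: InferenceOverride) {
        if other.model.is_some() { self.model = other.model; }
        if other.temperature.is_some() { self.temperature = other.temperature; }
        if other.max_tokens.is_some() { self.max_tokens = other.max_tokens; }
    }
}

fn main() {
    // A cost-control plugin caps tokens; a quality plugin tunes temperature.
    let mut merged = InferenceOverride { max_tokens: Some(512), ..Default::default() };
    merged.merge(InferenceOverride { temperature: Some(0.2), ..Default::default() });
    println!("{merged:?}"); // both fields survive; neither plugin clobbers the other
}
```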

Tool Intercept Priority

During BeforeToolExecute, plugins can schedule ToolInterceptAction to control tool execution flow. The action payload is one of three variants:

pub enum ToolInterceptPayload {
    Block { reason: String },
    Suspend(SuspendTicket),
    SetResult(ToolResult),
}

When multiple intercepts are scheduled for the same tool call, they are resolved by implicit priority:

| Priority    | Variant   | Behavior                                    |
|-------------|-----------|---------------------------------------------|
| 3 (highest) | Block     | Terminate tool execution, fail the call     |
| 2           | Suspend   | Pause execution, wait for external decision |
| 1 (lowest)  | SetResult | Short-circuit with a predefined result      |

The highest-priority intercept wins. If two intercepts have the same priority (e.g., two plugins both schedule Block), the first one processed takes effect and the conflict is logged as an error.

If no intercept is scheduled, the tool executes normally (implicit proceed).

flowchart TD
    BTE[BeforeToolExecute hooks run] --> INT{Any intercepts scheduled?}
    INT -- No --> EXEC[Execute tool normally]
    INT -- Yes --> PRI[Resolve by priority]
    PRI --> B[Block: fail the call]
    PRI --> S[Suspend: wait for decision]
    PRI --> R[SetResult: return predefined result]
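
The resolution rule can be sketched as a single pass over the scheduled intercepts. Payloads are simplified to strings, and `resolve` is a hypothetical helper name.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ToolInterceptPayload {
    Block(String),     // payloads simplified to strings for this sketch
    Suspend(String),
    SetResult(String),
}

fn priority(p: &ToolInterceptPayload) -> u8 {
    match p {
        ToolInterceptPayload::Block(_) => 3,
        ToolInterceptPayload::Suspend(_) => 2,
        ToolInterceptPayload::SetResult(_) => 1,
    }
}

/// Highest priority wins; the strict `>` keeps the first-scheduled intercept
/// on a tie (the runtime also logs tied conflicts). None = implicit proceed.
pub fn resolve(intercepts: &[ToolInterceptPayload]) -> Option<&ToolInterceptPayload> {
    let mut winner: Option<&ToolInterceptPayload> = None;
    for intercept in intercepts {
        match winner {
            None => winner = Some(intercept),
            Some(w) if priority(intercept) > priority(w) => winner = Some(intercept),
            _ => {}
        }
    }
    winner
}

fn main() {
    let scheduled = vec![
        ToolInterceptPayload::SetResult("cached".into()),
        ToolInterceptPayload::Block("policy violation".into()),
    ];
    println!("{:?}", resolve(&scheduled)); // Block outranks SetResult
}
```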

See Also

Design Tradeoffs

This page documents key architectural decisions in Awaken and the tradeoffs they entail.

Snapshot Isolation vs Mutable State

Decision: Phase hooks read from an immutable Snapshot and write to a MutationBatch. Mutations are applied atomically after all hooks converge.

Alternative: Hooks mutate shared state directly (protected by locks or sequential execution).

| Aspect        | Snapshot Isolation                                                  | Mutable State                                       |
|---------------|---------------------------------------------------------------------|-----------------------------------------------------|
| Correctness   | Hooks see consistent state regardless of execution order            | Result depends on hook ordering and lock granularity |
| Concurrency   | Hooks can run in parallel without data races                        | Requires careful lock management or forced sequencing |
| Complexity    | Requires MutationBatch machinery, conflict detection, merge strategies | Simpler implementation, direct reads and writes   |
| Debuggability | Each phase boundary is a clean state transition                     | State changes interleaved, harder to trace          |
| Cost          | Extra Arc clone per phase for snapshot creation                     | No snapshot overhead                                |

Why snapshot isolation: Hook execution order should not affect correctness. When multiple plugins touch state in the same phase, mutable state creates implicit ordering dependencies that are difficult to test and reason about. The snapshot approach makes each phase a pure function from state to mutations, which is easier to verify and replay.

Phase-Based Execution vs Event-Driven

Decision: Execution proceeds through a fixed sequence of phases (RunStart through RunEnd). Plugins register hooks at specific phases.

Alternative: Fully event-driven architecture where plugins subscribe to events and react asynchronously.

| Aspect             | Phase-Based                                          | Event-Driven                                            |
|--------------------|------------------------------------------------------|---------------------------------------------------------|
| Predictability     | Deterministic execution order within each phase      | Non-deterministic ordering, race conditions possible    |
| Plugin composition | Plugins interact at well-defined boundaries          | Plugins interact through shared event bus, implicit coupling |
| Testability        | Phase sequences are easy to unit test                | Requires simulating async event flows                   |
| Flexibility        | Adding behavior between phases requires new phases   | New events can be added freely                          |
| Performance        | Sequential phase execution adds overhead             | Concurrent processing possible                          |

Why phase-based: Agent execution has a natural sequential structure (infer, execute tools, check termination). Phases formalize this structure and give plugins guaranteed execution points. Event-driven systems are more flexible but make it harder to reason about the order in which plugins see state and harder to implement features like “modify the inference request before it’s sent.”

Typed State Keys vs Dynamic State

Decision: State keys are Rust types implementing the StateKey trait with associated Value and Update types.

Alternative: Untyped key-value store (HashMap<String, Value>).

| Aspect          | Typed Keys                                         | Dynamic State                                |
|-----------------|----------------------------------------------------|----------------------------------------------|
| Type safety     | Compile-time guarantees on value and update types  | Runtime type errors                          |
| Merge semantics | MergeStrategy declared per key at compile time     | Merge logic must be external or convention-based |
| Discoverability | Keys are types – IDE navigation, documentation     | Keys are strings – grep-based discovery      |
| Boilerplate     | Each key requires a type definition                | Just use a string                            |
| Extensibility   | New keys require code changes and recompilation    | New keys can be added dynamically at runtime |

Why typed keys: State correctness is critical in an agent runtime. A mistyped key or wrong value type causes subtle bugs that surface only during execution. Typed keys catch these at compile time. The apply function makes update semantics explicit – there is no ambiguity about how a counter is incremented or how a list is appended to.
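
A minimal sketch of the typed-key pattern shows what the compiler checks. `StateKey`, `StepCount`, and `StateMap` here are simplified stand-ins for the framework's versions (no scope or merge-strategy fields), but the core idea is the same: the key type alone fixes the value and update types.

```rust
use std::any::Any;
use std::collections::HashMap;

/// The key is a Rust type; its value and update types are associated items,
/// and `apply` spells out the merge semantics explicitly.
pub trait StateKey: 'static {
    type Value: Default + Clone + 'static;
    type Update;
    const NAME: &'static str;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

pub struct StepCount;
impl StateKey for StepCount {
    type Value = u64;
    type Update = u64; // increment amount
    const NAME: &'static str = "step_count";
    fn apply(value: &mut u64, delta: u64) {
        *value += delta; // commutative counter: order-independent under parallel hooks
    }
}

/// Type-safe heterogeneous map: a mismatched read is a compile error,
/// not a runtime surprise.
#[derive(Default)]
pub struct StateMap {
    entries: HashMap<&'static str, Box<dyn Any>>,
}

impl StateMap {
    pub fn update<K: StateKey>(&mut self, update: K::Update) {
        let slot = self
            .entries
            .entry(K::NAME)
            .or_insert_with(|| Box::new(K::Value::default()) as Box<dyn Any>);
        K::apply(slot.downcast_mut::<K::Value>().expect("key/value types match"), update);
    }

    pub fn get<K: StateKey>(&self) -> Option<&K::Value> {
        self.entries.get(K::NAME)?.downcast_ref::<K::Value>()
    }
}

fn main() {
    let mut state = StateMap::default();
    state.update::<StepCount>(1);
    state.update::<StepCount>(2);
    println!("steps = {:?}", state.get::<StepCount>());
}
```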

Plugin System vs Middleware Chain

Decision: Plugins register through PluginRegistrar and declare hooks, state keys, tools, and effect handlers as separate registrations.

Alternative: Middleware chain where each layer wraps the next (like tower middleware or HTTP middleware stacks).

| Aspect                 | Plugin System                                                               | Middleware Chain                                    |
|------------------------|-----------------------------------------------------------------------------|-----------------------------------------------------|
| Granularity            | Hooks at specific phases, tools, state keys, effects – each registered independently | Each middleware wraps the entire execution   |
| Composition            | Multiple plugins contribute hooks to the same phase                         | Middleware ordering determines behavior             |
| Selective activation   | active_hook_filter can enable/disable specific plugins per agent            | Must restructure the chain to skip middleware       |
| Complexity             | More registration ceremony                                                  | Simpler mental model (wrap and delegate)            |
| Cross-cutting concerns | Natural fit – each plugin handles one concern                               | Each middleware handles one concern but sees all traffic |

Why plugin system: Agent execution has many extension points that don’t nest cleanly. A permission check happens at BeforeToolExecute, observability spans wrap tool execution, reminders inject messages at AfterToolExecute. These are independent concerns at different phases. A middleware chain would require each middleware to understand the full lifecycle and decide when to act. The plugin system lets each plugin declare exactly which phases it cares about.

Multi-Protocol Server vs Single Protocol

Decision: The server exposes AI SDK v6, AG-UI, A2A, and MCP over HTTP, while ACP exists as a separate stdio protocol module. Each protocol adapter translates AgentEvent into its wire format.

Alternative: Support a single canonical protocol and require clients to adapt.

| Aspect                 | Multi-Protocol                                                                   | Single Protocol                           |
|------------------------|----------------------------------------------------------------------------------|-------------------------------------------|
| Frontend compatibility | Works with AI SDK, CopilotKit, A2A clients, and MCP HTTP clients out of the box  | Requires custom adapter on the client side |
| Maintenance            | Each protocol adapter must be kept in sync with AgentEvent changes               | One adapter to maintain                   |
| Testing                | Protocol parity tests ensure all adapters handle all events                      | Less test surface                         |
| Complexity             | Multiple route sets, encoder types, and event mappings                           | One route set, one encoder                |
| Runtime coupling       | Runtime is protocol-independent – only emits AgentEvent                          | Runtime could be coupled to the protocol  |

Why multi-protocol: The AI agent ecosystem has not converged on a single protocol. AI SDK, AG-UI, and A2A serve different use cases (chat frontends, copilot UIs, agent-to-agent communication). Supporting multiple protocols at the server layer avoids forcing protocol choices on users. The Transcoder trait and stateful encoders keep the runtime decoupled – adding a new protocol means implementing one encoder, not modifying the execution engine.

See Also

Glossary

| Term | 中文 | Description |
|---|---|---|
| Thread | 会话线程 | Persisted conversation + state history. |
| Run | 运行 | One execution attempt over a thread. |
| Phase | 阶段 | Named step in the execution loop (RunStart, StepStart, BeforeInference, AfterInference, BeforeToolExecute, AfterToolExecute, StepEnd, RunEnd). |
| Snapshot | 快照 | Immutable state snapshot (struct Snapshot { revision: u64, ext: Arc<StateMap> }) seen by phase hooks. |
| StateKey | 状态键 | Typed key with scope, merge strategy, and value type. |
| MutationBatch | 变更批次 | Collected state mutations applied atomically after phase convergence. |
| AgentRuntime | 智能体运行时 | Orchestration layer for agent resolution and run execution. |
| AgentSpec | 智能体规约 | Serializable agent definition with model, prompt, tools, plugins. |
| AgentEvent | 智能体事件 | Canonical runtime stream event. |
| Plugin | 插件 | System-level lifecycle hook registered via PluginRegistrar. |
| PluginRegistrar | 插件注册器 | Registration API for phase hooks, tools, state keys, handlers. |
| PhaseHook | 阶段钩子 | Async hook executed during a specific phase. |
| PhaseContext | 阶段上下文 | Immutable context passed to phase hooks. |
| StateCommand | 状态命令 | Result of a phase hook: mutations + scheduled actions + effects. |
| Tool | 工具 | User-facing capability with descriptor, validation, and execution. |
| ToolDescriptor | 工具描述符 | Tool metadata: id, name, description, parameters schema. |
| ToolResult | 工具结果 | Tool execution result: success, error, or suspended. |
| ToolCallContext | 工具调用上下文 | Execution context with state access, identity, and activity reporting. |
| TerminationReason | 终止原因 | Why a run ended (NaturalEnd, Stopped, Error, etc.). |
| SuspendTicket | 挂起票据 | Suspension payload with pending call and resume mode. |
| MailboxJob | 邮箱任务 | Durable job entry for async/HITL workflows. |
| RunRequest | 运行请求 | Input to start a run: messages, thread ID, agent ID. |
| MergeStrategy | 合并策略 | How parallel state mutations are reconciled: Exclusive (conflict = error) or Commutative (order-independent). |
| KeyScope | 键作用域 | Lifetime of a state key: Run (per-execution) or Thread (persisted across runs). |
| StateMap | 状态映射 | Type-safe heterogeneous map backing Snapshot. |
| RunStatus | 运行状态 | Coarse run status: Running, Waiting, Done. |
| ToolCallStatus | 工具调用状态 | Per-call status: New, Running, Suspended, Resuming, Succeeded, Failed, Cancelled. |
| ResolvedAgent | 已解析智能体 | Agent fully resolved from registries with config, tools, plugins, and executor. |
| AgentResolver | 智能体解析器 | Trait that resolves an agent spec ID into a ResolvedAgent. |
| BuildError | 构建错误 | Error from AgentRuntimeBuilder when validation or wiring fails. |
| RuntimeError | 运行时错误 | Error during agent loop execution. |
| InferenceOverride | 推理覆盖 | Per-request override for model, temperature, max_tokens, reasoning_effort. |
| ContextWindowPolicy | 上下文窗口策略 | Token budget and compaction rules for context management. |
| StreamEvent | 流事件 | Envelope wrapping AgentEvent with sequence number and timestamp. |
| TokenUsage | 令牌用量 | Token consumption report from LLM inference. |
| ExecutionEnv | 执行环境 | Assembled runtime environment holding hooks, handlers, tools, and plugins. |
| CommitHook | 提交钩子 | Hook invoked when state is committed to storage. |

FAQ

Which LLM providers are supported?

Any provider compatible with genai. This includes OpenAI, Anthropic, DeepSeek, Google Gemini, Ollama, and others. Configure the provider via model ID string in AgentSpec or AgentConfig. The GenaiExecutor handles provider routing based on the model prefix.

How do I add a new storage backend?

Implement the ThreadRunStore trait from awaken-contract. The trait requires methods for loading and saving threads, runs, and checkpoints. See InMemoryStore and FileStore in awaken-stores for reference implementations. Pass your store to AgentRuntime::new().with_thread_run_store(store) or AgentRuntimeBuilder::new().with_thread_run_store(store).
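The real ThreadRunStore method set is defined in awaken-contract; the trait below is a deliberately simplified stand-in that shows the shape of a custom backend without reproducing the actual signatures.

```rust
use std::collections::HashMap;

// Simplified stand-in for ThreadRunStore (see awaken-contract for the real
// method set); illustrates the shape of a custom storage backend.
trait SimpleThreadStore {
    fn save_thread(&mut self, id: &str, payload: String);
    fn load_thread(&self, id: &str) -> Option<&String>;
}

/// An in-memory backend, analogous to InMemoryStore in awaken-stores.
#[derive(Default)]
struct MemStore {
    threads: HashMap<String, String>,
}

impl SimpleThreadStore for MemStore {
    fn save_thread(&mut self, id: &str, payload: String) {
        self.threads.insert(id.to_string(), payload);
    }

    fn load_thread(&self, id: &str) -> Option<&String> {
        self.threads.get(id)
    }
}

fn main() {
    let mut store = MemStore::default();
    store.save_thread("t1", "serialized thread state".into());
    assert_eq!(store.load_thread("t1").unwrap(), "serialized thread state");
    assert!(store.load_thread("missing").is_none());
}
```

A PostgreSQL or file backend implements the same trait over its own persistence layer; the runtime only sees the trait.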

Can I use awaken without the server?

Yes. AgentRuntime is a standalone library type. Create a runtime, build a RunRequest, and call runtime.run(request, sink) directly. The server crate (awaken-server) is an optional HTTP/SSE gateway layered on top.

How do I run multiple agents?

Two approaches:

  • Delegates: Define delegate agent IDs in the parent AgentSpec. The runtime handles handoff via ActiveAgentIdKey at step boundaries.
  • A2A protocol: Register remote agents via AgentRuntimeBuilder::with_remote_agents(). Remote agents are discovered and invoked over HTTP using the Agent-to-Agent protocol.

What is the difference between Run scope and Thread scope?

  • Run scope: State exists only for the duration of a single run. Cleared when the run ends. Use for transient data like step counters, token budgets, and per-run configuration.
  • Thread scope: State persists across runs within the same thread. Use for conversation memory, user preferences, and accumulated context.

Scope is declared when defining a StateKey.
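The sketch below mirrors the StateKey trait shown in the State Model section, extended with a hypothetical SCOPE associated constant to illustrate where the declaration lives; the real awaken API may declare scope differently (e.g. via StateKeyOptions at registration).

```rust
// Self-contained mock of the StateKey trait from the State Model section.
// The SCOPE constant here is a hypothetical addition for illustration.

#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyScope {
    Run,    // cleared when the run ends
    Thread, // persists across runs in the same thread
}

#[derive(Debug, Clone, Copy)]
enum MergeStrategy {
    Exclusive,
    Commutative,
}

trait StateKey {
    const KEY: &'static str;
    const SCOPE: KeyScope;
    const MERGE: MergeStrategy;
    type Value;
    type Update;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

// Run scope: transient per-run data, e.g. a step counter.
struct StepCounter;
impl StateKey for StepCounter {
    const KEY: &'static str = "demo.step_counter";
    const SCOPE: KeyScope = KeyScope::Run;
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut u32, update: u32) {
        *value += update; // increments commute
    }
}

// Thread scope: persisted data, e.g. user preferences.
struct UserPreferences;
impl StateKey for UserPreferences {
    const KEY: &'static str = "demo.user_prefs";
    const SCOPE: KeyScope = KeyScope::Thread;
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    type Value = String;
    type Update = String;
    fn apply(value: &mut String, update: String) {
        *value = update; // last write wins; conflicts are errors
    }
}

fn main() {
    let mut count = 0u32;
    StepCounter::apply(&mut count, 1);
    StepCounter::apply(&mut count, 1);
    assert_eq!(count, 2);
    assert_eq!(UserPreferences::SCOPE, KeyScope::Thread);
}
```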

How do I handle tool errors?

Return ToolResult::error(tool_name, message) from your tool’s execute method. The runtime writes the error result back to the conversation as a tool response message and continues the inference loop. The LLM sees the error and can retry or adjust its approach. For fatal errors that should stop the run, return a ToolError instead.
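A toy model of the recoverable-vs-fatal split: the real ToolResult and ToolError types live in awaken-contract and carry more fields, but the control-flow distinction is the same — an Ok(error result) feeds back into the conversation, an Err stops the run.

```rust
// Toy model of recoverable vs fatal tool failures. Real awaken types differ;
// only the Ok-vs-Err distinction is the point.

#[derive(Debug, PartialEq)]
enum ToolResult {
    Success(String),
    Error { tool: String, message: String },
}

#[derive(Debug)]
struct ToolError(#[allow(dead_code)] String); // fatal: stops the run

fn divide(tool: &str, a: f64, b: f64) -> Result<ToolResult, ToolError> {
    if b == 0.0 {
        // Recoverable: the LLM sees this message and can retry with new args.
        return Ok(ToolResult::Error {
            tool: tool.to_string(),
            message: "division by zero; pass a non-zero divisor".to_string(),
        });
    }
    if a.is_nan() {
        // Fatal: the run itself is broken, so abort.
        return Err(ToolError("corrupt arguments".to_string()));
    }
    Ok(ToolResult::Success((a / b).to_string()))
}

fn main() {
    assert_eq!(
        divide("divide", 6.0, 3.0).unwrap(),
        ToolResult::Success("2".to_string())
    );
    assert!(matches!(
        divide("divide", 1.0, 0.0).unwrap(),
        ToolResult::Error { .. }
    ));
    assert!(divide("divide", f64::NAN, 1.0).is_err());
}
```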

Can tools run in parallel?

Yes. Configure ToolExecutionMode in the agent spec. When set to parallel mode, the runtime executes independent tool calls concurrently. Results are collected and merged before proceeding to the next inference step.

How do I debug a run that is stuck?

Check RunStatus in state (__runtime.run_lifecycle key). If Waiting, look at __runtime.tool_call_states for pending decisions. If Running, check if max_rounds or timeout was hit. Enable observability plugin to get per-phase tracing.

How do I test without a real LLM?

Implement LlmExecutor with canned responses. See Testing Strategy for patterns.

What happens when parallel tools write to the same state key?

If the key uses MergeStrategy::Exclusive, the merge fails and hooks fall back to serial execution. Use MergeStrategy::Commutative for keys that support concurrent writes. See State and Snapshot Model.
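Why commutative keys tolerate parallel writes can be shown in a few lines: applying the same set of updates in any order yields the same value, so no ordering decision is needed. Exclusive keys have no such guarantee, which is why concurrent writes to them are rejected.

```rust
// Order-independence is the defining property of a Commutative merge:
// the same updates applied in any order produce the same final value.

fn apply_commutative(value: &mut u64, update: u64) {
    *value += update; // addition commutes
}

fn main() {
    let updates = [3u64, 5, 7];

    // Apply in forward order.
    let mut a = 0;
    for u in updates {
        apply_commutative(&mut a, u);
    }

    // Apply in reverse order, as parallel tools might effectively do.
    let mut b = 0;
    for u in updates.iter().rev() {
        apply_commutative(&mut b, *u);
    }

    assert_eq!(a, b); // same result either way
}
```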

How do request transforms work?

Plugins register InferenceRequestTransform via the registrar. Transforms modify the inference request (system prompt, tools, parameters) before it reaches the LLM. Only active plugins’ transforms apply. See Plugin Internals.

Can I write a custom storage backend?

Yes. Implement ThreadRunStore (for state persistence) and optionally MailboxStore (for HITL). See trait definitions in awaken-stores. File and PostgreSQL implementations serve as references.

How does context compaction work?

When autocompact_threshold: Option<usize> is set in ContextWindowPolicy, the CompactionPlugin monitors token usage. When the context exceeds that threshold, it finds a safe compaction boundary (where all tool call/result pairs are complete), summarizes older messages via LLM, and replaces them with a <conversation-summary> message. See Optimize Context Window.
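The "safe compaction boundary" step can be sketched as a prefix scan: walk the message list and pick the longest prefix in which every tool call has its matching result, so summarization never splits a call/result pair. This is a hypothetical reconstruction of the idea, not the actual CompactionPlugin code.

```rust
// Hypothetical sketch of finding a safe compaction boundary: the longest
// prefix with no dangling tool calls. The real CompactionPlugin may differ.
use std::collections::HashSet;

#[derive(Debug, Clone)]
enum Msg {
    User,
    Assistant,
    ToolCall(u32),   // carries a call id
    ToolResult(u32), // matches a call id
}

fn safe_boundary(msgs: &[Msg]) -> usize {
    let mut pending: HashSet<u32> = HashSet::new();
    let mut boundary = 0;
    for (i, m) in msgs.iter().enumerate() {
        match m {
            Msg::ToolCall(id) => {
                pending.insert(*id);
            }
            Msg::ToolResult(id) => {
                pending.remove(id);
            }
            _ => {}
        }
        if pending.is_empty() {
            boundary = i + 1; // prefix [0, boundary) is safe to summarize
        }
    }
    boundary
}

fn main() {
    let msgs = [
        Msg::User,
        Msg::ToolCall(1),
        Msg::ToolResult(1),
        Msg::Assistant,
        Msg::ToolCall(2), // result for call 2 not yet seen
    ];
    // Boundary stops before the open call, keeping the pair intact.
    assert_eq!(safe_boundary(&msgs), 4);
}
```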

How do I choose between AI SDK v6, AG-UI, and A2A protocols?

  • AI SDK v6: Best for React frontends using Vercel AI SDK. Supports text streaming, tool calls, and state snapshots.
  • AG-UI: Best for CopilotKit frontends. Supports generative UI components and agent collaboration.
  • A2A: Best for agent-to-agent communication. Used for delegate agents and inter-service orchestration.

All three run over HTTP/SSE. Choose based on your frontend framework.

Migrating from Tirea

Awaken is a ground-up rewrite of tirea, not an incremental upgrade. This guide maps tirea 0.5 concepts to their awaken equivalents for developers porting existing agents.

The tirea 0.5 source is archived on the tirea-0.5 branch.

Crate Mapping

| tirea 0.5 | awaken | Notes |
|---|---|---|
| tirea-state | awaken-contract | State engine merged into contract crate |
| tirea-state-derive | — | Removed; use StateKey trait directly |
| tirea-contract | awaken-contract | Merged types + traits into one crate |
| tirea-agentos | awaken-runtime | Renamed; same role (execution engine) |
| tirea-store-adapters | awaken-stores | Renamed |
| tirea-agentos-server | awaken-server | Renamed; protocols now inline |
| tirea-protocol-ai-sdk-v6 | awaken-server::protocols::ai_sdk_v6 | Merged into server crate |
| tirea-protocol-ag-ui | awaken-server::protocols::ag_ui | Merged into server crate |
| tirea-protocol-acp | awaken-server::protocols::acp | Merged into server crate |
| tirea-extension-permission | awaken-ext-permission | Renamed |
| tirea-extension-observability | awaken-ext-observability | Renamed |
| tirea-extension-skills | awaken-ext-skills | Renamed |
| tirea-extension-mcp | awaken-ext-mcp | Renamed |
| tirea-extension-handoff | awaken-runtime::extensions::handoff | Merged into runtime |
| tirea-extension-a2ui | awaken-ext-generative-ui | Renamed |
| tirea (facade) | awaken | Renamed |

Key structural change: tirea had 16 crates; awaken has 13. Protocol crates were merged into awaken-server, the state derive macro was removed, and tirea-state + tirea-contract were unified into awaken-contract.

State Model

tirea used a derive-macro-based state system. Awaken uses a trait-based approach.

```rust
// tirea 0.5
#[derive(StateSlot)]
#[state(key = "my_plugin.counter", scope = "run")]
struct Counter(u32);

// awaken
use awaken::prelude::*;

pub struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "my_plugin.counter";
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    type Value = u32;
    type Update = u32;

    fn apply(value: &mut u32, update: u32) {
        *value = update;
    }
}
```
| tirea concept | awaken equivalent |
|---|---|
| StateSlot derive | StateKey trait impl |
| ScopeDomain::Run | KeyScope::Run (default) |
| ScopeDomain::Thread | KeyScope::Thread |
| ScopeDomain::Global | Removed (use KeyScope::Thread + ProfileStore) |
| state.get::<T>() | snapshot.get::<T>() (via ctx.state::<T>() in hooks) |
| state.set(value) | cmd.update::<T>(value) (via StateCommand) |
| Mutable state access | Immutable snapshots + StateCommand returns |

Key change: State is never mutated directly. Hooks read from immutable snapshots and return StateCommand with updates. The runtime applies all updates atomically after the gather phase.

Action System

tirea used typed action enums per phase. Awaken uses ScheduledActionSpec with handlers.

```rust
// tirea 0.5
BeforeInferenceAction::AddContextMessage(
    ContextMessage::system("key", "content")
)

// awaken
use awaken::prelude::*;

let mut cmd = StateCommand::new();
cmd.schedule_action::<AddContextMessage>(
    ContextMessage::system("key", "content"),
)?;
```
| tirea action | awaken action | Notes |
|---|---|---|
| BeforeInferenceAction::AddContextMessage | AddContextMessage | Same semantics |
| BeforeInferenceAction::SetInferenceOverride | SetInferenceOverride | Same semantics |
| BeforeInferenceAction::ExcludeTool | ExcludeTool | Same semantics |
| BeforeToolExecuteAction::BlockTool | InterceptAction with Block payload | Unified intercept |
| BeforeToolExecuteAction::SuspendTool | InterceptAction with Suspend payload | Unified intercept |
| AfterToolExecuteAction::AddMessage | AddContextMessage | Generalized |

See Scheduled Actions for the full list.

Plugin Trait

```rust
// tirea 0.5
impl Extension for MyPlugin {
    fn name(&self) -> &str { "my_plugin" }
    fn register(&self, ctx: &mut ExtensionContext) -> Result<()> {
        ctx.add_hook(Phase::BeforeInference, MyHook);
        Ok(())
    }
}

// awaken
impl Plugin for MyPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "my_plugin" }
    }
    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_phase_hook("my_plugin", Phase::BeforeInference, MyHook)?;
        r.register_key::<MyState>(StateKeyOptions::default())?;
        Ok(())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
| Extension trait | Plugin trait | Renamed |
| ExtensionContext | PluginRegistrar | More structured; explicit key/tool/hook registration |
| ctx.add_hook() | r.register_phase_hook() | Requires plugin_id parameter |
| — | r.register_key::<T>() | New: state keys must be declared at registration |
| — | r.register_tool() | New: plugin-scoped tools |
| — | r.register_scheduled_action::<A, _>(handler) | New: custom action handlers |

Tool Trait

```rust
// tirea 0.5
#[async_trait]
impl TypedTool for MyTool {
    type Args = MyArgs;
    async fn call(&self, args: Self::Args, ctx: ToolContext) -> ToolOutput {
        ToolOutput::success(json!({"result": 42}))
    }
}

// awaken
#[async_trait]
impl Tool for MyTool {
    fn descriptor(&self) -> ToolDescriptor { ... }
    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("my_tool", json!({"result": 42})).into())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
| TypedTool with type Args | Tool with Value args | No generic Args; validate in execute() |
| ToolOutput (direct return) | ToolOutput (result + optional StateCommand) | New: tools can schedule actions |
| ToolContext | ToolCallContext | Renamed; adds state(), profile_access() |

Runtime Builder

```rust
// tirea 0.5
let runtime = AgentOsBuilder::new()
    .agent(agent_spec)
    .tool("search", SearchTool)
    .extension(PermissionExtension::new(rules))
    .build()?;

// awaken
let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("search", Arc::new(SearchTool))
    .with_plugin("permissions", Arc::new(PermissionPlugin::new(rules)))
    .build()?;
```
| tirea | awaken | Notes |
|---|---|---|
| AgentOsBuilder | AgentRuntimeBuilder | Renamed |
| .agent(spec) | .with_agent_spec(spec) | Consistent with_ prefix |
| .tool(id, impl) | .with_tool(id, Arc<dyn Tool>) | Explicit Arc wrapping |
| .extension(ext) | .with_plugin(id, Arc<dyn Plugin>) | Renamed; requires ID |
| .build() | .build() | Same |

Concepts Removed

The following tirea concepts have no awaken equivalent:

| tirea concept | Reason removed |
|---|---|
| StateSlot derive macro | Trait impl is simpler and doesn’t require proc-macro |
| Global scope | Thread scope + ProfileStore covers this |
| RuntimeEffect | Replaced by StateCommand effects |
| EffectLog / ScheduledActionLog | Replaced by tracing |
| ConfigStore / ConfigSlot | Replaced by AgentSpec sections |
| AgentProfile | Merged into AgentSpec |
| ExtensionContext live activation | Replaced by static plugin_ids on AgentSpec |

Concepts Added

| awaken concept | Description |
|---|---|
| PluginRegistrar | Structured registration (keys, tools, hooks, actions) |
| ToolOutput with StateCommand | Tools can schedule actions as side-effects |
| ToolInterceptAction | Unified Block/Suspend/SetResult pipeline |
| CircuitBreaker | Per-model LLM failure protection |
| Mailbox | Durable job queue with lease-based claim |
| EventReplayBuffer | SSE reconnection with frame replay |
| DeferredToolsPlugin | Lazy tool loading with probability model |
| ProfileStore | Cross-session persistent state |

Quick Checklist

  • Replace tirea dependency with awaken in Cargo.toml
  • Replace use tirea::* with use awaken::prelude::*
  • Convert #[derive(StateSlot)] to impl StateKey for ...
  • Convert Extension impls to Plugin impls with PluginRegistrar
  • Convert TypedTool impls to Tool impls
  • Replace action enum variants with cmd.schedule_action::<ActionType>(...)
  • Replace AgentOsBuilder with AgentRuntimeBuilder
  • Update store imports: tirea_store_adapters → awaken_stores
  • Update server imports: protocol crates → awaken_server::protocols::*