Introduction

Awaken is a modular AI agent runtime framework built in Rust. It provides:

  • phase-based execution with snapshot isolation and deterministic replay
  • a typed state engine with key scoping (thread / run) and merge strategies (exclusive / commutative)
  • a plugin lifecycle system for extensibility
  • a multi-protocol server surface supporting AI SDK v6, AG-UI, A2A, and MCP over HTTP and stdio, plus ACP over stdio

Crate Overview

Crate – Description
awaken-contract – Core contracts: types, traits, state model, agent specs
awaken-runtime – Execution engine: phase loop, plugin system, agent loop, builder
awaken-server – HTTP/SSE gateway with protocol adapters
awaken-stores – Storage backends: memory, file, postgres
awaken-tool-pattern – Glob/regex tool matching for permission and reminder rules
awaken-ext-permission – Permission plugin with allow/deny/ask policies
awaken-ext-observability – OpenTelemetry-based LLM and tool call tracing
awaken-ext-mcp – Model Context Protocol client integration
awaken-ext-skills – Skill package discovery and activation
awaken-ext-reminder – Declarative reminder rules triggered after tool execution
awaken-ext-generative-ui – Declarative UI components (A2UI protocol)
awaken-ext-deferred-tools – Deferred tool loading with probabilistic promotion
awaken – Facade crate that re-exports core modules

Architecture

Application code
  registers tools / models / providers / plugins / agent specs
        |
        v
AgentRuntime
  resolves AgentSpec -> ResolvedAgent
  builds ExecutionEnv from plugins
  runs the phase loop and exposes cancel / decision control
        |
        v
Server + storage surfaces
  HTTP routes, SSE replay, mailbox, protocol adapters, thread/run persistence

Core Principle

All state access follows snapshot isolation. Phase hooks see an immutable snapshot; mutations are collected in a MutationBatch and applied atomically after convergence.
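The discipline can be illustrated with plain std types. The sketch below is a simplified stand-in: `Snapshot`, `MutationBatch`, and `apply` here are toy names for illustration, not the runtime's actual API.

```rust
use std::collections::HashMap;

// Toy stand-in for the state map a phase hook may read.
#[derive(Clone)]
struct Snapshot(HashMap<String, i64>);

// Toy stand-in for a MutationBatch: hooks record intended updates
// instead of mutating the snapshot they read from.
struct MutationBatch(Vec<(String, i64)>);

impl MutationBatch {
    fn new() -> Self {
        MutationBatch(Vec::new())
    }
    fn update(&mut self, key: &str, delta: i64) {
        self.0.push((key.to_string(), delta));
    }
}

// All collected mutations are applied in one step after the hooks converge.
fn apply(mut state: HashMap<String, i64>, batch: MutationBatch) -> HashMap<String, i64> {
    for (key, delta) in batch.0 {
        *state.entry(key).or_insert(0) += delta;
    }
    state
}

fn main() {
    let state: HashMap<String, i64> = HashMap::new();
    let snapshot = Snapshot(state.clone()); // hooks only ever read this copy

    let mut batch = MutationBatch::new();
    batch.update("greet_count", 1);
    batch.update("greet_count", 1);
    assert_eq!(snapshot.0.get("greet_count"), None); // snapshot stays immutable

    let state = apply(state, batch); // atomic application after convergence
    assert_eq!(state.get("greet_count"), Some(&2));
    println!("greet_count = {}", state["greet_count"]);
}
```

Because hooks only record updates, two hooks can run against the same snapshot without observing each other's writes.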

What’s in This Book

  • Get Started — build a working mental model with the smallest runnable flows
  • Build Agents — add tools, plugins, MCP, skills, reminders, handoff, and UI capabilities
  • Serve & Integrate — expose HTTP endpoints and wire AI SDK or CopilotKit frontends
  • State & Storage — choose persistence, context shaping, and state lookup patterns
  • Operate — harden runtime behavior with observability, permissions, progress reporting, and tests
  • Reference — API, protocol, config, and schema lookup pages
  • Architecture — runtime layering, phase execution, and design tradeoffs

If you are new to the repository, use this order:

  1. Start with Get Started and complete First Agent.
  2. Move to Build Agents when you are ready to add tools and plugins.
  3. Use Serve & Integrate when the runtime needs to talk to HTTP clients or frontends.
  4. Use State & Storage and Operate as you move from demos to production behavior.
  5. Keep Reference Overview and Architecture open when you need exact contracts or runtime internals.

Repository Map

These paths matter most when you move from docs into code:

Path – Purpose
crates/awaken-contract/ – Core contracts: tools, events, state interfaces
crates/awaken-runtime/ – Agent runtime: execution engine, plugins, builder
crates/awaken-server/ – HTTP/SSE server surfaces
crates/awaken-stores/ – Storage backends
crates/awaken/examples/ – Small runtime examples
examples/src/ – Full-stack server examples
docs/book/src/ – This documentation source

Get Started

Use this path if you are new to Awaken and want a working mental model before wiring production features.

Read in order

  1. First Agent for the smallest runnable runtime.
  2. First Tool to understand tool schemas, execution, and state writes.
  3. Build an Agent when you want a reusable project baseline.
  4. Tool Trait before writing production tools.

Leave this path when

  • You can run an agent end-to-end and want to add tools, plugins, or protocol integrations; continue with Build Agents.

First Agent

Goal

Run one agent end-to-end and confirm you receive a complete event stream.

Prerequisites

[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"

Set one model provider key before running:

# OpenAI-compatible models (for gpt-4o-mini)
export OPENAI_API_KEY=<your-key>

# Or DeepSeek models
export DEEPSEEK_API_KEY=<your-key>

1. Create src/main.rs

use std::sync::Arc;
use serde_json::{json, Value};
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::contract::message::Message;
use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::AgentSpec;
use awaken::registry::ModelEntry;
use awaken::{AgentRuntimeBuilder, RunRequest};

struct EchoTool;

#[async_trait]
impl Tool for EchoTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("echo", "Echo", "Echo input back to the caller")
            .with_parameters(json!({
                "type": "object",
                "properties": { "text": { "type": "string" } },
                "required": ["text"]
            }))
    }

    async fn execute(
        &self,
        args: Value,
        _ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let text = args["text"].as_str().unwrap_or_default();
        Ok(ToolResult::success("echo", json!({ "echoed": text })).into())
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let agent_spec = AgentSpec::new("assistant")
        .with_model("gpt-4o-mini")
        .with_system_prompt("You are a helpful assistant. Use the echo tool when asked.")
        .with_max_rounds(5);

    let runtime = AgentRuntimeBuilder::new()
        .with_agent_spec(agent_spec)
        .with_tool("echo", Arc::new(EchoTool))
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model("gpt-4o-mini", ModelEntry {
            provider: "openai".into(),
            model_name: "gpt-4o-mini".into(),
        })
        .build()?;

    let request = RunRequest::new(
        "thread-1",
        vec![Message::user("Say hello using the echo tool")],
    )
    .with_agent_id("assistant");

    let sink = Arc::new(VecEventSink::new());
    runtime.run(request, sink.clone()).await?;

    let events = sink.take();
    println!("events: {}", events.len());

    let finished = events
        .iter()
        .any(|e| matches!(e, AgentEvent::RunFinish { .. }));
    println!("run_finish_seen: {}", finished);

    Ok(())
}

2. Run

cargo run

3. Verify

Expected output includes:

  • events: <n> where n > 0
  • run_finish_seen: true

The event stream will contain at least RunStart, one or more TextDelta or ToolCallStart/ToolCallDone events, and a final RunFinish.
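The same check can be run against a simplified stand-in enum. The real AgentEvent variants carry IDs and payloads beyond what is shown here; this is only an illustration of the matching pattern.

```rust
// Simplified stand-in for AgentEvent; the real enum carries more fields.
#[derive(Debug, PartialEq)]
enum Event {
    RunStart,
    TextDelta(String),
    ToolCallStart(String),
    ToolCallDone(String),
    RunFinish,
}

// Same shape as the matches! check in main.rs above.
fn run_finished(events: &[Event]) -> bool {
    events.iter().any(|e| matches!(e, Event::RunFinish))
}

fn main() {
    let events = vec![
        Event::RunStart,
        Event::ToolCallStart("echo".into()),
        Event::ToolCallDone("echo".into()),
        Event::TextDelta("Hello!".into()),
        Event::RunFinish,
    ];
    assert!(run_finished(&events));
    println!("run_finish_seen: {}", run_finished(&events));
}
```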

What You Created

This example creates an in-process AgentRuntime and runs one request immediately.

The core object is:

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("echo", Arc::new(EchoTool))
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model("gpt-4o-mini", ModelEntry {
        provider: "openai".into(),
        model_name: "gpt-4o-mini".into(),
    })
    .build()?;

After that, the normal entry point is:

let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
let events = sink.take();

Common usage patterns:

  • one-shot CLI program: construct RunRequest, collect events via VecEventSink, print result
  • application service: wrap runtime.run(...) inside your own app logic
  • HTTP server: store Arc<AgentRuntime> in app state and expose protocol routes

Use the next page based on what you want:

Common Errors

  • Model/provider mismatch: gpt-4o-mini requires a compatible OpenAI-style provider setup.
  • Missing key: set OPENAI_API_KEY or DEEPSEEK_API_KEY before cargo run.
  • Tool not selected: ensure the prompt explicitly asks to use echo.
  • No RunFinish event: check that with_max_rounds is set high enough for the model to complete.

Next

First Tool

Goal

Implement one tool that reads typed state from ToolCallContext during execution.

State is optional. Many tools (API calls, search, shell commands) don’t need state – just implement execute and return a ToolResult.

Prerequisites

[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"

1. Define a StateKey

A StateKey describes one named slot in the state map. It declares the value type, how updates are applied, and the lifetime scope.

use awaken::{StateKey, KeyScope, MergeStrategy};

/// Tracks how many times the greeting tool has been called.
struct GreetCount;

impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;

    type Value = u32;
    type Update = u32;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}
fn main() {}

Key choices:

  • KeyScope::Run – state resets at the start of each run. Use KeyScope::Thread to persist across runs.
  • MergeStrategy::Commutative – safe for concurrent updates. Use Exclusive when only one writer is expected.
  • apply defines how an Update modifies the current Value. Here it increments.
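Why Commutative is safe for concurrent updates can be checked in isolation: with the additive apply above, the order in which updates arrive does not change the merged value. This is a plain-Rust illustration, not runtime code.

```rust
// The same apply logic as GreetCount: additive updates commute.
fn apply(value: &mut u32, update: u32) {
    *value += update;
}

fn main() {
    // Two writers each contribute an update; order does not matter.
    let mut a = 0u32;
    apply(&mut a, 2);
    apply(&mut a, 5);

    let mut b = 0u32;
    apply(&mut b, 5);
    apply(&mut b, 2);

    assert_eq!(a, b); // both orders converge to 7
    println!("merged value: {a}");
}
```

An Exclusive key, by contrast, assumes a single writer, so last-write-wins semantics are acceptable there.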

2. Implement the Tool

The tool reads the current count via ctx.state::<GreetCount>() and returns a personalized greeting.

use awaken::{StateKey, KeyScope, MergeStrategy};
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};

struct GreetTool;

#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "name": { "type": "string", "description": "Name to greet" }
                },
                "required": ["name"]
            }))
    }

    fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
        args["name"]
            .as_str()
            .filter(|s| !s.is_empty())
            .ok_or_else(|| ToolError::InvalidArguments("name is required".into()))?;
        Ok(())
    }

    async fn execute(
        &self,
        args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let name = args["name"].as_str().unwrap_or("world");

        // Read state -- returns None if the key has not been set yet.
        let count = ctx.state::<GreetCount>().copied().unwrap_or(0);

        Ok(ToolResult::success("greet", json!({
            "greeting": format!("Hello, {}!", name),
            "times_greeted": count,
        })).into())
    }
}
fn main() {}

3. Register the Tool

use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
    }
    async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("greet", json!({})).into())
    }
}
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_spec = AgentSpec::new("assistant")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant. Use the greet tool when asked.")
    .with_max_rounds(5);

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("greet", Arc::new(GreetTool))
    .build()?;
Ok(())
}

4. Run

use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;
struct GreetCount;
impl StateKey for GreetCount {
    const KEY: &'static str = "greet_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "Greet", "Greet a user by name")
    }
    async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("greet", json!({})).into())
    }
}
use awaken::contract::message::Message;
use awaken::contract::event_sink::VecEventSink;
use awaken::RunRequest;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(AgentSpec::new("assistant").with_model("gpt-4o-mini"))
    .with_tool("greet", Arc::new(GreetTool))
    .build()?;
let request = RunRequest::new(
    "thread-1",
    vec![Message::user("Greet Alice")],
)
.with_agent_id("assistant");

let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
Ok(())
}

5. Verify

Check the collected events for a ToolCallDone event (greet is the only registered tool, so any ToolCallDone comes from it):

use awaken::contract::event::AgentEvent;

let events = sink.take();
let tool_done = events
    .iter()
    .any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
println!("tool_call_done_seen: {}", tool_done);

Expected:

  • tool_call_done_seen: true
  • The result inside ToolCallDone contains greeting and times_greeted fields.

What You Created

A tool that:

  1. Declares a JSON Schema for its arguments via descriptor().
  2. Validates arguments before execution via validate_args().
  3. Reads typed state from the snapshot via ctx.state::<K>().
  4. Returns structured JSON via ToolResult::success().

The StateKey trait gives you type-safe, scoped state without raw JSON manipulation.

Common Errors

  • ctx.state::<K>() returns None: the state key has not been written yet in this run. Use .unwrap_or_default() or .copied().unwrap_or(0) for numeric defaults.
  • StateError::KeyEncode / StateError::KeyDecode: the Value type does not round-trip through JSON. Ensure Serialize and Deserialize are derived correctly.
  • ToolError::InvalidArguments not surfaced: validate_args is called before execute by the runtime. If you skip validation, bad input reaches execute and may panic on .unwrap().
  • Scope mismatch: KeyScope::Run state is cleared between runs. If you expect persistence, use KeyScope::Thread.
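The scope distinction in the last bullet can be modeled in a few lines of std-only Rust. `State`, `begin_run`, and the scope tags below are a toy model, not the real state engine.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq)]
enum KeyScope {
    Run,
    Thread,
}

// Toy state map tagging each key with its scope.
struct State(HashMap<&'static str, (KeyScope, u32)>);

impl State {
    // At the start of each run, run-scoped entries are dropped;
    // thread-scoped entries survive across runs.
    fn begin_run(&mut self) {
        self.0.retain(|_, (scope, _)| *scope == KeyScope::Thread);
    }
}

fn main() {
    let mut state = State(HashMap::new());
    state.0.insert("greet_count", (KeyScope::Run, 3));
    state.0.insert("user_prefs", (KeyScope::Thread, 1));

    state.begin_run();
    assert!(!state.0.contains_key("greet_count")); // Run scope cleared
    assert!(state.0.contains_key("user_prefs")); // Thread scope persists
    println!("surviving keys: {}", state.0.len());
}
```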

Build Agents

This path is for composing agent behavior after you understand the basics.

  1. Build an Agent to define the runtime, model registry, and agent spec.
  2. Add a Tool and Add a Plugin to extend behavior safely.
  3. Add discovery and delegation layers with Use Skills Subsystem, Use MCP Tools, and Use Agent Handoff.
  4. Add specialized behavior with Use Reminder Plugin, Use Generative UI, and Use Deferred Tools.

Keep nearby

Build an Agent

Use this when you need to assemble an agent with tools, persistence, and a provider into a running AgentRuntime.

Prerequisites

  • awaken crate added to Cargo.toml
  • An LlmExecutor implementation (provider) available
  • Familiarity with AgentSpec and AgentRuntimeBuilder

Steps

  1. Define the agent spec.
use awaken::engine::GenaiExecutor;
use awaken::{AgentSpec, AgentRuntimeBuilder, ModelSpec};

let spec = AgentSpec::new("assistant")
    .with_model("claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_max_rounds(10);
  2. Register tools.
use std::sync::Arc;

let builder = AgentRuntimeBuilder::new()
    .with_agent_spec(spec)
    .with_tool("search", Arc::new(SearchTool))
    .with_tool("calculator", Arc::new(CalculatorTool));
  3. Register a provider and a model.
let builder = builder
    .with_provider("anthropic", Arc::new(GenaiExecutor::new()))
    .with_model("claude-sonnet", ModelSpec {
        id: "claude-sonnet".into(),
        provider: "anthropic".into(),
        model: "claude-sonnet-4-20250514".into(),
    });
  4. Attach persistence.
use awaken::stores::InMemoryStore;

let store = Arc::new(InMemoryStore::new());
let builder = builder.with_thread_run_store(store);
  5. Build and validate.
let runtime = builder.build()?;

build resolves every registered agent and catches missing models, providers, or plugins at startup rather than at request time.
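The fail-fast idea behind build can be sketched as a plain lookup check. `validate` and its arguments below are hypothetical names used only to illustrate the principle, not the builder's internals.

```rust
use std::collections::HashSet;

// Hypothetical sketch: every model an agent spec references must have
// been registered, or construction fails at startup rather than per request.
fn validate(agent_models: &[&str], registered: &HashSet<&str>) -> Result<(), String> {
    for model in agent_models {
        if !registered.contains(model) {
            return Err(format!("agent references unregistered model '{model}'"));
        }
    }
    Ok(())
}

fn main() {
    let registered: HashSet<&str> = ["claude-sonnet"].into_iter().collect();
    // Registered model resolves fine.
    assert!(validate(&["claude-sonnet"], &registered).is_ok());
    // A missing model is caught immediately, not at request time.
    assert!(validate(&["gpt-4o-mini"], &registered).is_err());
}
```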

  6. Execute a run.
use std::sync::Arc;
use awaken::RunRequest;
use awaken::contract::event_sink::VecEventSink;

let request = RunRequest::new("thread-1", vec![user_message])
    .with_agent_id("assistant");

let sink = Arc::new(VecEventSink::new());
let handle = runtime.run(request, sink.clone()).await?;

Verify

Call the /health endpoint (if using the server feature) or inspect the returned AgentRunResult to confirm the agent loop completed without errors.

Common Errors

Error – Cause – Fix
BuildError::ValidationFailed – Agent spec references a model or provider not registered in the builder – Register the missing model/provider before calling build
BuildError::State – Duplicate state key registration across plugins – Ensure each StateKey is registered by exactly one plugin
RuntimeError at run time – Provider returns an inference error – Check provider credentials and model ID

Examples

examples/src/research/ – a research agent with search and report-writing tools.

Key Files

  • crates/awaken-runtime/src/builder.rs – AgentRuntimeBuilder
  • crates/awaken-contract/src/registry_spec.rs – AgentSpec
  • crates/awaken-runtime/src/runtime/agent_runtime/mod.rs – AgentRuntime

Add a Tool

Use this when you need to expose a custom capability to the agent by implementing the Tool trait.

Prerequisites

  • awaken crate added to Cargo.toml
  • async-trait and serde_json available

Steps

  1. Implement the Tool trait.
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};

async fn fetch_weather(_city: &str) -> Result<String, ToolError> {
    Ok("Sunny, 22°C".to_string())
}

pub struct WeatherTool;

#[async_trait]
impl Tool for WeatherTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("get_weather", "Get Weather", "Fetch current weather for a city")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["city"]
            }))
    }

    async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        let city = args["city"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("Missing 'city'".into()))?;

        let weather = fetch_weather(city).await?;

        Ok(ToolResult::success("get_weather", json!({ "forecast": weather })).into())
    }
}
}
  2. Optionally override argument validation.
fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
    if !args.get("city").and_then(|v| v.as_str()).is_some_and(|s| !s.is_empty()) {
        return Err(ToolError::InvalidArguments("'city' must be a non-empty string".into()));
    }
    Ok(())
}

validate_args runs before execute and lets you reject malformed input early.

  3. Register the tool with the builder.
use std::sync::Arc;
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_tool("get_weather", Arc::new(WeatherTool))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

The string ID passed to with_tool must match the id in ToolDescriptor::new.

  4. Register via a plugin (alternative).

    Tools can also be registered inside a Plugin::register method through the PluginRegistrar:

fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
    registrar.register_tool("get_weather", Arc::new(WeatherTool))?;
    Ok(())
}

Plugin-registered tools are scoped to agents that activate that plugin.

Verify

Send a message that should trigger the tool. Inspect the run result to confirm the tool was called and returned the expected output.

Common Errors

Error – Cause – Fix
ToolError::InvalidArguments – The LLM passed malformed JSON – Tighten the JSON Schema in with_parameters to guide the model
Tool never called – Descriptor id does not match the registered ID – Ensure the ID in ToolDescriptor::new and with_tool are identical
ToolError::ExecutionFailed – Runtime error inside execute – Return a descriptive error; the agent will see it and may retry

Examples

examples/src/research/tools.rs – SearchTool and WriteReportTool implementations.

Key Files

  • crates/awaken-contract/src/contract/tool.rs – Tool trait, ToolDescriptor, ToolResult, ToolError
  • crates/awaken-runtime/src/builder.rs – with_tool registration

Add a Plugin

Use this when you need to extend the agent lifecycle with state keys, phase hooks, scheduled actions, or effect handlers.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Phase variants and StateKey

Steps

  1. Define a state key.
#![allow(unused)]
fn main() {
use awaken::{StateKey, MergeStrategy, StateError, JsonValue};
use serde::{Serialize, Deserialize};

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AuditLog {
    pub entries: Vec<String>,
}

pub struct AuditLogKey;

impl StateKey for AuditLogKey {
    type Value = AuditLog;
    const KEY: &'static str = "audit_log";
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;

    type Update = AuditLog;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value = update;
    }

    fn encode(value: &Self::Value) -> Result<JsonValue, StateError> {
        serde_json::to_value(value).map_err(|e| StateError::KeyEncode { key: Self::KEY.into(), message: e.to_string() })
    }

    fn decode(json: JsonValue) -> Result<Self::Value, StateError> {
        serde_json::from_value(json).map_err(|e| StateError::KeyDecode { key: Self::KEY.into(), message: e.to_string() })
    }
}
}
  2. Implement a phase hook.
use async_trait::async_trait;
use awaken::{PhaseHook, PhaseContext, StateCommand, StateError};

pub struct AuditHook;

#[async_trait]
impl PhaseHook for AuditHook {
    async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
        let mut log = ctx.state::<AuditLogKey>().cloned().unwrap_or_default();
        log.entries.push(format!("Phase executed at {:?}", ctx.phase));
        let mut cmd = StateCommand::new();
        cmd.update::<AuditLogKey>(log);
        Ok(cmd)
    }
}
  3. Implement the Plugin trait.
use awaken::{Plugin, PluginDescriptor, PluginRegistrar, Phase, StateError, StateKeyOptions, KeyScope};

pub struct AuditPlugin;

impl Plugin for AuditPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "audit" }
    }

    fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
        registrar.register_key::<AuditLogKey>(StateKeyOptions {
            scope: KeyScope::Run,
            ..Default::default()
        })?;

        registrar.register_phase_hook(
            "audit",
            Phase::PostInference,
            AuditHook,
        )?;

        Ok(())
    }
}
  4. Register the plugin and activate it on an agent.
use std::sync::Arc;
use awaken::{AgentSpec, AgentRuntimeBuilder};

let spec = AgentSpec::new("assistant")
    .with_model("anthropic/claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("audit");

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("audit", Arc::new(AuditPlugin))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

with_hook_filter on the agent spec activates the named plugin for that agent.

Verify

Run the agent and inspect the state snapshot. The audit_log key should contain entries added by the hook after each inference phase.

Common Errors

Error – Cause – Fix
StateError::KeyAlreadyRegistered – Two plugins register the same StateKey – Use a unique KEY constant per state key
StateError::UnknownKey – Accessing a key that was never registered – Ensure the plugin calling register_key is activated on the agent
Hook not firing – Plugin ID not listed in with_hook_filter – Add the plugin ID to the agent spec's hook filters

Examples

crates/awaken-ext-observability/ – the built-in observability plugin registers phase hooks and state keys.

Key Files

  • crates/awaken-runtime/src/plugins/lifecycle.rs – Plugin trait
  • crates/awaken-runtime/src/plugins/registry.rs – PluginRegistrar
  • crates/awaken-runtime/src/hooks/phase_hook.rs – PhaseHook trait

Use Skills Subsystem

Use this when you want agents to discover and activate skill packages at runtime, loading instructions, resources, and scripts on demand.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature skills enabled on the awaken crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["skills"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Create skill directories.

Each skill lives in a directory with a SKILL.md file containing YAML frontmatter:

skills/
  my-skill/
    SKILL.md
  another-skill/
    SKILL.md

Example skills/my-skill/SKILL.md:

---
name: my-skill
description: Helps with data analysis tasks
allowed-tools: read_file web_search
---
# My Skill

When this skill is active, focus on data analysis.
Use read_file to load datasets and web_search for reference material.
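The frontmatter layout above can be split from the body with a few lines of std-only Rust. This is an illustrative sketch, not the crate's actual parser.

```rust
// Split "---\n<yaml>\n---\n<body>" into (frontmatter, body).
// Illustrative only; the skills crate has its own parser.
fn split_frontmatter(input: &str) -> Option<(&str, &str)> {
    let rest = input.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    Some((&rest[..end], &rest[end + 5..]))
}

fn main() {
    let skill_md = "---\nname: my-skill\ndescription: Helps with data analysis tasks\n---\n# My Skill\n";
    let (front, body) = split_frontmatter(skill_md).expect("frontmatter present");
    assert!(front.contains("name: my-skill"));
    assert!(body.starts_with("# My Skill"));
    println!("frontmatter:\n{front}");
}
```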
  2. Discover filesystem skills.
use std::sync::Arc;
use awaken::ext_skills::{FsSkill, InMemorySkillRegistry};

let result = FsSkill::discover("./skills").expect("skill discovery failed");
for warning in &result.warnings {
    eprintln!("skill warning: {warning:?}");
}

let skills = FsSkill::into_arc_skills(result.skills);
let registry = Arc::new(InMemorySkillRegistry::from_skills(skills));
  3. Use embedded skills (alternative).

    For compile-time embedded skills, use EmbeddedSkill:

use std::sync::Arc;
use awaken::ext_skills::{EmbeddedSkill, EmbeddedSkillData, InMemorySkillRegistry};

const SKILL_MD: &str = "\
---
name: builtin-skill
description: A compiled-in skill
allowed-tools: read_file
---
Builtin Skill

Follow these instructions when active.
";

let skill = EmbeddedSkill::new(&EmbeddedSkillData {
    skill_md: SKILL_MD,
    references: &[],
    assets: &[],
}).expect("valid skill");

let registry = Arc::new(InMemorySkillRegistry::from_skills(vec![Arc::new(skill)]));
  4. Wire into the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_skills::SkillSubsystem;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let subsystem = SkillSubsystem::new(registry);
let agent_spec = AgentSpec::new("skills-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Discover and activate skills when specialized help is useful.")
    .with_hook_filter("skills-discovery")
    .with_hook_filter("skills-active-instructions");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin(
        "skills-discovery",
        Arc::new(subsystem.discovery_plugin()) as Arc<dyn Plugin>,
    )
    .with_plugin(
        "skills-active-instructions",
        Arc::new(subsystem.active_instructions_plugin()) as Arc<dyn Plugin>,
    )
    .build()
    .expect("failed to build runtime");

The SkillDiscoveryPlugin injects a skills catalog into the LLM context before inference and registers three tools:

Tool – Purpose
skill – Activate a skill by name
load_skill_resource – Load a skill resource or reference
skill_script – Run a skill script

Verify

  1. Run the agent and ask it to list available skills.
  2. The LLM should see the skills catalog in its context and use the skill tool to activate one.
  3. After activation, the ActiveSkillInstructionsPlugin injects the skill instructions into subsequent inference calls.

Common Errors

Symptom – Cause – Fix
No skills discovered – Wrong directory path – Ensure each skill has a SKILL.md in a subdirectory
SkillMaterializeError – Invalid YAML frontmatter – Check that name and description fields are present in SKILL.md
Skill tools not available – Plugin not registered – Register both discovery_plugin() and active_instructions_plugin()
Feature not found – Missing cargo feature – Enable features = ["skills"] in Cargo.toml
Examples

  • crates/awaken-ext-skills/tests/

Key Files

Path – Purpose
crates/awaken-ext-skills/src/lib.rs – Module root and public re-exports
crates/awaken-ext-skills/src/registry.rs – FsSkill, InMemorySkillRegistry, SkillRegistry trait
crates/awaken-ext-skills/src/plugin/subsystem.rs – SkillSubsystem facade
crates/awaken-ext-skills/src/plugin/discovery.rs – SkillDiscoveryPlugin
crates/awaken-ext-skills/src/embedded.rs – EmbeddedSkill for compile-time skills
crates/awaken-ext-skills/src/tools.rs – Skill tool implementations

Use MCP Tools

Use this when you want to connect to external Model Context Protocol (MCP) servers and expose their tools to awaken agents.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature mcp enabled on the awaken crate
  • An MCP server to connect to (stdio or HTTP transport)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["mcp"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Configure MCP server connections.
use awaken::ext_mcp::McpServerConnectionConfig;

// Stdio transport: launch a child process
let stdio_config = McpServerConnectionConfig::stdio(
    "my-mcp-server",          // server name
    "node",                    // command
    vec!["server.js".into()],  // args
);

// HTTP/SSE transport: connect to a running server
let http_config = McpServerConnectionConfig::http(
    "remote-server",
    "http://localhost:8080/sse",
);
  2. Create the registry manager and discover tools.
use awaken::ext_mcp::McpToolRegistryManager;

let manager = McpToolRegistryManager::connect(vec![stdio_config, http_config])
    .await
    .expect("failed to connect MCP servers");

// Tools are now available as awaken Tool instances.
// Each MCP tool is registered with the ID format: mcp__{server}__{tool}
let registry = manager.registry();
for id in registry.ids() {
    println!("discovered: {id}");
}
  3. Register tools with the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_mcp::McpPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let agent_spec = AgentSpec::new("mcp-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Use MCP tools when they help answer the user.")
    .with_hook_filter("mcp");

let mut builder = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("mcp", Arc::new(McpPlugin) as Arc<dyn Plugin>);

let registry = manager.registry();
for (id, tool) in registry.snapshot() {
    builder = builder.with_tool(&id, tool);
}

let runtime = builder.build().expect("failed to build runtime");
  4. Enable periodic refresh (optional).

    MCP servers may add or remove tools at runtime. Enable periodic refresh to keep the tool registry in sync:

use std::time::Duration;

manager.start_periodic_refresh(Duration::from_secs(60));

Verify

  1. Run the agent and ask it to use a tool provided by the MCP server.
  2. Check the backend logs for MCP tool call events.
  3. Confirm the tool result includes mcp.server and mcp.tool metadata in the response.

Common Errors

Symptom | Cause | Fix
McpError::TransportError | MCP server not running or unreachable | Verify the server process is running and the path/URL is correct
No tools discovered | Server returned empty tool list | Check the MCP server implements tools/list
Tool call timeout | Server too slow to respond | Increase timeout in the transport configuration
Feature not found | Missing cargo feature | Enable features = ["mcp"] in Cargo.toml
mcp__server__tool not found | Tools not registered with builder | Loop over manager.registry().snapshot() and call with_tool for each
  • crates/awaken-ext-mcp/tests/

Key Files

Path | Purpose
crates/awaken-ext-mcp/src/lib.rs | Module root and public re-exports
crates/awaken-ext-mcp/src/manager.rs | McpToolRegistryManager lifecycle and tool wrapping
crates/awaken-ext-mcp/src/config.rs | McpServerConnectionConfig transport types
crates/awaken-ext-mcp/src/plugin.rs | McpPlugin integration with awaken plugin system
crates/awaken-ext-mcp/src/transport.rs | McpToolTransport trait and transport helpers
crates/awaken-ext-mcp/tests/mcp_tests.rs | Integration tests

Use the Reminder Plugin

Use this when you want the agent to receive automatic context messages after tool execution based on pattern matching – for example, reminding the agent to run cargo check after editing a .toml file, or warning about destructive commands.

Prerequisites

  • A working awaken agent runtime (see Build an Agent)
  • Feature reminder enabled on the awaken crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["reminder"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the reminder plugin with rules.
use std::sync::Arc;
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken::ext_reminder::{ReminderPlugin, ReminderRulesConfig};

let json = r#"{
    "rules": [
        {
            "tool": "Bash(command ~ 'rm *')",
            "output": { "status": "success" },
            "message": {
                "target": "suffix_system",
                "content": "A deletion command just succeeded. Verify the result."
            }
        }
    ]
}"#;

let config = ReminderRulesConfig::from_str(json, Some("json"))
    .expect("failed to parse reminder config");
let rules = config.into_rules().expect("invalid rules");

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_plugin("reminder", Arc::new(ReminderPlugin::new(rules)) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers an AfterToolExecute phase hook. After every tool call, it evaluates each rule against the tool name, arguments, and result. When a rule matches, it schedules an AddContextMessage action that injects the configured message into the prompt.

  2. Define rules with tool patterns.

    The tool field uses the same pattern DSL as the permission system:

Pattern | Matches
"Bash" | Exact tool name Bash
"*" | Any tool
"mcp__*" | Glob on tool name (all MCP tools)
"Bash(command ~ 'rm *')" | Tool name + primary argument glob
"Edit(file_path ~ '*.toml')" | Named field glob
let json = r#"{
    "rules": [
        {
            "name": "toml-edit-reminder",
            "tool": "Edit(file_path ~ '*.toml')",
            "output": "any",
            "message": {
                "target": "system",
                "content": "You edited a TOML file. Run cargo check to verify.",
                "cooldown_turns": 3
            }
        }
    ]
}"#;

The optional name field gives the rule a human-readable identifier. When omitted, one is auto-generated from the index and tool pattern (e.g. rule-0-Edit(file_path ~ '*.toml')).

  3. Configure output matching.

    The output field controls whether the rule fires based on the tool result. Set it to "any" to match all outputs, or use a structured object with status and/or content:

// Match only errors containing "permission denied"
let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": {
                "status": "error",
                "content": "*permission denied*"
            },
            "message": {
                "target": "suffix_system",
                "content": "Permission denied. Consider using sudo or checking file ownership."
            }
        }
    ]
}"#;

Status values: "success", "error", "pending", "any".

Content matching supports two forms:

  • Text glob (string shorthand): "*permission denied*" – matches the stringified tool output against a glob pattern.
  • JSON fields (structured): matches specific fields in JSON output.
// JSON field matching
let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": {
                "status": "error",
                "content": {
                    "fields": [
                        { "path": "error.code", "op": "exact", "value": "403" }
                    ]
                }
            },
            "message": {
                "target": "suffix_system",
                "content": "HTTP 403 Forbidden. Check authentication credentials."
            }
        }
    ]
}"#;

Field match operations: "glob" (default), "exact", "regex", "not_glob", "not_exact", "not_regex". The path uses dot-separated JSON field names (e.g. "error.code"). When multiple fields are specified, all must match (AND logic).

When both status and content are present, both must match for the rule to fire.
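The field-matching semantics can be sketched in standalone Rust. This is an illustration only, not the awaken-tool-pattern implementation; `Value`, `lookup`, `glob_match`, and `FieldMatch` are hypothetical helpers, and only a subset of the ops is shown:

```rust
use std::collections::BTreeMap;

// Minimal JSON-like value for the sketch (real output is full JSON).
enum Value {
    Str(String),
    Obj(BTreeMap<String, Value>),
}

// Resolve a dot-separated path like "error.code" against nested objects.
fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a str> {
    let mut cur = root;
    for seg in path.split('.') {
        match cur {
            Value::Obj(map) => cur = map.get(seg)?,
            Value::Str(_) => return None,
        }
    }
    match cur {
        Value::Str(s) => Some(s),
        Value::Obj(_) => None,
    }
}

// ASCII-oriented glob: '*' matches any run of characters.
fn glob_match(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == text,
        Some((head, rest)) => {
            text.starts_with(head)
                && (0..=text.len() - head.len())
                    .any(|i| glob_match(rest, &text[head.len() + i..]))
        }
    }
}

struct FieldMatch {
    path: String,     // dot-separated JSON field path, e.g. "error.code"
    op: &'static str, // "glob" (default), "exact", "not_exact", ...
    value: String,
}

// AND logic: every configured field must match for the rule to fire.
fn all_fields_match(output: &Value, fields: &[FieldMatch]) -> bool {
    fields.iter().all(|f| match lookup(output, &f.path) {
        Some(actual) => match f.op {
            "exact" => actual == f.value.as_str(),
            "not_exact" => actual != f.value.as_str(),
            _ => glob_match(&f.value, actual), // default: glob
        },
        None => false, // a missing field never matches
    })
}
```

The shared pattern logic actually lives in the awaken-tool-pattern crate; the sketch only captures the lookup-then-compare shape and the AND combination.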

  4. Choose a message target.

    The target field determines where the injected message appears in the prompt:

Target | Placement
"system" | Prepended to the system prompt section
"suffix_system" | Appended after the system prompt
"session" | Inserted as a session-scoped system message
"conversation" | Inserted as a conversation-scoped system message
  5. Use cooldown to avoid repetition.

    Set cooldown_turns on the message to suppress re-injection for a number of turns after firing:

let json = r#"{
    "rules": [
        {
            "tool": "*",
            "output": "any",
            "message": {
                "target": "system",
                "content": "Remember to be careful with file operations.",
                "cooldown_turns": 5
            }
        }
    ]
}"#;

When cooldown_turns is 0 (default), the message is injected on every match.
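The cooldown bookkeeping this implies can be sketched with a hypothetical helper (not the plugin's actual state handling), tracking the turn at which each rule last fired:

```rust
use std::collections::HashMap;

/// Illustrative cooldown tracker: a rule may fire again only after more
/// than `cooldown_turns` turns have elapsed since it last fired.
/// `cooldown_turns == 0` means fire on every match.
/// Assumes turn numbers are monotonically increasing.
struct CooldownTracker {
    last_fired: HashMap<String, u32>, // rule name -> turn it last fired
}

impl CooldownTracker {
    fn new() -> Self {
        Self { last_fired: HashMap::new() }
    }

    fn try_fire(&mut self, rule: &str, turn: u32, cooldown_turns: u32) -> bool {
        let allowed = match self.last_fired.get(rule) {
            Some(&last) => turn - last > cooldown_turns,
            None => true, // never fired before
        };
        if allowed {
            self.last_fired.insert(rule.to_string(), turn);
        }
        allowed
    }
}
```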

  6. Load rules from a file.
use awaken::ext_reminder::ReminderRulesConfig;

let config = ReminderRulesConfig::from_file("reminders.json")
    .expect("failed to load reminder config");
let rules = config.into_rules().expect("invalid rules");
  7. Activate the plugin on an agent spec.
use awaken::registry_spec::AgentSpec;

let agent_spec = AgentSpec::new("my-agent")
    .with_model("anthropic/claude-sonnet")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("reminder");

The with_hook_filter("reminder") call activates the reminder plugin’s phase hook for this agent.

Verify

  1. Configure a rule matching a tool you can easily trigger (e.g. "*" with "any" output).
  2. Run the agent and invoke a tool.
  3. Inspect the prompt or enable tracing at debug level. You should see:
    reminder rule matched, scheduling context message
    
  4. Confirm the context message appears in the agent’s next inference prompt at the expected target location.

Common Errors

Symptom | Cause | Fix
InvalidPattern on startup | Malformed tool pattern string | Check syntax against the pattern table above; ensure quotes are escaped in JSON
InvalidTarget on startup | Unknown message target | Use one of: system, suffix_system, session, conversation
InvalidOutput on startup | Unrecognized output matcher string | Use "any" or a structured { "status": ..., "content": ... } object
InvalidOp on startup | Unknown field match operation | Use one of: glob, exact, regex, not_glob, not_exact, not_regex
Rule never fires | Plugin not activated on agent | Add with_hook_filter("reminder") to the agent spec
Rule fires too often | No cooldown configured | Set cooldown_turns to a positive value
  • crates/awaken-ext-reminder/src/config.rs – contains test cases with various rule configurations

Key Files

Path | Purpose
crates/awaken-ext-reminder/src/lib.rs | Module root and public re-exports
crates/awaken-ext-reminder/src/config.rs | ReminderRulesConfig, JSON loading, ReminderConfigKey
crates/awaken-ext-reminder/src/rule.rs | ReminderRule struct definition
crates/awaken-ext-reminder/src/output_matcher.rs | OutputMatcher, ContentMatcher, status/content matching logic
crates/awaken-ext-reminder/src/plugin/plugin.rs | ReminderPlugin registration (AfterToolExecute hook)
crates/awaken-ext-reminder/src/plugin/hook.rs | ReminderHook – pattern + output evaluation per tool call
crates/awaken-tool-pattern/ | Shared glob/regex pattern matching library

Use Generative UI (A2UI)

Use this when you want the agent to send declarative UI components to a frontend – for example, rendering a form, a data table, or an interactive card without the frontend knowing the layout in advance.

Prerequisites

  • A working awaken agent runtime (see Build an Agent)
  • A frontend that consumes A2UI messages from the event stream (e.g. a CopilotKit or AI SDK integration)
  • A component catalog registered on the frontend that defines available UI components
[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the A2UI plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_generative_ui::A2uiPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let plugin = A2uiPlugin::with_catalog_id("my-catalog");
let agent_spec = AgentSpec::new("ui-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Render structured UI when visual output helps.")
    .with_hook_filter("generative-ui");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("generative-ui", Arc::new(plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers a tool called render_a2ui that the LLM can invoke. When the LLM calls this tool with an array of A2UI messages, the tool validates the message structure and returns the validated payload, which flows through the event stream to the frontend.

  2. Understand the A2UI protocol.

    A2UI v0.8 defines four message types. Each message is a JSON object with "version": "v0.8" and exactly one of the following keys:

Message Type | Purpose
createSurface | Initialize a new rendering surface
updateComponents | Define or update the component tree
updateDataModel | Populate or change data values
deleteSurface | Remove a surface

Messages must be sent in order: create the surface first, then define components, then populate data.
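The version and exactly-one-type-key constraints can be sketched as a structural check. This is an illustration over plain key lists; the real validator in validation.rs operates on parsed message structs:

```rust
// The four recognized A2UI v0.8 message type keys.
const MESSAGE_TYPES: [&str; 4] =
    ["createSurface", "updateComponents", "updateDataModel", "deleteSurface"];

/// Illustrative check of one message, represented as its version string
/// plus the set of top-level keys it contains.
fn validate_message(version: &str, keys: &[&str]) -> Result<(), String> {
    if version != "v0.8" {
        return Err(format!("unsupported version: {version}"));
    }
    // Count how many of the top-level keys are message type keys.
    let n = keys.iter().filter(|k| MESSAGE_TYPES.contains(*k)).count();
    match n {
        0 => Err("message has no recognized type key".to_string()),
        1 => Ok(()),
        _ => Err("multiple message types in one object".to_string()),
    }
}
```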

  3. Create a surface.

    Every UI starts by creating a surface with a unique ID and a reference to the frontend’s component catalog:

// The LLM sends this via the render_a2ui tool:
let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "createSurface": {
                "surfaceId": "order-form-1",
                "catalogId": "my-catalog"
            }
        }
    ]
});

The catalogId tells the frontend which set of component definitions to use for rendering.

  4. Define the component tree.

    Components are specified as a flat list with adjacency references. Each component has a unique id and a component type name from the catalog. Parent-child relationships are expressed via child (single child) or children (multiple children):

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "updateComponents": {
                "surfaceId": "order-form-1",
                "components": [
                    {
                        "id": "root",
                        "component": "Card",
                        "child": "layout"
                    },
                    {
                        "id": "layout",
                        "component": "Column",
                        "children": ["title", "name-field", "submit-btn"]
                    },
                    {
                        "id": "title",
                        "component": "Text",
                        "text": "New Order"
                    },
                    {
                        "id": "name-field",
                        "component": "TextField",
                        "label": "Customer Name",
                        "value": { "path": "/customer/name" }
                    },
                    {
                        "id": "submit-btn",
                        "component": "Button",
                        "label": "Submit",
                        "action": {
                            "event": {
                                "name": "submit-order",
                                "context": { "formId": "order-form-1" }
                            }
                        }
                    }
                ]
            }
        }
    ]
});

Rules for the component list:

  • One component must have "id": "root" – this is the tree’s entry point.
  • Every component requires "id" and "component" fields.
  • Use "child" for a single child reference or "children" for multiple.
  • Additional properties (text, label, action, and any extra fields) are passed through to the frontend renderer.
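These rules can be sketched as a check over the flat list. The `validate_components` helper is hypothetical, and the dangling-reference check is an extra sanity check rather than a documented requirement:

```rust
use std::collections::HashSet;

/// Illustrative structural check of a flat component list, where each
/// component is (id, child references). Not awaken's validator.
fn validate_components(components: &[(&str, Vec<&str>)]) -> Result<(), String> {
    let mut ids = HashSet::new();
    // Every component needs a unique id.
    for (id, _) in components {
        if !ids.insert(*id) {
            return Err(format!("duplicate component id: {id}"));
        }
    }
    // Exactly one component is the tree's entry point.
    if !ids.contains("root") {
        return Err("no component with id \"root\"".to_string());
    }
    // Extra sanity check: child / children references should resolve.
    for (id, children) in components {
        for child in children {
            if !ids.contains(child) {
                return Err(format!("{id} references unknown child {child}"));
            }
        }
    }
    Ok(())
}
```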
  5. Bind data with JSON paths.

    Use {"path": "/json/pointer"} in component properties to bind values to the surface’s data model. The frontend resolves these paths against the data model at render time:

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "updateDataModel": {
                "surfaceId": "order-form-1",
                "path": "/",
                "value": {
                    "customer": {
                        "name": "",
                        "email": ""
                    },
                    "items": []
                }
            }
        }
    ]
});

The path field specifies which part of the data model to update. Use "/" to set the entire model, or a nested path like "/customer/name" to update a specific value.
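The path semantics can be sketched over a minimal nested model. This assumes slash-separated segments resolved in the spirit of JSON Pointers; it is not awaken's implementation:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Data {
    Str(String),
    Obj(BTreeMap<String, Data>),
}

/// Illustrative updateDataModel application: "/" replaces the whole model;
/// "/customer/name" walks nested objects, creating intermediate objects
/// (and overwriting non-object values) along the way.
fn apply_update(model: &mut Data, path: &str, value: Data) {
    if path == "/" {
        *model = value;
        return;
    }
    let segments: Vec<&str> = path.trim_start_matches('/').split('/').collect();
    let (last, parents) = segments.split_last().expect("non-empty path");
    let mut cur = model;
    for seg in parents {
        if !matches!(cur, Data::Obj(_)) {
            *cur = Data::Obj(BTreeMap::new());
        }
        cur = match cur {
            Data::Obj(map) => map
                .entry(seg.to_string())
                .or_insert_with(|| Data::Obj(BTreeMap::new())),
            Data::Str(_) => unreachable!(),
        };
    }
    if !matches!(cur, Data::Obj(_)) {
        *cur = Data::Obj(BTreeMap::new());
    }
    if let Data::Obj(map) = cur {
        map.insert(last.to_string(), value);
    }
}
```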

  6. Delete a surface.
let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "deleteSurface": {
                "surfaceId": "order-form-1"
            }
        }
    ]
});
  7. Send multiple messages in one tool call.

    The render_a2ui tool accepts an array of messages, so you can create a surface, define components, and populate data in a single call:

let messages = serde_json::json!({
    "messages": [
        {
            "version": "v0.8",
            "createSurface": { "surfaceId": "s1", "catalogId": "my-catalog" }
        },
        {
            "version": "v0.8",
            "updateComponents": {
                "surfaceId": "s1",
                "components": [
                    { "id": "root", "component": "Text", "text": "Hello" }
                ]
            }
        },
        {
            "version": "v0.8",
            "updateDataModel": { "surfaceId": "s1", "path": "/", "value": {} }
        }
    ]
});
  8. Customize plugin instructions.

    The plugin injects prompt instructions that teach the LLM how to use the render_a2ui tool. You can customize these in several ways:

// With catalog ID and custom examples appended to the default instructions
let plugin = A2uiPlugin::with_catalog_and_examples(
    "my-catalog",
    "Example: create a card with a title and a button..."
);

// With fully custom instructions (replaces the default instructions entirely)
let plugin = A2uiPlugin::with_custom_instructions(
    "You can render UI by calling render_a2ui...".to_string()
);

Verify

  1. Register the A2UI plugin and run the agent with a prompt that asks it to display information visually.
  2. The agent should call the render_a2ui tool with valid A2UI messages.
  3. Check the tool result in the event stream – a successful call returns {"a2ui": [...], "rendered": true}.
  4. On the frontend, confirm the surface appears with the expected components.

Common Errors

Symptom | Cause | Fix
missing required field "messages" | Tool called without a messages array | Ensure the LLM sends {"messages": [...]}
messages array must not be empty | Empty messages array | Include at least one A2UI message
unsupported version | Version field is not "v0.8" | Set "version": "v0.8" on every message
multiple message types in one object | A single message contains more than one type key | Each message object must have exactly one of createSurface, updateComponents, updateDataModel, or deleteSurface
components[N].id is required | A component in updateComponents is missing id | Add "id" to every component object
components[N].component is required | A component is missing the type name | Add "component" with a valid catalog type
LLM does not call the tool | Plugin registered but instructions not reaching the LLM | Verify the plugin is activated on the agent spec
  • crates/awaken-ext-generative-ui/src/a2ui/tests.rs – validation and tool execution test cases

Key Files

Path | Purpose
crates/awaken-ext-generative-ui/src/a2ui/mod.rs | A2UI module root, constants, re-exports
crates/awaken-ext-generative-ui/src/a2ui/plugin.rs | A2uiPlugin registration and prompt instructions
crates/awaken-ext-generative-ui/src/a2ui/tool.rs | A2uiRenderTool – validation and execution
crates/awaken-ext-generative-ui/src/a2ui/types.rs | A2uiMessage, A2uiComponent, and related structs
crates/awaken-ext-generative-ui/src/a2ui/validation.rs | validate_a2ui_messages structural checks

Use Agent Handoff

Use this when you need to dynamically switch an agent’s behavior (system prompt, model, or tool set) within a running agent loop without terminating the run or spawning a new thread.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Plugin, StateKey, and AgentRuntimeBuilder

Overview

Handoff performs dynamic same-thread agent variant switching. Instead of ending the current run and starting a new agent, handoff applies an AgentOverlay that overrides parts of the base agent configuration. The switch is instant – no run termination or re-resolution occurs.

Key types:

  • HandoffPlugin – the plugin that manages overlay registration and lifecycle hooks.
  • AgentOverlay – per-variant overrides for system prompt, model, and tool filters.
  • HandoffState – tracks the active variant and any pending handoff request.
  • HandoffAction – reducer actions: Request, Activate, Clear.

Steps

  1. Define agent variant overlays.

Each overlay specifies which parts of the base agent configuration to override. Fields left as None inherit the base agent’s values.

The model field uses the same model registry ID that AgentSpec.model uses.

use awaken::extensions::handoff::AgentOverlay;

let researcher = AgentOverlay {
    system_prompt: Some("You are a research specialist. Find and cite sources.".into()),
    model: Some("claude-sonnet".into()),
    allowed_tools: Some(vec!["web_search".into(), "read_document".into()]),
    excluded_tools: None,
};

let writer = AgentOverlay {
    system_prompt: Some("You are a technical writer. Produce clear documentation.".into()),
    model: None, // inherits base model
    allowed_tools: None, // all tools available
    excluded_tools: Some(vec!["web_search".into()]),
};
  2. Build a HandoffPlugin with variant overlays.
use std::collections::HashMap;
use std::sync::Arc;
use awaken::extensions::handoff::HandoffPlugin;

let mut overlays = HashMap::new();
overlays.insert("researcher".to_string(), researcher);
overlays.insert("writer".to_string(), writer);

let handoff = HandoffPlugin::new(overlays);
  3. Register the plugin on the runtime builder.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("agent_handoff", Arc::new(handoff))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

The plugin ID must be "agent_handoff" (exported as HANDOFF_PLUGIN_ID). The plugin registers hooks on Phase::RunStart and Phase::StepEnd to synchronize handoff state.

  4. Request a handoff from within a tool or hook.

Use the action helpers to create HandoffAction mutations and dispatch them through a StateCommand:

use awaken::extensions::handoff::{request_handoff, activate_handoff, clear_handoff, ActiveAgentKey};
use awaken::state::StateCommand;

// Request a switch to the "researcher" variant (pending until next phase boundary)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(request_handoff("researcher"));

// Directly activate a variant (skips the request step)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(activate_handoff("writer"));

// Clear handoff state and return to the base agent
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(clear_handoff());
  5. Look up overlays from plugin state.

The plugin exposes a method to retrieve the overlay for any registered variant:

let overlay = handoff.overlay("researcher");
// Returns Option<&AgentOverlay>

The effective agent ID is determined by HandoffPlugin::effective_agent, which returns the requested variant if one is pending, otherwise the currently active variant:

let state: &HandoffState = /* from context */;
let agent_id = HandoffPlugin::effective_agent(state);
// Returns Option<&String> -- None means the base agent is active

How It Works

HandoffState has two fields:

  • active_agent: Option<String> – the currently active variant (None = base agent).
  • requested_agent: Option<String> – a pending handoff request, consumed at the next phase boundary.

The internal HandoffSyncHook runs at RunStart and StepEnd. When it detects a requested_agent, it promotes the request to active_agent and clears the request. This two-phase approach ensures the switch happens at a safe boundary in the agent loop.
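The two-phase promotion reduces to a few lines of state handling. This sketch mirrors the fields and hook behavior described above but is not the actual HandoffPlugin code:

```rust
/// Illustrative handoff state, following the documented fields.
#[derive(Default, Debug, PartialEq)]
struct HandoffState {
    active_agent: Option<String>,    // None = base agent
    requested_agent: Option<String>, // pending request
}

impl HandoffState {
    /// What HandoffSyncHook does at RunStart / StepEnd: promote a pending
    /// request to the active variant and clear the request.
    fn sync(&mut self) {
        if let Some(requested) = self.requested_agent.take() {
            self.active_agent = Some(requested);
        }
    }

    /// Effective agent: a pending request wins over the active variant;
    /// None means the base agent is active.
    fn effective_agent(&self) -> Option<&String> {
        self.requested_agent.as_ref().or(self.active_agent.as_ref())
    }
}
```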

Handoff vs Delegation

Aspect | Handoff | Delegation
Thread | Same thread, same run | Spawns a sub-agent on a separate thread
State | Shared – overlays modify the current agent in-place | Isolated – delegate has its own state
Use case | Switching personas or tool sets mid-conversation | Offloading a self-contained subtask
Overhead | Zero – no run restart | Higher – new run lifecycle

Use handoff when you want the agent to change behavior while retaining conversational context. Use delegation when the subtask is independent and the delegate should not see or modify the parent’s state.

Common Errors

Error | Cause | Fix
Overlay not applied | Variant name in request_handoff does not match a key in the overlays map | Ensure the string matches exactly
StateError::KeyAlreadyRegistered | Another plugin registers the ActiveAgentKey | Only one HandoffPlugin should be registered per runtime
Hook not firing | Plugin not in the agent’s active hook filter | Add "agent_handoff" to active_hook_filter or leave the filter empty

Key Files

  • crates/awaken-runtime/src/extensions/handoff/mod.rs – module root and public exports
  • crates/awaken-runtime/src/extensions/handoff/plugin.rs – HandoffPlugin implementation
  • crates/awaken-runtime/src/extensions/handoff/types.rs – AgentOverlay struct
  • crates/awaken-runtime/src/extensions/handoff/state.rs – HandoffState and ActiveAgentKey
  • crates/awaken-runtime/src/extensions/handoff/action.rs – HandoffAction and helper functions

Use Deferred Tools

Use this when your agent has many tools and you want to reduce context window usage by hiding tool schemas from the LLM until they are needed. The deferred-tools plugin classifies tools as Eager (always sent) or Deferred (hidden until requested). A ToolSearch tool lets the LLM discover deferred tools on demand.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • The awaken-ext-deferred-tools crate
[dependencies]
awaken-ext-deferred-tools = { version = "0.1" }
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Create the plugin and register it.

Collect all tool descriptors your agent exposes, then pass them to DeferredToolsPlugin::new. The plugin uses these to classify tools and populate the deferred registry at activation time.

use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken_ext_deferred_tools::DeferredToolsPlugin;

// Collect descriptors from all tools registered on the agent.
let seed_tools = vec![
    weather_tool.descriptor(),
    search_tool.descriptor(),
    debug_tool.descriptor(),
    // ... all tool descriptors
];
let agent_spec = AgentSpec::new("deferred-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("Search for tools only when needed.")
    .with_hook_filter("ext-deferred-tools");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin(
        "ext-deferred-tools",
        Arc::new(DeferredToolsPlugin::new(seed_tools)) as Arc<dyn Plugin>,
    )
    .build()
    .expect("failed to build runtime");
  2. Configure tool loading rules.

Set rules on the agent spec via the deferred_tools config key. Rules are evaluated in order and first match wins. Tools that match no rule fall back to default_mode.

use awaken_ext_deferred_tools::{
    DeferredToolsConfig, DeferralRule, ToolLoadMode,
};

let config = DeferredToolsConfig {
    rules: vec![
        DeferralRule { tool: "get_weather".into(), mode: ToolLoadMode::Eager },
        DeferralRule { tool: "debug_*".into(), mode: ToolLoadMode::Deferred },
    ],
    default_mode: ToolLoadMode::Deferred,
    ..Default::default()
};

The tool field supports exact names and glob patterns (via wildcard_match). Common patterns:

Pattern | Matches
get_weather | Exact tool ID
debug_* | Any tool starting with debug_
mcp__github__* | All GitHub MCP tools
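The first-match-wins classification can be sketched as follows. This is simplified: only exact names and trailing-`*` prefixes are handled, whereas the real crate uses wildcard_match:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToolLoadMode {
    Eager,
    Deferred,
}

struct DeferralRule {
    pattern: &'static str,
    mode: ToolLoadMode,
}

// Simplified pattern check: exact name or trailing-'*' prefix glob.
fn matches_pattern(pattern: &str, tool_id: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix('*') {
        tool_id.starts_with(prefix)
    } else {
        pattern == tool_id
    }
}

/// Rules are evaluated in order; the first match wins. Tools matching
/// no rule fall back to `default_mode`.
fn classify(tool_id: &str, rules: &[DeferralRule], default_mode: ToolLoadMode) -> ToolLoadMode {
    rules
        .iter()
        .find(|r| matches_pattern(r.pattern, tool_id))
        .map(|r| r.mode)
        .unwrap_or(default_mode)
}
```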
  3. Understand auto-enable.

The enabled field on DeferredToolsConfig controls activation:

Value | Behavior
Some(true) | Always enable deferred tools
Some(false) | Always disable
None (default) | Auto-enable when total token savings across deferred tools exceeds beta_overhead (default 1136 tokens)

With auto-enable, the plugin estimates per-tool schema cost as len(parameters_json) / 4 tokens and sums the savings for all deferrable tools. If the total exceeds the overhead of maintaining the ToolSearch tool and deferred tool list in context, deferral activates automatically.
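The auto-enable arithmetic from the paragraph above, as a sketch (the function names are hypothetical):

```rust
/// Per-tool schema cost estimate: len(parameters_json) / 4 tokens.
fn estimate_schema_tokens(parameters_json: &str) -> f64 {
    parameters_json.len() as f64 / 4.0
}

/// Auto-enable decision: deferral activates when the summed savings
/// across all deferrable tool schemas exceed beta_overhead.
fn auto_enable(deferrable_schemas: &[&str], beta_overhead: f64) -> bool {
    let savings: f64 = deferrable_schemas
        .iter()
        .map(|schema| estimate_schema_tokens(schema))
        .sum();
    savings > beta_overhead
}
```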

  4. Understand how ToolSearch works.

The plugin automatically registers a ToolSearch tool. The LLM calls it with a query string to find deferred tools:

Query format | Behavior
"select:Tool1,Tool2" | Fetch specific tools by exact ID
"+required rest terms" | Require a keyword, rank by remaining terms
"plain keywords" | General keyword search across id, name, description

When ToolSearch returns results, matched tools are promoted to Eager for the remainder of the session. The tool returns up to max_results (default 5) matching tool schemas in a <functions> block.

LLM: I need to check the weather. Let me search for relevant tools.
     -> calls ToolSearch { query: "weather forecast" }

ToolSearch returns matching schemas, promotes get_weather to Eager.
Next inference: get_weather schema is included in the tool list.
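The three query forms can be sketched as a small parser. This is illustrative only; the crate's parser in tool_search.rs may differ in details:

```rust
#[derive(Debug, PartialEq)]
enum Query {
    /// "select:Tool1,Tool2" - fetch specific tools by exact ID.
    Select(Vec<String>),
    /// "+required rest terms" - required keyword plus ranking terms.
    Ranked { required: String, rest: Vec<String> },
    /// "plain keywords" - general keyword search.
    Keywords(Vec<String>),
}

fn parse_query(q: &str) -> Query {
    if let Some(list) = q.strip_prefix("select:") {
        return Query::Select(list.split(',').map(|s| s.trim().to_string()).collect());
    }
    if let Some(rest) = q.strip_prefix('+') {
        let mut terms = rest.split_whitespace().map(str::to_string);
        let required = terms.next().unwrap_or_default();
        return Query::Ranked { required, rest: terms.collect() };
    }
    Query::Keywords(q.split_whitespace().map(str::to_string).collect())
}
```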
  5. Configure automatic re-deferral (DiscBeta).

The plugin tracks per-tool usage statistics using a discounted Beta distribution. Tools that were promoted via ToolSearch but stop being used get automatically re-deferred. This is adaptive and requires no manual tuning.

Re-deferral triggers when all of the following are true:

  • The tool is currently Eager (was promoted from Deferred)
  • The tool is not configured as always-Eager in rules
  • The tool has been idle for at least defer_after turns
  • The posterior upper credible interval (90%) falls below breakeven_p * thresh_mult

The breakeven frequency is (c - c_bar) / gamma, where c is the full schema cost and c_bar is the name-only cost. A tool is worth keeping eager only if it is used often enough that the savings from avoiding ToolSearch calls exceed the per-turn overhead.
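The breakeven test reduces to the following sketch, where `upper_credible` stands for the 90% posterior upper credible interval computed elsewhere (function names are hypothetical):

```rust
/// Breakeven usage frequency: (c - c_bar) / gamma, where c is the full
/// schema token cost, c_bar the name-only cost, and gamma the estimated
/// token cost of a ToolSearch call.
fn breakeven_p(c: f64, c_bar: f64, gamma: f64) -> f64 {
    (c - c_bar) / gamma
}

/// A promoted tool is re-deferred when its posterior upper credible
/// interval falls below breakeven_p * thresh_mult (idle-turn and
/// always-Eager conditions omitted for brevity).
fn should_redefer(upper_credible: f64, c: f64, c_bar: f64, gamma: f64, thresh_mult: f64) -> bool {
    upper_credible < breakeven_p(c, c_bar, gamma) * thresh_mult
}
```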

Key parameters in DiscBetaParams:

Parameter | Default | Purpose
omega | 0.95 | Discount factor per turn. Effective memory is approximately 1/(1-omega) = 20 turns
n0 | 5.0 | Prior strength in equivalent observations
defer_after | 5 | Minimum idle turns before considering re-deferral
thresh_mult | 0.5 | Multiplier on breakeven frequency for the deferral threshold
gamma | 2000.0 | Estimated token cost of a ToolSearch call

These live under DeferredToolsConfig.disc_beta:

use awaken_ext_deferred_tools::{DeferredToolsConfig, DiscBetaParams};

let config = DeferredToolsConfig {
    disc_beta: DiscBetaParams {
        omega: 0.95,
        n0: 5.0,
        defer_after: 5,
        thresh_mult: 0.5,
        gamma: 2000.0,
    },
    beta_overhead: 1136.0,
    ..Default::default()
};
  6. Enable cross-session learning.

Via AgentToolPriors (a ProfileKey), usage frequencies persist across sessions using EWMA (exponentially weighted moving average). At session end, the PersistPriorsHook writes per-tool presence frequencies to the profile store. At next session start, the LoadPriorsHook reads them back and initializes the Beta distribution with learned priors instead of the default 0.01.

This requires a ProfileStore to be configured on the runtime. No additional code is needed beyond the plugin registration — the hooks are wired automatically.

The EWMA smoothing factor is lambda = max(0.1, 1/(n+1)), where n is the session count. Early sessions contribute equally; after 10 sessions the factor stabilizes at 0.1, giving 90% weight to historical data.
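The EWMA update can be sketched as follows (hypothetical helper names; the real logic lives in the LoadPriorsHook / PersistPriorsHook):

```rust
/// Smoothing factor: lambda = max(0.1, 1/(n+1)), where n is the number
/// of completed sessions. Early sessions act like a running mean; after
/// enough sessions it stabilizes at 0.1.
fn ewma_lambda(session_count: u32) -> f64 {
    (1.0 / (session_count as f64 + 1.0)).max(0.1)
}

/// Blend the newly observed per-tool frequency into the stored prior.
fn update_prior(old: f64, observed: f64, session_count: u32) -> f64 {
    let lambda = ewma_lambda(session_count);
    lambda * observed + (1.0 - lambda) * old
}
```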

Verify

  1. Run the agent and trigger an inference. Check logs for the deferred_tools.list context message, which lists all deferred tool names.

  2. Read DeferralState from the runtime snapshot to see the current mode of each tool:

use awaken_ext_deferred_tools::state::{DeferralState, DeferralStateValue};

let state: &DeferralStateValue = snapshot.state::<DeferralState>()
    .expect("DeferralState not found");

for (tool_id, mode) in &state.modes {
    println!("{tool_id}: {mode:?}");
}
  3. Ask the LLM a question that requires a deferred tool. Confirm ToolSearch is called and the tool is promoted to Eager in subsequent turns.

  4. After several turns of inactivity, verify re-deferral by checking that the tool reverts to Deferred mode in the snapshot.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| All tools sent to LLM (no deferral) | enabled: Some(false) or total savings below beta_overhead | Set enabled: Some(true) or add more tools so savings exceed overhead |
| Plugin registers but no tools deferred | All rules resolve to Eager | Set default_mode: ToolLoadMode::Deferred or add Deferred rules |
| ToolSearch not available to LLM | Plugin not registered | Register DeferredToolsPlugin with seed tool descriptors |
| Tools never re-deferred | defer_after too high or tool usage is frequent | Lower defer_after or increase thresh_mult |
| Cross-session priors not loading | No ProfileStore configured | Wire a profile store into the runtime |
| ToolSearch returns no results | Tool not in deferred registry | Check that the tool was in the seed_tools list passed to the plugin |

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-ext-deferred-tools/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-deferred-tools/src/config.rs | DeferredToolsConfig, DeferralRule, ToolLoadMode, DiscBetaParams |
| crates/awaken-ext-deferred-tools/src/plugin/plugin.rs | DeferredToolsPlugin registration |
| crates/awaken-ext-deferred-tools/src/plugin/hooks.rs | Phase hooks (BeforeInference, AfterToolExecute, AfterInference, RunStart, RunEnd) |
| crates/awaken-ext-deferred-tools/src/tool_search.rs | ToolSearchTool implementation and query parsing |
| crates/awaken-ext-deferred-tools/src/policy.rs | ConfigOnlyPolicy and DiscBetaEvaluator |
| crates/awaken-ext-deferred-tools/src/state.rs | State keys: DeferralState, DeferralRegistry, DiscBetaState, ToolUsageStats, AgentToolPriors |

Serve & Integrate

This path is for turning a local runtime into something other systems can call.

Start here

  1. Expose HTTP with SSE to put the runtime behind HTTP and streaming endpoints.
  2. Integrate AI SDK Frontend for React clients that speak AI SDK v6.
  3. Integrate CopilotKit (AG-UI) for CopilotKit frontends.


Expose HTTP with SSE

Use this when you need to serve agents over HTTP with Server-Sent Events streaming, supporting multiple protocol adapters (AI SDK, AG-UI, A2A).

Prerequisites

  • awaken crate with the server feature enabled
  • tokio with rt-multi-thread and signal features
  • A built AgentRuntime

Steps

  1. Add the dependency.
[dependencies]
awaken = { package = "awaken-agent", version = "...", features = ["server"] }
tokio = { version = "1", features = ["rt-multi-thread", "macros", "signal"] }
  2. Build the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::{AgentRuntimeBuilder, AgentSpec, ModelSpec};
use awaken::stores::InMemoryStore;

let store = Arc::new(InMemoryStore::new());

let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(
        AgentSpec::new("assistant")
            .with_model("claude-sonnet")
            .with_system_prompt("You are a helpful assistant."),
    )
    .with_tool("search", Arc::new(SearchTool))
    .with_provider("anthropic", Arc::new(GenaiExecutor::new()))
    .with_model("claude-sonnet", ModelSpec {
        id: "claude-sonnet".into(),
        provider: "anthropic".into(),
        model: "claude-sonnet-4-20250514".into(),
    })
    .with_thread_run_store(store.clone())
    .build()?;

let runtime = Arc::new(runtime);
  3. Create the application state.
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::stores::InMemoryMailboxStore;

let mailbox_store = Arc::new(InMemoryMailboxStore::new());
let mailbox = Arc::new(Mailbox::new(
    runtime.clone(),
    mailbox_store,
    "default-consumer".to_string(),
    MailboxConfig::default(),
));

let state = AppState::new(
    runtime.clone(),
    mailbox,
    store,
    runtime.resolver_arc(),
    ServerConfig::default(),
);

ServerConfig::default() binds to 0.0.0.0:3000 with an SSE buffer size of 64.

  4. Build the router.
use awaken::server::routes::build_router;

let app = build_router().with_state(state);

build_router registers all route groups:

  • /health – health check
  • /v1/threads – thread CRUD and messaging
  • /v1/runs and /v1/threads/:id/runs – run APIs
  • /v1/config/* and /v1/capabilities – config and capabilities APIs
  • Protocol adapters: AI SDK v6, AG-UI, A2A, MCP
  5. Start the server.
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
axum::serve(listener, app).await?;
  6. Configure SSE buffer size.
let config = ServerConfig {
    address: "0.0.0.0:8080".into(),
    sse_buffer_size: 128,
    ..ServerConfig::default()
};

Increase sse_buffer_size if clients consume events slowly and you observe dropped messages.

Verify

curl http://localhost:3000/health

Should return 200 OK. Then create a thread and run:

curl -X POST http://localhost:3000/v1/threads \
  -H "Content-Type: application/json" \
  -d '{}'

Common Errors

| Error | Cause | Fix |
|---|---|---|
| Address already in use | Port 3000 is occupied | Change the bind address in ServerConfig or TcpListener::bind |
| SSE stream closes immediately | Client does not support text/event-stream | Use curl with --no-buffer or an SSE-compatible client |
| Missing protocol routes | Feature flags not enabled | Ensure the server feature is enabled on the awaken crate |

crates/awaken-server/tests/run_api.rs – integration tests demonstrating thread creation, run execution, and SSE streaming.

Key Files

  • crates/awaken-server/src/app.rs – AppState, ServerConfig
  • crates/awaken-server/src/routes.rs – build_router and route definitions
  • crates/awaken-server/src/http_sse.rs – SSE response helpers
  • crates/awaken-server/src/mailbox.rs – Mailbox run queue

Integrate AI SDK Frontend

Use this when you have a Vercel AI SDK (v6) React frontend and need to connect it to an awaken agent server.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature server enabled on the awaken crate
  • Node.js project with @ai-sdk/react installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"

Steps

  1. Build the backend server.
use std::sync::Arc;

use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt().with_target(true).init();

    let agent_spec = AgentSpec::new("my-agent")
        .with_model("gpt-4o-mini")
        .with_system_prompt("You are a helpful assistant.")
        .with_max_rounds(10);

    let runtime = AgentRuntimeBuilder::new()
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model(
            "gpt-4o-mini",
            ModelSpec {
                id: "gpt-4o-mini".into(),
                provider: "openai".into(),
                model: "gpt-4o-mini".into(),
            },
        )
        .with_agent_spec(agent_spec)
        .build()
        .expect("failed to build runtime");
    let runtime = Arc::new(runtime);

    let store = Arc::new(InMemoryStore::new());
    let resolver = runtime.resolver_arc();

    let mailbox_store = Arc::new(InMemoryMailboxStore::new());
    let mailbox = Arc::new(Mailbox::new(
        runtime.clone(),
        mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
        format!("ai-sdk:{}", std::process::id()),
        MailboxConfig::default(),
    ));

    let state = AppState::new(
        runtime,
        mailbox,
        store as Arc<dyn ThreadRunStore>,
        resolver,
        ServerConfig {
            address: "127.0.0.1:3000".into(),
            ..Default::default()
        },
    );

    let app = build_router().with_state(state);

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .expect("failed to bind");
    axum::serve(listener, app).await.expect("server crashed");
}

The server automatically registers AI SDK v6 routes at:

  • POST /v1/ai-sdk/chat – create a new run and stream events
  • GET /v1/ai-sdk/chat/:thread_id/stream – resume an existing stream by thread ID
  • GET /v1/ai-sdk/threads/:thread_id/stream – alias for thread-based resume
  • GET /v1/ai-sdk/threads/:id/messages – retrieve thread messages
  2. Connect the React frontend.

    Install the AI SDK React package:

npm install ai @ai-sdk/react

Use the useChat hook pointed at your awaken server:

import { useChat } from "@ai-sdk/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "http://localhost:3000/v1/ai-sdk/chat",
    id: "thread-1",
  });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
  3. Run both sides.
# Terminal 1: backend
cargo run

# Terminal 2: frontend
npm run dev

Verify

  1. Open the frontend in a browser.
  2. Send a message.
  3. Confirm that streaming text appears incrementally.
  4. Check the backend logs for RunStart and RunFinish events.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| useChat receives no events | Wrong endpoint URL | Confirm the api prop points to /v1/ai-sdk/chat |
| stream closed unexpectedly | SSE buffer overflow | Increase sse_buffer_size in ServerConfig |
| 404 on /v1/ai-sdk/chat | Missing server feature | Enable features = ["server"] in Cargo.toml |
  • examples/ai-sdk-starter/agent/src/main.rs

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-server/src/protocols/ai_sdk_v6/http.rs | AI SDK v6 route handlers |
| crates/awaken-server/src/protocols/ai_sdk_v6/encoder.rs | AI SDK v6 SSE event encoder |
| crates/awaken-server/src/routes.rs | Unified router builder |
| crates/awaken-server/src/app.rs | AppState and ServerConfig |
| examples/ai-sdk-starter/agent/src/main.rs | Backend entry for the AI SDK starter |

Integrate CopilotKit (AG-UI)

Use this when you have a CopilotKit React frontend and need to connect it to an awaken agent server via the AG-UI protocol.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature server enabled on the awaken crate
  • Node.js project with @copilotkit/react-core and @copilotkit/react-ui installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"

Steps

  1. Build the backend server.
use std::sync::Arc;

use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt().with_target(true).init();

    let agent_spec = AgentSpec::new("copilotkit-agent")
        .with_model("gpt-4o-mini")
        .with_system_prompt(
            "You are a CopilotKit-powered assistant. \
             Update shared state and suggest actions when appropriate.",
        )
        .with_max_rounds(10);

    let runtime = AgentRuntimeBuilder::new()
        .with_provider("openai", Arc::new(GenaiExecutor::new()))
        .with_model(
            "gpt-4o-mini",
            ModelSpec {
                id: "gpt-4o-mini".into(),
                provider: "openai".into(),
                model: "gpt-4o-mini".into(),
            },
        )
        .with_agent_spec(agent_spec)
        .build()
        .expect("failed to build runtime");
    let runtime = Arc::new(runtime);

    let store = Arc::new(InMemoryStore::new());
    let resolver = runtime.resolver_arc();

    let mailbox_store = Arc::new(InMemoryMailboxStore::new());
    let mailbox = Arc::new(Mailbox::new(
        runtime.clone(),
        mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
        format!("copilotkit:{}", std::process::id()),
        MailboxConfig::default(),
    ));

    let state = AppState::new(
        runtime,
        mailbox,
        store as Arc<dyn ThreadRunStore>,
        resolver,
        ServerConfig {
            address: "127.0.0.1:3000".into(),
            ..Default::default()
        },
    );

    let app = build_router().with_state(state);

    let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
        .await
        .expect("failed to bind");
    axum::serve(listener, app).await.expect("server crashed");
}

The server automatically registers AG-UI routes at:

  • POST /v1/ag-ui/run – create a new run and stream AG-UI events
  • POST /v1/ag-ui/threads/:thread_id/runs – start a thread-scoped run
  • POST /v1/ag-ui/agents/:agent_id/runs – start an agent-scoped run
  • POST /v1/ag-ui/threads/:thread_id/interrupt – interrupt a running thread
  • GET /v1/ag-ui/threads/:id/messages – retrieve thread messages
  2. Connect the CopilotKit frontend.

    Install CopilotKit packages:

npm install @copilotkit/react-core @copilotkit/react-ui

Wrap your app with the CopilotKit provider:

import { CopilotKit } from "@copilotkit/react-core";
import { CopilotChat } from "@copilotkit/react-ui";
import "@copilotkit/react-ui/styles.css";

export default function App() {
  return (
    <CopilotKit runtimeUrl="http://localhost:3000/v1/ag-ui">
      <CopilotChat
        labels={{ title: "Agent", initial: "How can I help?" }}
      />
    </CopilotKit>
  );
}
  3. Run both sides.
# Terminal 1: backend
cargo run

# Terminal 2: frontend
npm run dev

Verify

  1. Open the frontend in a browser.
  2. Send a message through the CopilotChat widget.
  3. Confirm streaming text appears in the chat UI.
  4. Check the backend logs for RunStart and RunFinish events.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| CopilotKit shows “connection failed” | Wrong runtimeUrl | Confirm it points to http://localhost:3000/v1/ag-ui |
| Events arrive but UI does not update | AG-UI event format mismatch | Ensure the CopilotKit version is compatible with the AG-UI protocol |
| 404 on /v1/ag-ui/run | Missing server feature | Enable features = ["server"] in Cargo.toml |
  • examples/copilotkit-starter/agent/src/main.rs

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-server/src/protocols/ag_ui/http.rs | AG-UI route handlers |
| crates/awaken-server/src/protocols/ag_ui/encoder.rs | AG-UI SSE event encoder |
| crates/awaken-server/src/routes.rs | Unified router builder |
| crates/awaken-server/src/app.rs | AppState and ServerConfig |
| examples/copilotkit-starter/agent/src/main.rs | Backend entry for the CopilotKit starter |

State & Storage

This path is for teams moving beyond stateless demos.

Use this section to decide

  • where thread and run data should live
  • how state is keyed and merged
  • how much context should reach the model each turn
  1. Use File Store or Use Postgres Store to choose a persistence backend.
  2. Read State Keys and Thread Model to understand state layout and lifecycle.
  3. See Optimize the Context Window when context size starts to matter.

Use File Store

Use this when you need file-based persistence for threads, runs, and messages without an external database.

Prerequisites

  • awaken-stores crate with the file feature enabled

Steps

  1. Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["file"] }

Or, if you use the awaken facade crate (which re-exports awaken-stores), still add awaken-stores directly so you can enable the feature flag:

[dependencies]
awaken = { package = "awaken-agent", version = "..." }
awaken-stores = { version = "...", features = ["file"] }
  2. Create a FileStore.
use std::sync::Arc;
use awaken::stores::FileStore;

let store = Arc::new(FileStore::new("./data"));

The directory is created automatically on first write. The layout is:

./data/
  threads/<thread_id>.json
  messages/<thread_id>.json
  runs/<run_id>.json
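The layout above maps each ID straight to a JSON path, which is why IDs containing path characters fail (see Common Errors). A minimal sketch of that mapping, with illustrative names rather than the real FileStore internals:

```rust
use std::path::PathBuf;

/// Illustrative sketch of the file layout described above; the real
/// FileStore's path handling may differ.
fn thread_path(data_dir: &str, thread_id: &str) -> Option<PathBuf> {
    // Reject IDs that could escape the data directory.
    let ok = !thread_id.is_empty()
        && !thread_id.contains('/')
        && !thread_id.contains('\\')
        && !thread_id.contains("..");
    if !ok {
        return None;
    }
    Some(
        PathBuf::from(data_dir)
            .join("threads")
            .join(format!("{thread_id}.json")),
    )
}

fn main() {
    assert!(thread_path("./data", "../etc").is_none());
    assert_eq!(
        thread_path("./data", "t1").unwrap(),
        PathBuf::from("./data/threads/t1.json")
    );
}
```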
  3. Wire it into the runtime.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_thread_run_store(store)
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;
  4. Use an absolute path for production.
use std::path::PathBuf;

let data_dir = PathBuf::from("/var/lib/myapp/awaken");
let store = Arc::new(FileStore::new(data_dir));

Verify

Run the agent, then inspect the data directory. You should see JSON files under threads/, messages/, and runs/ corresponding to the thread and run IDs used.

Common Errors

| Error | Cause | Fix |
|---|---|---|
| StorageError::Io | Permission denied on the data directory | Ensure the process has read/write access to the path |
| StorageError::Io with empty ID | Thread or run ID contains invalid characters (/, \, ..) | Use simple alphanumeric or UUID-style IDs |
| Missing data after restart | Using a relative path that resolved differently | Use an absolute path |

crates/awaken-stores/src/file.rs – FileStore implementation with filesystem layout details.

Key Files

  • crates/awaken-stores/Cargo.toml – feature flag definition
  • crates/awaken-stores/src/file.rs – FileStore
  • crates/awaken-stores/src/lib.rs – conditional re-export

Use Postgres Store

Use this when you need durable, multi-instance persistence backed by PostgreSQL.

Prerequisites

  • awaken-stores crate with the postgres feature enabled
  • A running PostgreSQL instance
  • sqlx runtime dependencies (tokio)

Steps

  1. Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["postgres"] }
  2. Create a connection pool.
use sqlx::PgPool;

let pool = PgPool::connect("postgres://user:pass@localhost:5432/mydb").await?;
  3. Create a PostgresStore.
use std::sync::Arc;
use awaken::stores::PostgresStore;

let store = Arc::new(PostgresStore::new(pool));

This uses default table names: awaken_threads, awaken_runs, awaken_messages.

  4. Use a custom table prefix.
let store = Arc::new(PostgresStore::with_prefix(pool, "myapp"));

This creates tables named myapp_threads, myapp_runs, myapp_messages.

  5. Wire it into the runtime.
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_thread_run_store(store)
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;
  6. Schema creation.

    Tables are auto-created on first access via ensure_schema(). Each table uses:

  • id TEXT PRIMARY KEY
  • data JSONB NOT NULL
  • updated_at TIMESTAMPTZ NOT NULL DEFAULT now()

No manual migration is required for initial setup.
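The columns above imply DDL of roughly the following shape. This is a sketch of the kind of statement ensure_schema() would issue with the prefix applied; the exact SQL lives in postgres.rs:

```rust
/// Sketch of the DDL the store's ensure_schema() would issue for one
/// table, given the configured prefix (illustrative, not the exact SQL).
fn create_table_sql(prefix: &str, suffix: &str) -> String {
    format!(
        "CREATE TABLE IF NOT EXISTS {prefix}_{suffix} (\n\
         id TEXT PRIMARY KEY,\n\
         data JSONB NOT NULL,\n\
         updated_at TIMESTAMPTZ NOT NULL DEFAULT now()\n\
         )"
    )
}

fn main() {
    let sql = create_table_sql("myapp", "threads");
    // with_prefix("myapp") namespaces the table as myapp_threads
    assert!(sql.contains("myapp_threads"));
    assert!(sql.contains("JSONB NOT NULL"));
}
```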

Verify

After running the agent, query the database:

SELECT id, updated_at FROM awaken_threads;
SELECT id, updated_at FROM awaken_runs;

You should see rows corresponding to the threads and runs created during execution.

Common Errors

| Error | Cause | Fix |
|---|---|---|
| sqlx::Error connection refused | PostgreSQL is not running or the connection string is wrong | Verify the DATABASE_URL and that the database is accepting connections |
| StorageError on first write | Insufficient database privileges | Grant CREATE TABLE and INSERT permissions to the database user |
| Table name collision | Another application uses the same default table names | Use PostgresStore::with_prefix to namespace tables |

crates/awaken-stores/src/postgres.rs – PostgresStore implementation with schema auto-creation.

Key Files

  • crates/awaken-stores/Cargo.toml – postgres feature flag and sqlx dependency
  • crates/awaken-stores/src/postgres.rs – PostgresStore
  • crates/awaken-stores/src/lib.rs – conditional re-export

Optimize the Context Window

Use this when you need to control how the runtime manages conversation history to stay within model token limits.

Prerequisites

  • awaken crate added to Cargo.toml
  • An agent configured with AgentSpec

ContextWindowPolicy

Every agent has a ContextWindowPolicy that controls how conversation history is managed. Set it on your AgentSpec:

use awaken::ContextWindowPolicy;

let policy = ContextWindowPolicy {
    max_context_tokens: 200_000,
    max_output_tokens: 16_384,
    min_recent_messages: 10,
    enable_prompt_cache: true,
    autocompact_threshold: Some(100_000),
    compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
    compaction_raw_suffix_messages: 2,
};

Fields

| Field | Type | Default | Description |
|---|---|---|---|
| max_context_tokens | usize | 200_000 | Model’s total context window size in tokens |
| max_output_tokens | usize | 16_384 | Tokens reserved for model output |
| min_recent_messages | usize | 10 | Minimum number of recent messages to always preserve, even if over budget |
| enable_prompt_cache | bool | true | Whether to enable prompt caching |
| autocompact_threshold | Option<usize> | None | Token count that triggers auto-compaction. None disables auto-compaction |
| compaction_mode | ContextCompactionMode | KeepRecentRawSuffix | Strategy used when auto-compaction fires |
| compaction_raw_suffix_messages | usize | 2 | Number of recent raw messages to preserve in suffix compaction mode |

Truncation

When the conversation exceeds the available token budget, the runtime automatically drops the oldest messages to fit. The budget is calculated as:

available = max_context_tokens - max_output_tokens - tool_schema_tokens

What truncation preserves

  • System messages are never truncated. All system messages at the start of the history survive regardless of budget.
  • Recent messages – at least min_recent_messages history messages are kept, even if they exceed the budget.
  • Tool call/result pairs – the split point is adjusted so that an assistant message with tool calls is never separated from its corresponding tool result messages.
  • Dangling tool calls – after truncation, any orphaned tool calls (whose results were dropped) are patched to prevent invalid message sequences.
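The budget formula and preservation rules above can be sketched as follows. This is a simplified illustration (it keeps leading system messages and a recent suffix, dropping the oldest history to fit); the real implementation additionally keeps tool call/result pairs together and patches dangling calls:

```rust
#[derive(Clone, Copy, PartialEq)]
enum Role { System, User, Assistant }

struct Msg { role: Role, tokens: usize }

/// Simplified sketch of the truncation rule: keep all leading system
/// messages and at least `min_recent` recent messages, then drop the
/// oldest history until the remaining messages fit the budget.
/// Returns the indices of the messages that are kept.
fn truncate(msgs: &[Msg], available: usize, min_recent: usize) -> Vec<usize> {
    let sys_end = msgs.iter().take_while(|m| m.role == Role::System).count();
    let sys_cost: usize = msgs[..sys_end].iter().map(|m| m.tokens).sum();
    let history = &msgs[sys_end..];
    let mut kept = Vec::new();
    let mut cost = sys_cost;
    // Walk backwards: always take min_recent, then stop at the budget.
    for (i, m) in history.iter().enumerate().rev() {
        let recent = history.len() - i <= min_recent;
        if recent || cost + m.tokens <= available {
            cost += m.tokens;
            kept.push(sys_end + i);
        } else {
            break; // everything older is dropped
        }
    }
    (0..sys_end).chain(kept.into_iter().rev()).collect()
}

fn main() {
    let msgs = vec![
        Msg { role: Role::System, tokens: 10 },
        Msg { role: Role::User, tokens: 50 },
        Msg { role: Role::Assistant, tokens: 50 },
        Msg { role: Role::User, tokens: 40 },
    ];
    // System (10) + last two messages (90) fit in 100 tokens;
    // the oldest user message would overflow and is dropped.
    assert_eq!(truncate(&msgs, 100, 2), vec![0, 2, 3]);
}
```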

Artifact compaction

Before truncation runs, oversized tool results are compacted automatically. A tool result whose text exceeds ARTIFACT_COMPACT_THRESHOLD_TOKENS (2048 tokens, estimated at ~8192 characters) is truncated to a preview of at most 1600 characters or 24 lines, whichever is shorter. The preview includes a compaction indicator showing the original size.

Non-tool messages (system, user, assistant) are never subject to artifact compaction.
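The threshold and preview limits above translate into a simple rule. A sketch under the stated character estimate (~4 chars/token, so 2048 tokens is about 8192 characters); constant names and the indicator text are illustrative:

```rust
const COMPACT_THRESHOLD_CHARS: usize = 8192; // ~2048 tokens at ~4 chars/token
const PREVIEW_MAX_CHARS: usize = 1600;
const PREVIEW_MAX_LINES: usize = 24;

/// Sketch of the artifact-compaction rule described above (illustrative).
fn compact_tool_result(text: &str) -> String {
    if text.len() <= COMPACT_THRESHOLD_CHARS {
        return text.to_string();
    }
    // Keep at most 24 lines, then cap at 1600 characters, whichever is shorter.
    let by_lines: String = text
        .lines()
        .take(PREVIEW_MAX_LINES)
        .collect::<Vec<_>>()
        .join("\n");
    let cut = by_lines
        .char_indices()
        .nth(PREVIEW_MAX_CHARS)
        .map(|(i, _)| i)
        .unwrap_or(by_lines.len());
    // Append an indicator showing the original size.
    format!("{}\n[compacted: original {} chars]", &by_lines[..cut], text.len())
}

fn main() {
    assert_eq!(compact_tool_result("short"), "short");
    let big = "x".repeat(10_000);
    let out = compact_tool_result(&big);
    assert!(out.len() < 2_000);
    assert!(out.contains("compacted"));
}
```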

Compaction

Compaction summarizes older conversation history into a condensed summary message, reducing token usage while preserving context. Unlike truncation (which drops messages), compaction replaces them with a summary.

Enabling auto-compaction

Set autocompact_threshold to trigger compaction when total message tokens exceed that value:

let policy = ContextWindowPolicy {
    autocompact_threshold: Some(100_000),
    compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
    compaction_raw_suffix_messages: 4,
    ..Default::default()
};

ContextCompactionMode

Two strategies are available:

  • KeepRecentRawSuffix (default) – keeps the most recent compaction_raw_suffix_messages messages as raw history. Everything before the compaction boundary is summarized.
  • CompactToSafeFrontier – compacts all messages up to the safe frontier (the latest point where all tool call/result pairs are complete).

The compaction boundary is chosen so that no tool call is separated from its result. The boundary finder walks the message history and only places boundaries where all open tool calls have been resolved.
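The boundary-finding walk can be sketched with a toy message model. This is a simplified illustration of the idea, not the runtime's actual types: track which tool call IDs are still awaiting results, and only record boundaries where that set is empty:

```rust
use std::collections::HashSet;

enum Msg {
    Text,                // plain system/user/assistant message
    ToolCalls(Vec<u32>), // assistant message issuing tool call ids
    ToolResult(u32),     // result for one tool call id
}

/// Simplified sketch of the boundary finder: a compaction boundary may
/// only be placed after positions where no tool call is unresolved.
fn safe_boundaries(msgs: &[Msg]) -> Vec<usize> {
    let mut open: HashSet<u32> = HashSet::new();
    let mut boundaries = Vec::new();
    for (i, m) in msgs.iter().enumerate() {
        match m {
            Msg::Text => {}
            Msg::ToolCalls(ids) => open.extend(ids.iter().copied()),
            Msg::ToolResult(id) => {
                open.remove(id);
            }
        }
        if open.is_empty() {
            boundaries.push(i + 1); // a cut after message i is safe
        }
    }
    boundaries
}

fn main() {
    let msgs = [
        Msg::Text,
        Msg::ToolCalls(vec![1, 2]),
        Msg::ToolResult(1),
        Msg::ToolResult(2),
        Msg::Text,
    ];
    // No boundary may fall between the tool calls and their results.
    assert_eq!(safe_boundaries(&msgs), vec![1, 4, 5]);
}
```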

CompactionConfig

The compaction subsystem is configured through CompactionConfig, stored in the agent spec’s sections["compaction"] and read via CompactionConfigKey:

use awaken::CompactionConfig;

let config = CompactionConfig {
    summarizer_system_prompt: "You are a conversation summarizer. \
        Preserve all key facts, decisions, tool results, and action items. \
        Be concise but complete.".into(),
    summarizer_user_prompt: "Summarize the following conversation:\n\n{messages}".into(),
    summary_max_tokens: Some(1024),
    summary_model: Some("claude-3-haiku".into()),
    min_savings_ratio: 0.3,
};
| Field | Type | Default | Description |
|---|---|---|---|
| summarizer_system_prompt | String | Conversation summarizer prompt | System prompt for the summarizer LLM call |
| summarizer_user_prompt | String | "Summarize...\n\n{messages}" | User prompt template; {messages} is replaced with the conversation transcript |
| summary_max_tokens | Option<u32> | None | Maximum tokens for the summary response |
| summary_model | Option<String> | None | Model for summarization (defaults to the agent’s model) |
| min_savings_ratio | f64 | 0.3 | Minimum token savings ratio (0.0-1.0) to accept a compaction |

The compaction pass only runs when the expected savings ratio exceeds min_savings_ratio. A minimum gain of 1024 tokens (MIN_COMPACTION_GAIN_TOKENS) is also required to justify the summarization LLM call.
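The two gates above combine into one acceptance check. A sketch assuming pre/post token estimates are available up front; the function name is illustrative:

```rust
const MIN_COMPACTION_GAIN_TOKENS: usize = 1024;

/// Sketch of the acceptance check described above: compaction runs only
/// if both the savings ratio and the absolute gain clear their floors.
fn should_compact(pre_tokens: usize, post_tokens: usize, min_savings_ratio: f64) -> bool {
    if pre_tokens == 0 || post_tokens >= pre_tokens {
        return false; // no savings at all
    }
    let gain = pre_tokens - post_tokens;
    let ratio = gain as f64 / pre_tokens as f64;
    ratio >= min_savings_ratio && gain >= MIN_COMPACTION_GAIN_TOKENS
}

fn main() {
    // 50% savings and a large absolute gain: accepted.
    assert!(should_compact(10_000, 5_000, 0.3));
    // Ratio too small: rejected.
    assert!(!should_compact(10_000, 9_500, 0.3));
    // Good ratio but under the 1024-token minimum gain: rejected.
    assert!(!should_compact(2_000, 1_200, 0.3));
}
```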

DefaultSummarizer

The built-in DefaultSummarizer reads prompts from CompactionConfig and supports cumulative summarization. When a previous summary exists, it asks the LLM to update the existing summary with new conversation rather than re-summarizing everything from scratch.

The transcript renderer filters out Visibility::Internal messages before sending to the summarizer, since system-injected context is re-injected each turn and should not be included in summaries.

Summary storage

Compaction summaries are stored as <conversation-summary> tagged internal system messages. On load, trim_to_compaction_boundary drops all messages before the latest summary message, so already-summarized history is never re-loaded into the context window.

Compaction boundaries are tracked durably via CompactionState, recording the summary text, pre/post token counts, and timestamp for each compaction event.

Truncation recovery

When the LLM stops due to MaxTokens with incomplete tool calls (argument JSON was truncated mid-generation), the runtime can automatically retry by injecting a continuation prompt asking the model to break its work into smaller pieces and continue. The retry count is tracked by TruncationState and bounded by a configurable maximum.

Key Files

  • crates/awaken-contract/src/contract/inference.rs – ContextWindowPolicy, ContextCompactionMode
  • crates/awaken-runtime/src/context/transform/mod.rs – ContextTransform (truncation)
  • crates/awaken-runtime/src/context/transform/compaction.rs – artifact compaction
  • crates/awaken-runtime/src/context/compaction.rs – boundary finding, load-time trimming
  • crates/awaken-runtime/src/context/summarizer.rs – ContextSummarizer, DefaultSummarizer
  • crates/awaken-runtime/src/context/plugin.rs – CompactionPlugin, CompactionConfig, CompactionState
  • crates/awaken-runtime/src/context/truncation.rs – TruncationState, continuation prompts

State Keys

The state system provides typed, scoped, persistent key-value storage for agent runs. Plugins and tools define state keys at compile time; the runtime manages snapshots, persistence, and parallel merge semantics.

StateKey trait

Every state slot is identified by a type implementing StateKey.

pub trait StateKey: 'static + Send + Sync {
    /// Unique string identifier for serialization.
    const KEY: &'static str;

    /// Parallel merge strategy. Default: `Exclusive`.
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;

    /// Lifetime scope. Default: `Run`.
    const SCOPE: KeyScope = KeyScope::Run;

    /// The stored value type.
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;

    /// The update command type.
    type Update: Send + 'static;

    /// Apply an update to the current value.
    fn apply(value: &mut Self::Value, update: Self::Update);

    /// Serialize value to JSON. Default uses serde_json.
    fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;

    /// Deserialize value from JSON. Default uses serde_json.
    fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}

Crate path: awaken::state::StateKey (re-exported at awaken::StateKey)

Example

struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "counter";
    type Value = usize;
    type Update = usize;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

KeyScope

Controls when a key’s value is cleared relative to run boundaries.

pub enum KeyScope {
    /// Cleared at run start (default).
    Run,
    /// Persists across runs on the same thread.
    Thread,
}

MergeStrategy

Determines how concurrent updates to the same key are handled when merging MutationBatches from parallel tool execution.

pub enum MergeStrategy {
    /// Concurrent writes to this key conflict. Parallel batches that both
    /// touch this key cannot be merged.
    Exclusive,
    /// Updates are commutative -- they can be applied in any order. Parallel
    /// batches that both touch this key will have their ops concatenated.
    Commutative,
}
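Why concatenation is safe for Commutative keys: commutative updates produce the same value regardless of application order, so two parallel batches can simply have their ops appended. A self-contained sketch using counter increments (as in the Counter example above):

```rust
/// Apply a sequence of increment updates to a counter value.
fn apply_all(initial: usize, updates: &[usize]) -> usize {
    updates.iter().fold(initial, |v, u| v + u)
}

fn main() {
    // Two batches produced by parallel tool executions.
    let batch_a = [1, 2];
    let batch_b = [10];
    // Concatenating in either order yields the same final value,
    // so the runtime can merge the batches without a conflict.
    let ab: Vec<usize> = batch_a.iter().chain(&batch_b).copied().collect();
    let ba: Vec<usize> = batch_b.iter().chain(&batch_a).copied().collect();
    assert_eq!(apply_all(0, &ab), apply_all(0, &ba));
}
```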

StateMap

A type-erased container for state values. Backed by TypedMap from the typedmap crate.

pub struct StateMap { /* ... */ }

Methods

/// Check if a key is present.
fn contains<K: StateKey>(&self) -> bool

/// Get a reference to the value.
fn get<K: StateKey>(&self) -> Option<&K::Value>

/// Get a mutable reference to the value.
fn get_mut<K: StateKey>(&mut self) -> Option<&mut K::Value>

/// Insert a value, replacing any existing one.
fn insert<K: StateKey>(&mut self, value: K::Value)

/// Remove and return the value.
fn remove<K: StateKey>(&mut self) -> Option<K::Value>

/// Get a mutable reference, inserting `Default::default()` if absent.
fn get_or_insert_default<K: StateKey>(&mut self) -> &mut K::Value

StateMap implements Clone and Default.
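The type-erased container idea can be illustrated with std alone. This is a minimal sketch of how a map keyed by marker types works (the real StateMap uses the typedmap crate and the full StateKey trait):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Minimal stand-in for the key trait: a marker type binds a value type.
trait Key: 'static {
    type Value: Clone + Default + 'static;
}

// Type-erased map: each key type stores one value under its TypeId.
#[derive(Default)]
struct Map(HashMap<TypeId, Box<dyn Any>>);

impl Map {
    fn insert<K: Key>(&mut self, value: K::Value) {
        self.0.insert(TypeId::of::<K>(), Box::new(value));
    }
    fn get<K: Key>(&self) -> Option<&K::Value> {
        self.0.get(&TypeId::of::<K>())?.downcast_ref()
    }
    fn get_or_insert_default<K: Key>(&mut self) -> &mut K::Value {
        self.0
            .entry(TypeId::of::<K>())
            .or_insert_with(|| Box::new(K::Value::default()))
            .downcast_mut()
            .expect("value stored under its own TypeId")
    }
}

struct Counter;
impl Key for Counter {
    type Value = usize;
}

fn main() {
    let mut map = Map::default();
    *map.get_or_insert_default::<Counter>() += 3;
    assert_eq!(map.get::<Counter>(), Some(&3));
}
```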

Snapshot

An immutable, versioned view of the state map. Passed to tools via ToolCallContext.

pub struct Snapshot {
    pub revision: u64,
    pub ext: Arc<StateMap>,
}

Methods

fn new(revision: u64, ext: Arc<StateMap>) -> Self
fn revision(&self) -> u64
fn get<K: StateKey>(&self) -> Option<&K::Value>
fn ext(&self) -> &StateMap

Note: Snapshot is not a type alias; it is a concrete struct wrapping Arc<StateMap>.

StateKeyOptions

Declarative options used when registering a state key with the runtime.

pub struct StateKeyOptions {
    /// Whether the key is persisted to the store. Default: `true`.
    pub persistent: bool,
    /// Whether the key survives plugin uninstall. Default: `false`.
    pub retain_on_uninstall: bool,
    /// Lifetime scope. Default: `KeyScope::Run`.
    pub scope: KeyScope,
}

PersistedState

Serialized form of the state map used by storage backends.

pub struct PersistedState {
    pub revision: u64,
    pub extensions: HashMap<String, JsonValue>,
}

Shared State (ProfileKey + StateScope)

For cross-boundary persistent state with dynamic scoping. Shared state uses ProfileKey – the same trait used for profile data – combined with a key: &str parameter for the runtime key dimension. Unlike StateKey, shared state is accessed asynchronously through ProfileAccess and does not participate in the snapshot/mutation workflow.

pub trait ProfileKey: 'static + Send + Sync {
    /// Namespace identifier (used as the storage namespace).
    const KEY: &'static str;

    /// The value type stored under this key.
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;

    /// Serialize value to JSON.
    fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;

    /// Deserialize value from JSON.
    fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}

The two dimensions for both shared and profile state are:

  • Namespace (ProfileKey::KEY) – compile-time, binds to a Value type
  • Key (&str parameter) – runtime, identifies which instance

For shared state, the key is typically a StateScope string; for profile state, it is an agent name or "system".

Crate path: awaken_contract::ProfileKey (re-exported from awaken_contract::contract::profile_store)

StateScope

pub struct StateScope(String);

Optional convenience builder for common key string patterns. Constructors:

| Method | Produced String | Use Case |
| --- | --- | --- |
| StateScope::global() | "global" | System-wide shared state |
| StateScope::parent_thread(id) | "parent_thread::{id}" | Parent-child agent sharing |
| StateScope::agent_type(name) | "agent_type::{name}" | Per-agent-type sharing |
| StateScope::thread(id) | "thread::{id}" | Thread-local persistent state |
| StateScope::new(s) | "{s}" | Arbitrary grouping |

Call .as_str() to get the key string. Users can also pass any raw &str directly.
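A minimal stand-in reproducing the string patterns from the table above (not the awaken type itself, which may differ in detail):

```rust
// Stand-in for awaken's StateScope: a newtype over the key string.
struct StateScope(String);

impl StateScope {
    fn global() -> Self { StateScope("global".into()) }
    fn parent_thread(id: &str) -> Self { StateScope(format!("parent_thread::{id}")) }
    fn agent_type(name: &str) -> Self { StateScope(format!("agent_type::{name}")) }
    fn thread(id: &str) -> Self { StateScope(format!("thread::{id}")) }
    fn new(s: &str) -> Self { StateScope(s.to_string()) }
    fn as_str(&self) -> &str { &self.0 }
}

fn main() {
    assert_eq!(StateScope::global().as_str(), "global");
    assert_eq!(StateScope::thread("t1").as_str(), "thread::t1");
    assert_eq!(StateScope::agent_type("coder").as_str(), "agent_type::coder");
}
```

Because the scope is just a string key, any raw &str works equally well; the constructors only standardize the common patterns.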

ProfileAccess Methods

ProfileAccess methods take key: &str for the runtime key dimension. The same methods serve both shared and profile state:

impl ProfileAccess {
    async fn read<K: ProfileKey>(&self, key: &str) -> Result<K::Value, StorageError>;
    async fn write<K: ProfileKey>(&self, key: &str, value: &K::Value) -> Result<(), StorageError>;
    async fn delete<K: ProfileKey>(&self, key: &str) -> Result<(), StorageError>;
}

// Shared state usage:
let scope = StateScope::global();
let value = access.read::<MyKey>(scope.as_str()).await?;
access.write::<MyKey>(scope.as_str(), &value).await?;

// Profile state usage:
let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".into()).await?;

Registration

impl PluginRegistrar {
    /// Register a profile key (used for both profile state and shared state).
    pub fn register_profile_key<K: ProfileKey>(&mut self) -> Result<(), StateError>;
}

Thread Model

Threads represent persistent conversations. A thread holds metadata only; messages are stored and loaded separately through the ThreadStore trait.

Thread

pub struct Thread {
    /// Unique thread identifier (UUID v7).
    pub id: String,
    /// Thread metadata.
    pub metadata: ThreadMetadata,
}

Crate path: awaken::contract::thread::Thread (re-exported from awaken-contract)

Constructors

/// Create with a generated UUID v7 identifier.
fn new() -> Self

/// Create with a specific identifier.
fn with_id(id: impl Into<String>) -> Self

Builder methods

fn with_title(self, title: impl Into<String>) -> Self

Thread implements Default (delegates to Thread::new()), Clone, Serialize, and Deserialize.

ThreadMetadata

pub struct ThreadMetadata {
    /// Creation timestamp (unix millis).
    pub created_at: Option<u64>,
    /// Last update timestamp (unix millis).
    pub updated_at: Option<u64>,
    /// Optional thread title.
    pub title: Option<String>,
    /// Custom metadata key-value pairs.
    pub custom: HashMap<String, Value>,
}

All Option fields are omitted from JSON when None. The custom map is omitted when empty.

ThreadMetadata implements Default, Clone, Serialize, and Deserialize.

Storage

Messages are not stored inside the Thread struct. They are managed through the ThreadStore trait:

#[async_trait]
pub trait ThreadStore: Send + Sync {
    async fn load_thread(&self, thread_id: &str) -> Result<Option<Thread>, StorageError>;
    async fn save_thread(&self, thread: &Thread) -> Result<(), StorageError>;
    async fn delete_thread(&self, thread_id: &str) -> Result<(), StorageError>;
    async fn list_threads(&self, offset: usize, limit: usize) -> Result<Vec<String>, StorageError>;
    async fn load_messages(&self, thread_id: &str) -> Result<Option<Vec<Message>>, StorageError>;
    async fn save_messages(&self, thread_id: &str, messages: &[Message]) -> Result<(), StorageError>;
    async fn delete_messages(&self, thread_id: &str) -> Result<(), StorageError>;
    async fn update_thread_metadata(&self, id: &str, metadata: ThreadMetadata) -> Result<(), StorageError>;
}
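The shape of the trait can be illustrated with a synchronous in-memory sketch. The Thread and Message types here are simplified stand-ins, and the async/error-handling details of the real trait are omitted:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Simplified stand-ins; the real types live in awaken-contract.
#[derive(Clone, PartialEq, Debug)]
struct Thread { id: String, title: Option<String> }
type Message = String;

// Threads and messages live in separate maps, mirroring the split
// between load_thread/save_thread and load_messages/save_messages.
struct MemoryThreadStore {
    threads: Mutex<HashMap<String, Thread>>,
    messages: Mutex<HashMap<String, Vec<Message>>>,
}

impl MemoryThreadStore {
    fn new() -> Self {
        MemoryThreadStore {
            threads: Mutex::new(HashMap::new()),
            messages: Mutex::new(HashMap::new()),
        }
    }
    fn save_thread(&self, thread: &Thread) {
        self.threads.lock().unwrap().insert(thread.id.clone(), thread.clone());
    }
    fn load_thread(&self, thread_id: &str) -> Option<Thread> {
        self.threads.lock().unwrap().get(thread_id).cloned()
    }
    fn save_messages(&self, thread_id: &str, msgs: &[Message]) {
        self.messages.lock().unwrap().insert(thread_id.to_string(), msgs.to_vec());
    }
    fn load_messages(&self, thread_id: &str) -> Option<Vec<Message>> {
        self.messages.lock().unwrap().get(thread_id).cloned()
    }
}

fn main() {
    let store = MemoryThreadStore::new();
    let t = Thread { id: "t1".into(), title: Some("demo".into()) };
    store.save_thread(&t);
    store.save_messages("t1", &["hello".to_string()]);
    assert_eq!(store.load_thread("t1"), Some(t));
    assert_eq!(store.load_messages("t1"), Some(vec!["hello".to_string()]));
}
```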

ThreadRunStore

Extends ThreadStore + RunStore with an atomic checkpoint operation.

#[async_trait]
pub trait ThreadRunStore: ThreadStore + RunStore + Send + Sync {
    /// Persist thread messages and run record atomically.
    async fn checkpoint(
        &self,
        thread_id: &str,
        messages: &[Message],
        run: &RunRecord,
    ) -> Result<(), StorageError>;
}

RunStore

Run record persistence for tracking run history and enabling resume.

#[async_trait]
pub trait RunStore: Send + Sync {
    async fn create_run(&self, record: &RunRecord) -> Result<(), StorageError>;
    async fn load_run(&self, run_id: &str) -> Result<Option<RunRecord>, StorageError>;
    async fn latest_run(&self, thread_id: &str) -> Result<Option<RunRecord>, StorageError>;
    async fn list_runs(&self, query: &RunQuery) -> Result<RunPage, StorageError>;
}

RunRecord

pub struct RunRecord {
    pub run_id: String,
    pub thread_id: String,
    pub agent_id: String,
    pub parent_run_id: Option<String>,
    pub status: RunStatus,
    pub termination_code: Option<String>,
    pub created_at: u64,        // unix seconds
    pub updated_at: u64,        // unix seconds
    pub steps: usize,
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub state: Option<PersistedState>,
}

Operate

This path is for hardening an agent service once the happy path already works.

  1. Enable Observability to make runs, tools, and providers visible.
  2. Enable Tool Permission HITL to add approval control over tool execution.
  3. Configure Stop Policies to keep agent loops bounded and predictable.
  4. Report Tool Progress and Testing Strategy to improve operator visibility and confidence.

Enable Tool Permission HITL

Use this when you need to control which tools an agent can invoke, with human-in-the-loop approval for sensitive operations.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature permission enabled on the awaken crate (enabled by default)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["permission"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"

Steps

  1. Register the permission plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_permission::PermissionPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let agent_spec = AgentSpec::new("my-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("permission");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("permission", Arc::new(PermissionPlugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

The plugin registers a BeforeToolExecute phase hook that evaluates permission rules before every tool call.

  2. Define permission rules inline.
use awaken::ext_permission::{PermissionRulesConfig, PermissionRuleEntry, ToolPermissionBehavior};

let config = PermissionRulesConfig {
    default_behavior: ToolPermissionBehavior::Ask,
    rules: vec![
        PermissionRuleEntry {
            tool: "read_file".into(),
            behavior: ToolPermissionBehavior::Allow,
            scope: Default::default(),
        },
        PermissionRuleEntry {
            tool: "file_*".into(),
            behavior: ToolPermissionBehavior::Ask,
            scope: Default::default(),
        },
        PermissionRuleEntry {
            tool: "delete_*".into(),
            behavior: ToolPermissionBehavior::Deny,
            scope: Default::default(),
        },
    ],
};

let ruleset = config.into_ruleset().expect("invalid rules");

  3. Load rules from a YAML file (alternative).

    Create permissions.yaml:

default_behavior: ask
rules:
  - tool: "read_file"
    behavior: allow
  - tool: "Bash(npm *)"
    behavior: allow
  - tool: "file_*"
    behavior: ask
  - tool: "delete_*"
    behavior: deny
  - tool: "mcp__*"
    behavior: ask

Load it in code:

use awaken::ext_permission::PermissionRulesConfig;

let config = PermissionRulesConfig::from_file("permissions.yaml")
    .expect("failed to load permissions");
let ruleset = config.into_ruleset().expect("invalid rules");

  4. Configure via agent spec.

    Permission rules can also be embedded in the agent spec using the permission plugin config key:

use awaken::registry_spec::AgentSpec;

let agent_spec = AgentSpec::new("my-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("permission");
  5. Understand rule evaluation.

    Rules are evaluated with firewall-like priority:

  1. Deny – highest priority, blocks the tool call immediately

  2. Allow – permits the tool call without user interaction

  3. Ask – suspends the tool call and waits for human approval

The pattern syntax supports:

| Pattern | Matches |
| --- | --- |
| read_file | Exact tool name |
| file_* | Glob on tool name |
| mcp__github__* | Glob for MCP-prefixed tools |
| Bash(npm *) | Tool name with primary argument glob |
| Edit(file_path ~ "src/**") | Named field glob |
| Bash(command =~ "(?i)rm") | Named field regex |
| /mcp__(gh\|gl)__.*/ | Regex tool name |
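The core of name-based glob matching can be sketched in a few lines. This simplified matcher handles only exact names and a trailing "*" glob; the real awaken-tool-pattern crate also covers argument patterns, named-field globs, and regexes:

```rust
// Simplified matcher: exact names plus a trailing "*" glob only.
fn tool_matches(pattern: &str, tool_name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => tool_name.starts_with(prefix),
        None => pattern == tool_name,
    }
}

fn main() {
    assert!(tool_matches("read_file", "read_file"));
    assert!(tool_matches("file_*", "file_write"));
    assert!(tool_matches("mcp__github__*", "mcp__github__create_issue"));
    assert!(!tool_matches("delete_*", "read_file"));
}
```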

Verify

  1. Register a tool that matches a deny rule and attempt to invoke it.
  2. The tool call should be blocked and a ToolInterceptAction event emitted.
  3. Register a tool matching an ask rule. The run should suspend, waiting for human approval via the mailbox.
  4. Send approval through the mailbox endpoint and confirm the run resumes.

Common Errors

| Symptom | Cause | Fix |
| --- | --- | --- |
| All tools blocked | default_behavior: deny with no allow rules | Add explicit allow rules for safe tools |
| Rules not evaluated | Plugin not registered | Register PermissionPlugin and add "permission" to plugin_ids in agent spec |
| Invalid pattern error | Malformed glob or regex | Check syntax against the pattern table above |
| Ask rule never resolves | No mailbox consumer | Wire up a frontend or API client to respond to mailbox items |

Tests

  • crates/awaken-ext-permission/tests/

Key Files

| Path | Purpose |
| --- | --- |
| crates/awaken-ext-permission/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-permission/src/config.rs | PermissionRulesConfig and YAML/JSON loading |
| crates/awaken-ext-permission/src/rules.rs | Pattern syntax, ToolPermissionBehavior, rule evaluation |
| crates/awaken-ext-permission/src/plugin/plugin.rs | PermissionPlugin registration |
| crates/awaken-ext-permission/src/plugin/checker.rs | PermissionInterceptHook (BeforeToolExecute) |
| crates/awaken-tool-pattern/ | Shared glob/regex pattern matching library |

Configure Stop Policies

Use this when you need to control when an agent run terminates based on round count, token usage, elapsed time, or error frequency.

Prerequisites

  • awaken crate added to Cargo.toml
  • Familiarity with Plugin and AgentRuntimeBuilder

Overview

Stop policies evaluate after each inference step and decide whether the run should continue or terminate. The system provides four built-in policies and a trait for custom implementations.

Built-in policies:

| Policy | Triggers when |
| --- | --- |
| MaxRoundsPolicy | Step count exceeds max rounds |
| TokenBudgetPolicy | Total tokens (input + output) exceed max_total |
| TimeoutPolicy | Elapsed wall time exceeds max_ms milliseconds |
| ConsecutiveErrorsPolicy | Consecutive inference errors reach max |

Steps

  1. Create policies programmatically.
use std::sync::Arc;
use awaken::policies::{
    MaxRoundsPolicy, TokenBudgetPolicy, TimeoutPolicy,
    ConsecutiveErrorsPolicy, StopPolicy,
};

let policies: Vec<Arc<dyn StopPolicy>> = vec![
    Arc::new(MaxRoundsPolicy::new(25)),
    Arc::new(TokenBudgetPolicy::new(100_000)),
    Arc::new(TimeoutPolicy::new(300_000)), // 5 minutes in ms
    Arc::new(ConsecutiveErrorsPolicy::new(3)),
];

  2. Register a StopConditionPlugin with the runtime builder.
use awaken::policies::StopConditionPlugin;
use awaken::AgentRuntimeBuilder;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("stop-condition", Arc::new(StopConditionPlugin::new(policies)))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

For the common case of limiting rounds only, use the convenience wrapper:

use awaken::policies::MaxRoundsPlugin;

let runtime = AgentRuntimeBuilder::new()
    .with_plugin("stop-condition:max-rounds", Arc::new(MaxRoundsPlugin::new(10)))
    .with_agent_spec(spec)
    .with_provider("anthropic", Arc::new(provider))
    .build()?;

  3. Use declarative StopConditionSpec values.

The policies_from_specs function converts declarative specs into policy instances. This is useful when loading configuration from JSON or YAML.

use awaken_contract::contract::lifecycle::StopConditionSpec;
use awaken::policies::{policies_from_specs, StopConditionPlugin};

let specs = vec![
    StopConditionSpec::MaxRounds { rounds: 10 },
    StopConditionSpec::Timeout { seconds: 300 },
    StopConditionSpec::TokenBudget { max_total: 100_000 },
    StopConditionSpec::ConsecutiveErrors { max: 3 },
];

let policies = policies_from_specs(&specs);
let plugin = StopConditionPlugin::new(policies);

The full set of StopConditionSpec variants:

pub enum StopConditionSpec {
    MaxRounds { rounds: usize },
    Timeout { seconds: u64 },
    TokenBudget { max_total: usize },
    ConsecutiveErrors { max: usize },
    StopOnTool { tool_name: String },      // not yet implemented
    ContentMatch { pattern: String },       // not yet implemented
    LoopDetection { window: usize },        // not yet implemented
}

StopOnTool, ContentMatch, and LoopDetection are defined in the contract but not yet backed by policy implementations. policies_from_specs silently skips unimplemented variants.
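The spec-to-policy conversion, including the skip behavior and the seconds-to-milliseconds unit change, can be sketched with simplified stand-in types (the real conversion lives in policies_from_specs):

```rust
// Stand-in spec with one unimplemented variant, for illustration.
enum StopConditionSpec {
    MaxRounds { rounds: usize },
    Timeout { seconds: u64 },
    StopOnTool { tool_name: String }, // not yet implemented
}

// Stand-in for the resulting policy instances.
#[derive(Debug, PartialEq)]
enum PolicyKind {
    MaxRounds(usize),
    TimeoutMs(u64),
}

fn policies_from(specs: &[StopConditionSpec]) -> Vec<PolicyKind> {
    specs
        .iter()
        .filter_map(|s| match s {
            StopConditionSpec::MaxRounds { rounds } => Some(PolicyKind::MaxRounds(*rounds)),
            // Note the unit conversion: the spec takes seconds,
            // the policy takes milliseconds.
            StopConditionSpec::Timeout { seconds } => Some(PolicyKind::TimeoutMs(seconds * 1000)),
            // Unimplemented variants are silently skipped.
            StopConditionSpec::StopOnTool { .. } => None,
        })
        .collect()
}

fn main() {
    let specs = vec![
        StopConditionSpec::MaxRounds { rounds: 10 },
        StopConditionSpec::Timeout { seconds: 300 },
        StopConditionSpec::StopOnTool { tool_name: "finish".into() },
    ];
    let policies = policies_from(&specs);
    assert_eq!(policies, vec![PolicyKind::MaxRounds(10), PolicyKind::TimeoutMs(300_000)]);
}
```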

The StopPolicy Trait

Implement StopPolicy to create custom stop conditions:

use awaken::policies::{StopPolicy, StopDecision, StopPolicyStats};

pub struct MyCustomPolicy {
    pub threshold: u64,
}

impl StopPolicy for MyCustomPolicy {
    fn id(&self) -> &str {
        "my_custom"
    }

    fn evaluate(&self, stats: &StopPolicyStats) -> StopDecision {
        if stats.total_output_tokens > self.threshold {
            StopDecision::Stop {
                code: "my_custom".into(),
                detail: format!("output tokens {} exceeded {}", stats.total_output_tokens, self.threshold),
            }
        } else {
            StopDecision::Continue
        }
    }
}

The trait requires Send + Sync + 'static. Evaluation must be synchronous – it is pure computation on the provided stats.

StopPolicyStats

Every policy receives a StopPolicyStats snapshot with fields populated by the internal StopConditionHook:

| Field | Type | Description |
| --- | --- | --- |
| step_count | u32 | Number of inference steps completed so far |
| total_input_tokens | u64 | Cumulative prompt tokens across all steps |
| total_output_tokens | u64 | Cumulative completion tokens across all steps |
| elapsed_ms | u64 | Wall time since the first step, in milliseconds |
| consecutive_errors | u32 | Current streak of consecutive inference errors (resets on success) |
| last_tool_names | Vec<String> | Tool names called in the most recent inference response |
| last_response_text | String | Text content of the most recent inference response |

StopDecision

pub enum StopDecision {
    Continue,
    Stop { code: String, detail: String },
}

When any policy returns StopDecision::Stop, the hook converts it to TerminationReason::Stopped with the given code and detail, then updates the run lifecycle to Done. The agent loop exits after the current step. Policies are evaluated in order; the first Stop wins.
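The first-Stop-wins ordering can be sketched with closures standing in for policies (the Stats and Decision types below are simplified stand-ins for StopPolicyStats and StopDecision):

```rust
// Simplified stand-ins for the policy types in awaken::policies.
struct Stats { step_count: u32, total_tokens: u64 }

enum Decision { Continue, Stop { code: String } }

// Evaluate policies in registration order; the first Stop wins.
fn evaluate_all(policies: &[Box<dyn Fn(&Stats) -> Decision>], stats: &Stats) -> Decision {
    for p in policies {
        if let Decision::Stop { code } = p(stats) {
            return Decision::Stop { code };
        }
    }
    Decision::Continue
}

fn main() {
    let policies: Vec<Box<dyn Fn(&Stats) -> Decision>> = vec![
        Box::new(|s| if s.step_count >= 25 {
            Decision::Stop { code: "max_rounds".into() }
        } else {
            Decision::Continue
        }),
        Box::new(|s| if s.total_tokens >= 100_000 {
            Decision::Stop { code: "token_budget".into() }
        } else {
            Decision::Continue
        }),
    ];

    let stats = Stats { step_count: 30, total_tokens: 200_000 };
    // Both policies would stop here, but the first one registered wins.
    match evaluate_all(&policies, &stats) {
        Decision::Stop { code } => assert_eq!(code, "max_rounds"),
        Decision::Continue => unreachable!(),
    }
}
```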

How Stop Policies Interact with the Agent Loop

  1. The StopConditionPlugin registers a PhaseHook on Phase::AfterInference.
  2. After each LLM inference, the hook increments step_count, accumulates token usage, and tracks consecutive errors in StopConditionStatsState.
  3. The hook builds a StopPolicyStats snapshot and calls evaluate on each registered policy.
  4. If any policy returns Stop, the hook emits a RunLifecycleUpdate::Done state command with the stop code, which terminates the run.
  5. If all policies return Continue, the agent loop proceeds to the next step.

A policy with max or max_total set to 0 is treated as disabled and always returns Continue.

Common Errors

| Error | Cause | Fix |
| --- | --- | --- |
| Run never stops | No stop policy registered and LLM keeps calling tools | Register at least MaxRoundsPolicy or MaxRoundsPlugin |
| StateError::KeyAlreadyRegistered | Both StopConditionPlugin and MaxRoundsPlugin registered | Use only one; they share the same state key |
| Timeout fires too early | TimeoutPolicy takes milliseconds, StopConditionSpec::Timeout takes seconds | When using TimeoutPolicy::new() directly, pass milliseconds |

Key Files

  • crates/awaken-runtime/src/policies/mod.rs – module root and public exports
  • crates/awaken-runtime/src/policies/policy.rs – StopPolicy trait, built-in policies, policies_from_specs
  • crates/awaken-runtime/src/policies/plugin.rs – StopConditionPlugin and MaxRoundsPlugin
  • crates/awaken-runtime/src/policies/state.rs – StopConditionStatsState and its state key
  • crates/awaken-runtime/src/policies/hook.rs – internal StopConditionHook that drives evaluation
  • crates/awaken-contract/src/contract/lifecycle.rs – StopConditionSpec enum

Enable Observability

Use this when you need to trace LLM inference calls and tool executions with OpenTelemetry-compatible telemetry.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • Feature observability enabled on the awaken crate (enabled by default)
  • For OTel export: feature otel enabled on awaken-ext-observability, plus a configured OTel collector
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["observability"] }
tokio = { version = "1", features = ["full"] }

Steps

  1. Register with the in-memory sink (development).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, InMemorySink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};

let sink = InMemorySink::new();
let obs_plugin = ObservabilityPlugin::new(sink.clone());
let agent_spec = AgentSpec::new("observed-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("observability");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

After a run completes, inspect collected metrics:

let metrics = sink.metrics();
println!("inferences: {}", metrics.inferences.len());
println!("tool calls: {}", metrics.tools.len());
println!("total input tokens: {}", metrics.total_input_tokens());
println!("total output tokens: {}", metrics.total_output_tokens());
println!("tool failures: {}", metrics.tool_failures());
println!("session duration: {}ms", metrics.session_duration_ms);

for stat in metrics.stats_by_model() {
    println!("{}: {} calls, {} in / {} out tokens",
        stat.model, stat.inference_count, stat.input_tokens, stat.output_tokens);
}

for stat in metrics.stats_by_tool() {
    println!("{}: {} calls, {} failures",
        stat.name, stat.call_count, stat.failure_count);
}

  2. Register with the OTel sink (production).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, OtelMetricsSink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use opentelemetry_sdk::trace::SdkTracerProvider;

let provider = SdkTracerProvider::builder()
    // configure your exporter (OTLP, Jaeger, etc.)
    .build();
let tracer = provider.tracer("awaken");

let obs_plugin = ObservabilityPlugin::new(OtelMetricsSink::new(tracer));
let agent_spec = AgentSpec::new("observed-agent")
    .with_model("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("observability");

let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(
        "gpt-4o-mini",
        ModelSpec {
            id: "gpt-4o-mini".into(),
            provider: "openai".into(),
            model: "gpt-4o-mini".into(),
        },
    )
    .with_agent_spec(agent_spec)
    .with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

  3. Implement a custom sink (optional).
use awaken::ext_observability::{MetricsSink, GenAISpan, ToolSpan, AgentMetrics};

struct MySink;

impl MetricsSink for MySink {
    fn on_inference(&self, span: &GenAISpan) {
        // forward to your metrics system
    }

    fn on_tool(&self, span: &ToolSpan) {
        // forward to your metrics system
    }

    fn on_run_end(&self, metrics: &AgentMetrics) {
        // emit summary metrics
    }
}

  4. Captured telemetry.

    The plugin hooks into the following phases:

| Phase | Data Captured |
| --- | --- |
| RunStart | Session start timestamp |
| BeforeInference | Inference start timestamp, model, provider |
| AfterInference | Token usage, finish reasons, duration, cache tokens |
| BeforeToolExecute | Tool call start timestamp |
| AfterToolExecute | Tool duration, error status |
| RunEnd | Session duration |

OTel spans follow GenAI semantic conventions with attributes such as gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens.

Verify

  1. Run an agent with the InMemorySink.
  2. After run() completes, call sink.metrics().
  3. Confirm inferences is non-empty and token counts are populated.
  4. For OTel, check your collector or Jaeger UI for spans named with the awaken tracer.

Common Errors

| Symptom | Cause | Fix |
| --- | --- | --- |
| Metrics are all zero | Plugin not registered | Register ObservabilityPlugin with the runtime builder |
| OtelMetricsSink not found | Missing otel feature | Enable the otel feature on awaken-ext-observability |
| No spans in collector | Exporter not configured | Verify SdkTracerProvider has an exporter and is not dropped |
| Token counts missing | LLM provider does not report usage | Check that your LlmExecutor returns TokenUsage in LLMResponse |

Tests

  • crates/awaken-ext-observability/tests/

Key Files

| Path | Purpose |
| --- | --- |
| crates/awaken-ext-observability/src/lib.rs | Module root and public re-exports |
| crates/awaken-ext-observability/src/plugin/plugin.rs | ObservabilityPlugin registration |
| crates/awaken-ext-observability/src/plugin/hooks.rs | Phase hooks for each telemetry point |
| crates/awaken-ext-observability/src/metrics.rs | AgentMetrics, GenAISpan, ToolSpan types |
| crates/awaken-ext-observability/src/sink.rs | MetricsSink trait and InMemorySink |
| crates/awaken-ext-observability/src/otel.rs | OtelMetricsSink with GenAI semantic conventions |

Report Tool Progress

Use this when you need to stream progress updates or activity snapshots from a tool back to the frontend during execution.

Prerequisites

  • A Tool implementation with access to ToolCallContext
  • awaken crate added to Cargo.toml

Steps

  1. Report structured progress with report_progress.

    Call report_progress inside your Tool::execute method to emit a ToolCallProgressState snapshot. The runtime wraps it in an AgentEvent::ActivitySnapshot with activity_type = "tool-call-progress".

use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::contract::progress::ProgressStatus;
use async_trait::async_trait;
use serde_json::{Value, json};

struct IndexTool;

#[async_trait]
impl Tool for IndexTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("index_docs", "Index Docs", "Index a document set")
    }

    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        ctx.report_progress(ProgressStatus::Running, Some("Starting indexing"), None).await;

        for i in 0..10 {
            // ... do work ...
            let fraction = (i + 1) as f64 / 10.0;
            ctx.report_progress(
                ProgressStatus::Running,
                Some(&format!("Indexed batch {}/10", i + 1)),
                Some(fraction),
            ).await;
        }

        ctx.report_progress(ProgressStatus::Done, Some("Indexing complete"), Some(1.0)).await;
        Ok(ToolResult::success("index_docs", json!({"indexed": 10})).into())
    }
}

The progress parameter is a normalized f64 between 0.0 and 1.0. Pass None when progress is indeterminate.

  2. Report custom activity snapshots with report_activity.

    Use report_activity for full-state activity updates that are not structured progress. The runtime emits an AgentEvent::ActivitySnapshot with replace: Some(true).

ctx.report_activity("code-generation", "fn hello() {\n    println!(\"hi\");\n}").await;

The activity_type string identifies the kind of activity. Frontends use it to choose how to render the content.

  3. Report incremental activity deltas with report_activity_delta.

    Use report_activity_delta to send a JSON patch instead of replacing the full content. The runtime emits an AgentEvent::ActivityDelta. If you pass a single JSON value, it is wrapped in an array automatically.

use serde_json::json;

ctx.report_activity_delta(
    "code-generation",
    json!([
        { "op": "add", "path": "/line", "value": "    println!(\"world\");" }
    ]),
).await;

  4. Use ProgressStatus variants to reflect the tool call lifecycle.

| Variant | Meaning |
| --- | --- |
| Pending | Tool call is queued but has not started |
| Running | Tool call is actively executing |
| Done | Tool call completed successfully |
| Failed | Tool call encountered an error |
| Cancelled | Tool call was cancelled before completion |

ProgressStatus serializes to snake_case strings ("pending", "running", "done", "failed", "cancelled").
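A stand-in enum illustrating that mapping (the real type presumably derives its wire strings via serde; this hand-written version is for illustration only):

```rust
// Stand-in mirroring the documented variants and their snake_case strings.
#[derive(Clone, Copy)]
enum ProgressStatus { Pending, Running, Done, Failed, Cancelled }

fn as_wire_str(status: ProgressStatus) -> &'static str {
    match status {
        ProgressStatus::Pending => "pending",
        ProgressStatus::Running => "running",
        ProgressStatus::Done => "done",
        ProgressStatus::Failed => "failed",
        ProgressStatus::Cancelled => "cancelled",
    }
}

fn main() {
    assert_eq!(as_wire_str(ProgressStatus::Running), "running");
    assert_eq!(as_wire_str(ProgressStatus::Cancelled), "cancelled");
}
```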

How it works

report_progress builds a ToolCallProgressState and emits it as an AgentEvent::ActivitySnapshot:

pub struct ToolCallProgressState {
    pub schema: String,           // "tool-call-progress.v1"
    pub node_id: String,          // set to call_id
    pub call_id: String,
    pub tool_name: String,
    pub status: ProgressStatus,
    pub progress: Option<f64>,    // 0.0 - 1.0, None if indeterminate
    pub loaded: Option<u64>,      // absolute loaded count
    pub total: Option<u64>,       // absolute total count
    pub message: Option<String>,
    pub parent_node_id: Option<String>,
    pub parent_call_id: Option<String>,
}

The activity type is the constant TOOL_CALL_PROGRESS_ACTIVITY_TYPE, which equals "tool-call-progress". Optional fields (progress, loaded, total, message, parent_node_id, parent_call_id) are omitted from JSON when None.

All three reporting methods are no-ops when the context has no activity_sink configured.

Verify

Subscribe to the SSE event stream and confirm that ActivitySnapshot events with activity_type = "tool-call-progress" appear while the tool executes, and that the status field transitions from "running" to "done".

Common Errors

| Error | Cause | Fix |
| --- | --- | --- |
| No events appear | activity_sink is None on the context | Ensure the runtime is configured with an event sink |
| Progress not updating in frontend | Frontend does not handle ActivitySnapshot with this activity type | Filter on activity_type == "tool-call-progress" |

Key Files

  • crates/awaken-contract/src/contract/progress.rs – ToolCallProgressState, ProgressStatus, TOOL_CALL_PROGRESS_ACTIVITY_TYPE
  • crates/awaken-contract/src/contract/tool.rs – ToolCallContext::report_progress, report_activity, report_activity_delta

Testing Strategy

Use this when you need to test tools, plugins, state keys, or full agent runs without depending on a live LLM.

Prerequisites

  • awaken crate added to Cargo.toml (with the runtime re-exports)
  • tokio with rt and macros features for async tests
  • serde_json for constructing tool arguments and assertions

1. Unit testing a Tool

Create a ToolCallContext with a test snapshot, call tool.execute(), and assert on the returned ToolOutput.

use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{
    Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};

struct GreetTool;

#[async_trait]
impl Tool for GreetTool {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "greet", "Greet a user by name")
    }

    async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        let name = args["name"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("missing 'name'".into()))?;
        Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn greet_tool_returns_greeting() {
        let tool = GreetTool;
        let ctx = ToolCallContext::test_default();

        let output = tool.execute(json!({"name": "Alice"}), &ctx).await.unwrap();

        assert!(output.result.is_success());
        assert_eq!(output.result.data["greeting"], "Hello, Alice!");
    }

    #[tokio::test]
    async fn greet_tool_rejects_missing_name() {
        let tool = GreetTool;
        let ctx = ToolCallContext::test_default();

        let err = tool.execute(json!({}), &ctx).await.unwrap_err();

        assert!(matches!(err, ToolError::InvalidArguments(_)));
    }
}

When your tool returns side-effects via ToolOutput::with_command(), assert on the command field:

#[tokio::test]
async fn tool_produces_state_command() {
    let tool = CounterMutationTool { increment: 5 };
    let ctx = ToolCallContext::test_default();

    let output = tool.execute(json!({}), &ctx).await.unwrap();

    assert!(output.result.is_success());
    // The command is opaque at this level; integration tests (section 4)
    // verify that commands are applied correctly to the StateStore.
    assert!(!output.command.is_empty());
}

2. Unit testing a Plugin

Verify that a plugin registers the expected state keys and hooks by creating a PluginRegistrar and calling plugin.register().

use awaken::contract::StateError;
use awaken::state::{StateKey, MergeStrategy, StateKeyOptions, StateCommand};
use awaken::plugins::{Plugin, PluginDescriptor, PluginRegistrar};
use serde::{Serialize, Deserialize};

// -- State key --

struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "test.counter";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = usize;
    type Update = usize;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

// -- Plugin --

struct CounterPlugin;

impl Plugin for CounterPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "counter" }
    }

    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_key::<Counter>(StateKeyOptions::default())?;
        Ok(())
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use awaken::state::StateStore;

    #[test]
    fn counter_plugin_registers_key() {
        let store = StateStore::new();
        // install_plugin calls register() internally
        store.install_plugin(CounterPlugin).unwrap();

        // The key should now be readable (returns the default value)
        let val = store.read::<Counter>().unwrap_or_default();
        assert_eq!(val, 0);
    }

    #[test]
    fn counter_plugin_rejects_double_registration() {
        let store = StateStore::new();
        store.install_plugin(CounterPlugin).unwrap();

        // Registering the same key again should fail
        let result = store.install_plugin(CounterPlugin);
        assert!(result.is_err());
    }
}

To test a phase hook directly, build a minimal PhaseContext and inspect the returned StateCommand:

#[tokio::test]
async fn audit_hook_appends_entry() {
    let hook = AuditHook;
    let ctx = PhaseContext::test_default();

    let cmd = hook.run(&ctx).await.unwrap();

    // Commit the command to a test store and verify
    let store = StateStore::new();
    store.install_plugin(AuditPlugin).unwrap();
    store.commit(cmd).unwrap();

    let log = store.read::<AuditLogKey>().unwrap();
    assert!(!log.entries.is_empty());
}

3. Unit testing a StateKey

Test apply() mutations directly without any runtime overhead:

use awaken::state::{StateKey, MergeStrategy};

struct HitCounter;

impl StateKey for HitCounter {
    const KEY: &'static str = "test.hit_counter";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = u64;
    type Update = u64;

    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn apply_increments_counter() {
        let mut value = u64::default();
        HitCounter::apply(&mut value, 5);
        assert_eq!(value, 5);
        HitCounter::apply(&mut value, 3);
        assert_eq!(value, 8);
    }

    #[test]
    fn apply_zero_is_identity() {
        let mut value = u64::default();
        HitCounter::apply(&mut value, 0);
        assert_eq!(value, 0);
    }
}

For keys with complex value types, test edge cases like empty collections or merge conflicts:

#[test]
fn apply_merge_replaces_entries() {
    let mut log = AuditLog { entries: vec!["old".into()] };
    AuditLogKey::apply(&mut log, AuditLog { entries: vec!["new".into()] });
    // Exclusive merge replaces the entire value
    assert_eq!(log.entries, vec!["new"]);
}

4. Integration testing with a mock LLM

Build a full agent runtime with a scripted LlmExecutor that returns canned responses. This is the primary pattern used in the awaken-runtime integration tests.

use std::sync::{Arc, Mutex};
use async_trait::async_trait;
use serde_json::json;

use awaken::contract::content::ContentBlock;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::executor::{InferenceExecutionError, InferenceRequest, LlmExecutor};
use awaken::contract::identity::{RunIdentity, RunOrigin};
use awaken::contract::inference::{StopReason, StreamResult};
use awaken::contract::message::{Message, ToolCall};
use awaken::contract::tool::{
    Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};
use awaken::loop_runner::{AgentLoopParams, LoopStatePlugin, build_agent_env, run_agent_loop};
use awaken::registry::{AgentResolver, ResolvedAgent};
use awaken::state::StateStore;
use awaken::phase::PhaseRuntime;
use awaken::RuntimeError;

// -- Scripted LLM executor --

struct ScriptedLlm {
    responses: Mutex<Vec<StreamResult>>,
}

impl ScriptedLlm {
    fn new(responses: Vec<StreamResult>) -> Self {
        Self {
            responses: Mutex::new(responses),
        }
    }
}

#[async_trait]
impl LlmExecutor for ScriptedLlm {
    async fn execute(
        &self,
        _request: InferenceRequest,
    ) -> Result<StreamResult, InferenceExecutionError> {
        let mut responses = self.responses.lock().unwrap();
        if responses.is_empty() {
            // Fallback: end the conversation
            Ok(StreamResult {
                content: vec![ContentBlock::text("Done.")],
                tool_calls: vec![],
                usage: None,
                stop_reason: Some(StopReason::EndTurn),
                has_incomplete_tool_calls: false,
            })
        } else {
            Ok(responses.remove(0))
        }
    }

    fn name(&self) -> &str {
        "scripted"
    }
}

// -- Resolver --

struct FixedResolver {
    agent: ResolvedAgent,
}

impl AgentResolver for FixedResolver {
    fn resolve(&self, _agent_id: &str) -> Result<ResolvedAgent, RuntimeError> {
        let mut agent = self.agent.clone();
        agent.env = build_agent_env(&[], &agent)?;
        Ok(agent)
    }
}

// -- Helpers --

fn tool_step(calls: Vec<ToolCall>) -> StreamResult {
    StreamResult {
        content: vec![],
        tool_calls: calls,
        usage: None,
        stop_reason: Some(StopReason::ToolUse),
        has_incomplete_tool_calls: false,
    }
}

fn text_step(text: &str) -> StreamResult {
    StreamResult {
        content: vec![ContentBlock::text(text)],
        tool_calls: vec![],
        usage: None,
        stop_reason: Some(StopReason::EndTurn),
        has_incomplete_tool_calls: false,
    }
}

fn test_identity() -> RunIdentity {
    RunIdentity::new(
        "thread-test".into(),
        None,
        "run-test".into(),
        None,
        "agent".into(),
        RunOrigin::User,
    )
}

// -- Test --

#[tokio::test]
async fn tool_call_flow_end_to_end() {
    // Script: LLM calls get_weather, then responds with text
    let llm = Arc::new(ScriptedLlm::new(vec![
        tool_step(vec![ToolCall::new("c1", "get_weather", json!({"city": "Tokyo"}))]),
        text_step("The weather in Tokyo is sunny."),
    ]));

    let agent = ResolvedAgent::new("test", "model", "You are helpful.", llm)
        .with_tool(Arc::new(GetWeatherTool));

    let store = StateStore::new();
    let runtime = PhaseRuntime::new(store.clone()).unwrap();
    store.install_plugin(LoopStatePlugin).unwrap();

    let resolver = FixedResolver { agent };
    let sink = Arc::new(VecEventSink::new());

    let result = run_agent_loop(AgentLoopParams {
        resolver: &resolver,
        agent_id: "test",
        runtime: &runtime,
        sink: sink.clone(),
        checkpoint_store: None,
        messages: vec![Message::user("What's the weather?")],
        run_identity: test_identity(),
        cancellation_token: None,
        decision_rx: None,
        overrides: None,
        frontend_tools: Vec::new(),
    })
    .await
    .unwrap();

    assert_eq!(result.response, "The weather in Tokyo is sunny.");
    assert_eq!(result.steps, 2); // tool step + text step
}

For simpler cases, use the built-in MockLlmExecutor, which returns text-only responses:

use awaken::engine::MockLlmExecutor;

let llm = Arc::new(MockLlmExecutor::new().with_responses(vec!["Hello!".into()]));
let agent = ResolvedAgent::new("test", "model", "system prompt", llm);

5. Testing event streams

Use VecEventSink to capture all events emitted during a run and assert on their sequence and content.

use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::lifecycle::TerminationReason;

#[tokio::test]
async fn events_follow_expected_lifecycle() {
    // ... set up runtime and run agent (see section 4) ...

    let events = sink.take();

    // Verify ordering: RunStart -> StepStart -> ... -> StepEnd -> RunFinish
    assert!(matches!(events.first(), Some(AgentEvent::RunStart { .. })));
    assert!(matches!(events.last(), Some(AgentEvent::RunFinish { .. })));

    // Count specific event types
    let step_starts = events.iter().filter(|e| matches!(e, AgentEvent::StepStart { .. })).count();
    let step_ends = events.iter().filter(|e| matches!(e, AgentEvent::StepEnd)).count();
    assert_eq!(step_starts, step_ends, "every StepStart needs a StepEnd");

    // Verify termination reason
    if let Some(AgentEvent::RunFinish { termination, .. }) = events.last() {
        assert_eq!(*termination, TerminationReason::NaturalEnd);
    }

    // Check that tool call events appear in the correct order
    let has_tool_start = events.iter().any(|e| matches!(e, AgentEvent::ToolCallStart { .. }));
    let has_tool_done = events.iter().any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
    if has_tool_start {
        assert!(has_tool_done, "ToolCallStart without ToolCallDone");
        let start_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallStart { .. })).unwrap();
        let done_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallDone { .. })).unwrap();
        assert!(start_idx < done_idx);
    }
}

A reusable helper for event type extraction (used in the runtime test suite):

fn event_type(e: &AgentEvent) -> &'static str {
    match e {
        AgentEvent::RunStart { .. } => "run_start",
        AgentEvent::RunFinish { .. } => "run_finish",
        AgentEvent::StepStart { .. } => "step_start",
        AgentEvent::StepEnd => "step_end",
        AgentEvent::TextDelta { .. } => "text_delta",
        AgentEvent::ToolCallStart { .. } => "tool_call_start",
        AgentEvent::ToolCallDone { .. } => "tool_call_done",
        AgentEvent::InferenceComplete { .. } => "inference_complete",
        AgentEvent::StateSnapshot { .. } => "state_snapshot",
        _ => "other",
    }
}

let types: Vec<&str> = events.iter().map(event_type).collect();
assert_eq!(types[0], "run_start");
assert_eq!(*types.last().unwrap(), "run_finish");

6. Testing with a real LLM (live tests)

For end-to-end validation against a real provider, use the GenaiExecutor with environment variables for credentials. Mark these tests with #[ignore] so they only run when explicitly requested.

use awaken::engine::GenaiExecutor;

#[tokio::test]
#[ignore] // Run with: cargo test -- --ignored
async fn live_llm_responds() {
    // Requires: OPENAI_API_KEY or (LLM_BASE_URL + LLM_API_KEY)
    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o-mini".into());
    let llm = Arc::new(GenaiExecutor::new());

    let agent = ResolvedAgent::new(
        "live-test",
        &model,
        "You are a test assistant. Answer in one word.",
        llm,
    );

    // ... set up resolver, store, runtime, sink as in section 4 ...

    let result = run_agent_loop(AgentLoopParams {
        resolver: &resolver,
        agent_id: "live-test",
        runtime: &runtime,
        sink: sink.clone(),
        checkpoint_store: None,
        messages: vec![Message::user("What is 2+2? Answer in one word.")],
        run_identity: test_identity(),
        cancellation_token: None,
        decision_rx: None,
        overrides: None,
        frontend_tools: Vec::new(),
    })
    .await
    .unwrap();

    assert!(!result.response.is_empty());
}

Run live tests with:

# OpenAI-compatible provider
OPENAI_API_KEY=<your-key> LLM_MODEL=gpt-4o-mini cargo test -- --ignored

# Custom endpoint (e.g. BigModel)
LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/ \
  LLM_API_KEY=<key> \
  LLM_MODEL=GLM-4.7-Flash \
  cargo test -- --ignored

See examples/live_test.rs and examples/tool_call_live.rs for complete working examples with console output.

Key Files

  • crates/awaken-contract/src/contract/tool.rs – Tool trait, ToolCallContext::test_default(), ToolResult, ToolOutput
  • crates/awaken-contract/src/contract/event_sink.rs – VecEventSink
  • crates/awaken-runtime/src/engine/mock.rs – MockLlmExecutor
  • crates/awaken-runtime/src/state/mod.rs – StateStore, StateCommand
  • crates/awaken-runtime/src/loop_runner/mod.rs – run_agent_loop, AgentLoopParams, AgentRunResult
  • crates/awaken-runtime/tests/ – integration test suite (event lifecycle, tool side effects)

Use Shared State

Use this when agents need to share persistent state across thread boundaries, agent types, or delegation trees. Shared state lives in the ProfileStore and is addressed by a typed namespace (ProfileKey) and a key (&str), giving you fine-grained control over who sees what.

Prerequisites

  • A working awaken agent runtime (see First Agent)
  • A ProfileStore backend configured on the runtime (e.g. file store or Postgres)

Concepts

Shared state has two dimensions:

| Dimension | Type | Purpose |
|---|---|---|
| Namespace | ProfileKey | Defines what is stored — a compile-time binding between a static string key (KEY) and a typed Value. Each key is registered once per plugin via register_profile_key. |
| Key | &str (or StateScope helper) | Defines which instance — a runtime string that partitions storage. Different keys isolate or share data between agents and threads. |

Together, (ProfileKey::KEY, key: &str) uniquely identifies a shared state entry in the profile store.
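The two-dimensional addressing can be pictured as a map keyed by the (namespace, key) pair. This is a standalone model of the storage shape, not the ProfileStore implementation; all names here are illustrative:

```rust
use std::collections::HashMap;

// Illustrative model: a profile store addressed by (ProfileKey::KEY, key).
struct ProfileModel {
    entries: HashMap<(String, String), String>, // value shown as a JSON-ish string
}

impl ProfileModel {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }
    fn write(&mut self, namespace: &str, key: &str, value: &str) {
        self.entries
            .insert((namespace.to_string(), key.to_string()), value.to_string());
    }
    fn read(&self, namespace: &str, key: &str) -> Option<&str> {
        self.entries
            .get(&(namespace.to_string(), key.to_string()))
            .map(|s| s.as_str())
    }
}

fn main() {
    let mut store = ProfileModel::new();
    // Same namespace, different keys: isolated entries.
    store.write("team_context", "parent:thread-1", r#"{"goal":"ship"}"#);
    store.write("team_context", "parent:thread-2", r#"{"goal":"test"}"#);
    assert_eq!(store.read("team_context", "parent:thread-1"), Some(r#"{"goal":"ship"}"#));
    // A mismatched key string reads nothing, which is why mismatched keys
    // silently fall back to Value::default() in the real store.
    assert_eq!(store.read("team_context", "parent:thread-3"), None);
}
```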

Steps

1. Define a shared state key

Create a struct that implements ProfileKey. The KEY constant is the namespace; the Value type is what gets serialized.

use serde::{Deserialize, Serialize};
use awaken_contract::ProfileKey;

#[derive(Clone, Default, Serialize, Deserialize)]
pub struct TeamContext {
    pub goal: String,
    pub constraints: Vec<String>,
}

pub struct TeamContextKey;

impl ProfileKey for TeamContextKey {
    const KEY: &'static str = "team_context";
    type Value = TeamContext;
}

2. Register in a plugin

Inside your plugin’s register method, call register_profile_key on the registrar.

use awaken_contract::StateError;
use awaken_runtime::plugins::registry::PluginRegistrar;

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_profile_key::<TeamContextKey>()?;
    Ok(())
}

3. Read and write in a hook

In any phase hook, obtain ProfileAccess from the context and use read / write with a key string. StateScope is a convenience builder for common key patterns — call .as_str() to get the key.

use awaken_contract::StateScope;

async fn execute(&self, ctx: &mut PhaseContext) -> Result<(), anyhow::Error> {
    let profile = ctx.profile().expect("ProfileStore not configured");
    let identity = ctx.snapshot().run_identity();

    // Build a scope key from the current agent's parent thread
    let scope = match &identity.parent_thread_id {
        Some(pid) => StateScope::parent_thread(pid),
        None => StateScope::global(),
    };

    // Read (returns TeamContext::default() if missing)
    let mut team: TeamContext = profile.read::<TeamContextKey>(scope.as_str()).await?;

    // Mutate and write back
    team.goal = "Ship the feature".into();
    profile.write::<TeamContextKey>(scope.as_str(), &team).await?;

    Ok(())
}

4. Choose the right scope

StateScope has several constructors. Pick the one that matches your sharing pattern:

| Scenario | Scope | Example |
|---|---|---|
| All agents across all threads | StateScope::global() | Org-wide configuration |
| All agents spawned from the same parent thread | StateScope::parent_thread(id) | A delegation tree sharing context |
| All instances of the same agent type | StateScope::agent_type(name) | Planner agents sharing learned heuristics |
| Single thread only | StateScope::thread(id) | Thread-local scratchpad |
| Custom partition | StateScope::new("custom-key") | Any application-defined grouping |

You can also pass any raw &str directly — StateScope is just an optional convenience.

When to use shared state

| Mechanism | Lifetime | Scope | Best for |
|---|---|---|---|
| StateKey | Single run (in-memory snapshot) | One agent thread | Transient per-run state (counters, flags, accumulated context) |
| ProfileKey with agent/system key | Persistent (profile store) | Per-agent or system | Per-agent or per-user settings that don’t cross boundaries |
| ProfileKey with StateScope key | Persistent (profile store) | Any StateScope string | Cross-agent, cross-thread persistent state |

Use ProfileKey with a StateScope key when state must survive across runs and be visible to agents in different threads or of different types.

Common Errors

| Symptom | Cause | Fix |
|---|---|---|
| profile key not registered: <ns> | Key not registered in any plugin | Call r.register_profile_key::<YourKey>() in the plugin’s register method |
| Always reads Value::default() | Writing and reading use different key strings | Verify both sides construct the same StateScope or use the same &str key |
| Data leaks between scopes | Using StateScope::global() when a narrower scope is needed | Switch to parent_thread, agent_type, or thread scope |

Key Files

| Path | Purpose |
|---|---|
| crates/awaken-contract/src/contract/shared_state.rs | StateScope type |
| crates/awaken-contract/src/contract/profile_store.rs | ProfileKey trait, ProfileOwner enum |
| crates/awaken-runtime/src/profile/mod.rs | ProfileAccess with read, write, delete methods |
| crates/awaken-runtime/src/plugins/registry.rs | PluginRegistrar::register_profile_key registration |

Overview

The awaken crate is the public facade for the Awaken agent framework. It re-exports types from the internal awaken-contract and awaken-runtime crates so that downstream code only needs a single dependency.

Module re-exports

| Facade path | Source crate | Contents |
|---|---|---|
| awaken::contract | awaken-contract | Tool trait, events, messages, suspension, lifecycle |
| awaken::model | awaken-contract | Phase, EffectSpec, ScheduledActionSpec, JsonValue |
| awaken::registry_spec | awaken-contract | AgentSpec, ModelSpec, ProviderSpec, McpServerSpec, PluginConfigKey |
| awaken::state | awaken-contract + awaken-runtime | StateKey, StateMap, Snapshot, StateStore, MutationBatch |
| awaken::agent | awaken-runtime | Agent configuration and state |
| awaken::builder | awaken-runtime | AgentRuntimeBuilder, BuildError |
| awaken::context | awaken-runtime | PhaseContext |
| awaken::engine | awaken-runtime | LLM engine abstraction |
| awaken::execution | awaken-runtime | ExecutionEnv |
| awaken::extensions | awaken-runtime | Built-in extension infrastructure |
| awaken::loop_runner | awaken-runtime | Agent loop runner |
| awaken::phase | awaken-runtime | PhaseRuntime, PhaseHook |
| awaken::plugins | awaken-runtime | Plugin, PluginDescriptor, PluginRegistrar |
| awaken::policies | awaken-runtime | Context window and retry policies |
| awaken::registry | awaken-runtime | AgentResolver, ResolvedAgent |
| awaken::runtime | awaken-runtime | AgentRuntime |
| awaken::stores | awaken-stores | File and Postgres store implementations |

Feature-gated modules

| Facade path | Feature flag | Source crate |
|---|---|---|
| awaken::ext_permission | permission | awaken-ext-permission |
| awaken::ext_observability | observability | awaken-ext-observability |
| awaken::ext_mcp | mcp | awaken-ext-mcp |
| awaken::ext_skills | skills | awaken-ext-skills |
| awaken::ext_generative_ui | generative-ui | awaken-ext-generative-ui |
| awaken::ext_reminder | reminder | awaken-ext-reminder |
| awaken::server | server | awaken-server |

Root-level re-exports

The following types are re-exported at the crate root for convenience:

From awaken-contract: AgentSpec, EffectSpec, FailedScheduledActions, JsonValue, KeyScope, MergeStrategy, PendingScheduledActions, PersistedState, Phase, PluginConfigKey, ScheduledActionSpec, Snapshot, StateError, StateKey, StateKeyOptions, StateMap, TypedEffect, UnknownKeyPolicy

From awaken-runtime: AgentResolver, AgentRuntime, AgentRuntimeBuilder, BuildError, CancellationToken, CommitEvent, CommitHook, DEFAULT_MAX_PHASE_ROUNDS, ExecutionEnv, MutationBatch, PhaseContext, PhaseHook, PhaseRuntime, Plugin, PluginDescriptor, PluginRegistrar, ResolvedAgent, RunRequest, RuntimeError, StateCommand, StateStore, TypedEffectHandler, TypedScheduledActionHandler

Feature flags

| Flag | Default | Description |
|---|---|---|
| permission | yes | Tool-level permission gating (HITL) |
| observability | yes | Tracing and metrics integration |
| mcp | yes | MCP (Model Context Protocol) tool bridge |
| skills | yes | Skills subsystem for reusable agent capabilities |
| reminder | yes | Reminder extension for injecting context messages |
| server | yes | HTTP server with SSE streaming and protocol adapters |
| generative-ui | yes | Generative UI component streaming |
| full | yes | Enables all of the above |

Tool Trait

The Tool trait is the primary extension point for giving agents capabilities. Tools are stateless functions that receive JSON arguments and a read-only context, and return a ToolOutput.

Trait definition

#[async_trait]
pub trait Tool: Send + Sync {
    /// Return the descriptor for this tool.
    fn descriptor(&self) -> ToolDescriptor;

    /// Validate arguments before execution. Default: accept all.
    fn validate_args(&self, _args: &Value) -> Result<(), ToolError> {
        Ok(())
    }

    /// Execute the tool with the given arguments and context.
    async fn execute(
        &self,
        args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError>;
}

Crate path: awaken::contract::tool::Tool

ToolDescriptor

Describes a tool’s identity and parameter schema. Registered with the runtime and sent to the LLM as available functions.

pub struct ToolDescriptor {
    pub id: String,
    pub name: String,
    pub description: String,
    /// JSON Schema for parameters.
    pub parameters: Value,
    pub category: Option<String>,
}

Builder methods:

ToolDescriptor::new(id, name, description) -> Self
    .with_parameters(schema: Value) -> Self
    .with_category(category: impl Into<String>) -> Self

ToolResult

Carried inside the ToolOutput returned by Tool::execute. Reports the execution outcome back to the agent loop.

pub struct ToolResult {
    pub tool_name: String,
    pub status: ToolStatus,
    pub data: Value,
    pub message: Option<String>,
    pub suspension: Option<Box<SuspendTicket>>,
}

ToolStatus

pub enum ToolStatus {
    Success,  // Execution succeeded
    Pending,  // Suspended, waiting for external resume
    Error,    // Execution failed; content sent back to LLM
}

Constructors

| Method | Status | Use case |
|---|---|---|
| ToolResult::success(name, data) | Success | Normal completion |
| ToolResult::success_with_message(name, data, msg) | Success | Completion with description |
| ToolResult::error(name, message) | Error | Recoverable failure |
| ToolResult::error_with_code(name, code, message) | Error | Structured error with code |
| ToolResult::suspended(name, message) | Pending | HITL suspension |
| ToolResult::suspended_with(name, message, ticket) | Pending | Suspension with ticket |

Predicates

  • is_success() -> bool
  • is_pending() -> bool
  • is_error() -> bool
  • to_json() -> Value

ToolError

Errors returned from validate_args or execute. Unlike a ToolResult with ToolStatus::Error (whose content is sent back to the LLM), a ToolError aborts the tool call.

pub enum ToolError {
    InvalidArguments(String),
    ExecutionFailed(String),
    Denied(String),
    NotFound(String),
    Internal(String),
}

ToolCallContext

Read-only context provided to a tool during execution.

pub struct ToolCallContext {
    pub call_id: String,
    pub tool_name: String,
    pub run_identity: RunIdentity,
    pub agent_spec: Arc<AgentSpec>,
    pub snapshot: Snapshot,
    pub activity_sink: Option<Arc<dyn EventSink>>,
}

Methods

/// Read a typed state key from the snapshot.
fn state<K: StateKey>(&self) -> Option<&K::Value>

/// Report an activity snapshot for this tool call.
async fn report_activity(&self, activity_type: &str, content: &str)

/// Report an incremental activity delta.
async fn report_activity_delta(&self, activity_type: &str, patch: Value)

/// Report structured tool call progress.
async fn report_progress(
    &self,
    status: ProgressStatus,
    message: Option<&str>,
    progress: Option<f64>,
)

Examples

Minimal tool

use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use serde_json::{Value, json};

struct Greet;

#[async_trait]
impl Tool for Greet {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("greet", "greet", "Greet a user by name")
            .with_parameters(json!({
                "type": "object",
                "properties": {
                    "name": { "type": "string" }
                },
                "required": ["name"]
            }))
    }

    async fn execute(
        &self,
        args: Value,
        _ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        let name = args["name"]
            .as_str()
            .ok_or_else(|| ToolError::InvalidArguments("name required".into()))?;
        Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
    }
}

Reading state from context

use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::state::StateKey;
use serde::{Serialize, Deserialize};
use serde_json::{Value, json};

// Assume a state key is defined elsewhere:
// struct UserPreferences;
// impl StateKey for UserPreferences { ... }

struct GetPreferences;

#[async_trait]
impl Tool for GetPreferences {
    fn descriptor(&self) -> ToolDescriptor {
        ToolDescriptor::new("get_prefs", "get_preferences", "Get user preferences")
    }

    async fn execute(
        &self,
        _args: Value,
        ctx: &ToolCallContext,
    ) -> Result<ToolOutput, ToolError> {
        // Read typed state through the snapshot
        // let prefs = ctx.state::<UserPreferences>().cloned().unwrap_or_default();
        // Ok(ToolResult::success("get_prefs", serde_json::to_value(&prefs).unwrap()).into())
        Ok(ToolResult::success("get_prefs", json!({})).into())
    }
}

Tool execution hooks

Every tool call passes through plugin hooks before and after execution. This allows plugins to intercept, observe, or modify tool call behavior without changing the tool itself.

Full lifecycle

LLM selects tool
  -> validate_args()
  -> BeforeToolExecute phase (plugins run hooks)
     Plugins may schedule ToolInterceptAction:
       Block    -> run terminates with reason
       Suspend  -> run pauses (HITL), tool not executed
       SetResult -> tool skipped, pre-built result used
  -> execute()                (only if not intercepted)
  -> AfterToolExecute phase   (plugins run hooks)
     Plugins observe the ToolResult and may modify state

BeforeToolExecute

Runs once per tool call, after argument validation. Plugins receive a PhaseContext containing the tool name, call ID, and validated arguments. Hooks return a StateCommand that may schedule ToolInterceptAction to block, suspend, or short-circuit the call.

When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.
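The priority rule can be modeled as picking the highest-ranked intercept from however many were scheduled. This is a standalone sketch; the real ToolInterceptPayload variants carry reasons, tickets, and results:

```rust
// Hypothetical model of intercept priority: Block > Suspend > SetResult.
// Variants are declared in ascending priority order so Ord matches the rule.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Intercept {
    SetResult, // lowest priority
    Suspend,
    Block,     // highest priority
}

// With the variants ordered by priority, the winning intercept is simply the max.
fn winning_intercept(scheduled: &[Intercept]) -> Option<Intercept> {
    scheduled.iter().copied().max()
}

fn main() {
    let scheduled = [Intercept::SetResult, Intercept::Block, Intercept::Suspend];
    assert_eq!(winning_intercept(&scheduled), Some(Intercept::Block));
    // No intercepts scheduled: the tool call proceeds normally.
    assert_eq!(winning_intercept(&[]), None);
}
```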

AfterToolExecute

Runs after execute() completes (or after an intercept produces a result). Plugins receive the PhaseContext with the ToolResult attached. Hooks can update state, emit events, or schedule actions for subsequent phases.

ToolCallStatus transitions

Each tool call tracks a ToolCallStatus through its lifecycle:

New -> Running -> Succeeded
                  Failed
                  Suspended -> Resuming -> Running -> ...
                  Cancelled

Terminal states (Succeeded, Failed, Cancelled) cannot transition further.
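The transition rules above can be sketched as a small validity check. This models the documented lifecycle only; the real ToolCallStatus type lives in awaken-contract:

```rust
// Hypothetical model of the documented ToolCallStatus lifecycle.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolCallStatus {
    New,
    Running,
    Succeeded,
    Failed,
    Suspended,
    Resuming,
    Cancelled,
}

fn is_terminal(s: ToolCallStatus) -> bool {
    matches!(
        s,
        ToolCallStatus::Succeeded | ToolCallStatus::Failed | ToolCallStatus::Cancelled
    )
}

// A transition is valid only along the documented edges.
fn can_transition(from: ToolCallStatus, to: ToolCallStatus) -> bool {
    use ToolCallStatus::*;
    match (from, to) {
        (New, Running) => true,
        (Running, Succeeded) | (Running, Failed) | (Running, Suspended) | (Running, Cancelled) => true,
        (Suspended, Resuming) => true,
        (Resuming, Running) => true,
        _ => false,
    }
}

fn main() {
    assert!(can_transition(ToolCallStatus::New, ToolCallStatus::Running));
    assert!(can_transition(ToolCallStatus::Suspended, ToolCallStatus::Resuming));
    // Terminal states never transition further.
    assert!(!can_transition(ToolCallStatus::Succeeded, ToolCallStatus::Running));
    assert!(is_terminal(ToolCallStatus::Cancelled));
}
```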

See Plugin Internals for intercept priority details and the full phase convergence loop.

Scheduled Actions

Scheduled actions are the primary mechanism for plugins and tools to request side-effects during a phase execution cycle. Any hook, tool, or external module can schedule an action via StateCommand::schedule_action::<A>(payload). The runtime dispatches the action to its registered handler during the EXECUTE stage of the target phase.

How it works

Hook / Tool                    Runtime
    |                            |
    |-- StateCommand ----------->|  (contains scheduled_actions)
    |   schedule_action::<A>(p)  |
    |                            |-- commit state updates
    |                            |-- dispatch to handler(A, p)
    |                            |      |
    |                            |      |-- handler returns StateCommand
    |                            |      |   (may schedule more actions)
    |                            |<-----'
    |                            |-- commit handler results

Scheduling from a hook

use awaken_runtime::agent::state::ExcludeTool;

async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
    let mut cmd = StateCommand::new();
    cmd.schedule_action::<ExcludeTool>("dangerous_tool".into())?;
    Ok(cmd)
}

Scheduling from a tool

use awaken_runtime::agent::state::AddContextMessage;
use awaken_contract::contract::context_message::ContextMessage;

async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
    let mut cmd = StateCommand::new();
    cmd.schedule_action::<AddContextMessage>(
        ContextMessage::system("my_tool.hint", "Remember to check the docs."),
    )?;
    Ok(ToolOutput::with_command(
        ToolResult::success("my_tool", json!({"ok": true})),
        cmd,
    ))
}

Core Actions (awaken-runtime)

These are always available. Registered by the internal LoopActionHandlersPlugin.

AddContextMessage

Key: runtime.add_context_message
Phase: BeforeInference
Payload: ContextMessage
Import: awaken_runtime::agent::state::AddContextMessage

Injects a context message into the LLM conversation for the current step. Messages can be persistent (survive across steps), ephemeral (one-shot), or throttled (cooldown-based).

Used by: skills plugin (skill catalog), reminder plugin (rule-based hints), deferred-tools plugin (deferred tool list), custom hooks.

cmd.schedule_action::<AddContextMessage>(
    ContextMessage::system_persistent("my_plugin.info", "Always verify inputs."),
)?;

SetInferenceOverride

Key: runtime.set_inference_override
Phase: BeforeInference
Payload: InferenceOverride
Import: awaken_runtime::agent::state::SetInferenceOverride

Overrides inference parameters (model, temperature, max_tokens, top_p) for the current step only. Multiple overrides are merged; last-writer-wins per field.

cmd.schedule_action::<SetInferenceOverride>(InferenceOverride {
    temperature: Some(0.0),  // force deterministic
    ..Default::default()
})?;
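Last-writer-wins per field means a later override only replaces the fields it actually sets. A standalone sketch of that merge (the field names mirror the documented InferenceOverride, but the struct here is illustrative):

```rust
// Illustrative stand-in for InferenceOverride: every field is optional.
#[derive(Clone, Default, Debug, PartialEq)]
struct Override {
    model: Option<String>,
    temperature: Option<f64>,
    max_tokens: Option<u32>,
    top_p: Option<f64>,
}

// Merge per field: a later Some(_) wins, a later None leaves the earlier value.
fn merge(base: Override, later: Override) -> Override {
    Override {
        model: later.model.or(base.model),
        temperature: later.temperature.or(base.temperature),
        max_tokens: later.max_tokens.or(base.max_tokens),
        top_p: later.top_p.or(base.top_p),
    }
}

fn main() {
    let a = Override { temperature: Some(0.7), max_tokens: Some(1024), ..Default::default() };
    let b = Override { temperature: Some(0.0), ..Default::default() };
    let merged = merge(a, b);
    assert_eq!(merged.temperature, Some(0.0)); // later writer wins
    assert_eq!(merged.max_tokens, Some(1024)); // untouched field survives
}
```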

ExcludeTool

Key: runtime.exclude_tool
Phase: BeforeInference
Payload: String (tool ID)
Import: awaken_runtime::agent::state::ExcludeTool

Removes a tool from the set offered to the LLM for the current step. Multiple exclusions are additive.

Used by: permission plugin (unconditionally denied tools), deferred-tools plugin (deferred tools replaced by ToolSearch).

cmd.schedule_action::<ExcludeTool>("rm".into())?;

IncludeOnlyTools

Key: runtime.include_only_tools
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_runtime::agent::state::IncludeOnlyTools

Restricts the tool set to only the listed IDs. Multiple IncludeOnlyTools actions are unioned. Combined with ExcludeTool, exclusions are applied after the include-only filter.

cmd.schedule_action::<IncludeOnlyTools>(vec!["search".into(), "calculator".into()])?;
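The combination rule (include-only lists unioned, exclusions applied afterwards) can be sketched with plain sets. The names here are illustrative, not the runtime's internals:

```rust
use std::collections::HashSet;

// Model: union all IncludeOnlyTools lists, then apply ExcludeTool entries on top.
fn effective_tools(
    all_tools: &[&str],
    include_only: &[Vec<&str>], // each entry is one IncludeOnlyTools action
    excluded: &[&str],          // accumulated ExcludeTool actions
) -> Vec<String> {
    let allowed: Option<HashSet<&str>> = if include_only.is_empty() {
        None // no include-only filter scheduled: all tools pass
    } else {
        Some(include_only.iter().flatten().copied().collect())
    };
    let excluded: HashSet<&str> = excluded.iter().copied().collect();
    all_tools
        .iter()
        .filter(|t| allowed.as_ref().map_or(true, |a| a.contains(*t)))
        .filter(|t| !excluded.contains(*t))
        .map(|t| t.to_string())
        .collect()
}

fn main() {
    let all = ["search", "calculator", "rm", "browser"];
    // Two include-only actions are unioned; "rm" is then excluded on top.
    let tools = effective_tools(&all, &[vec!["search", "rm"], vec!["calculator"]], &["rm"]);
    assert_eq!(tools, vec!["search", "calculator"]);
}
```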

ToolInterceptAction

Key: tool_intercept
Phase: BeforeToolExecute
Payload: ToolInterceptPayload
Import: awaken_contract::contract::tool_intercept::ToolInterceptAction

Intercepts a tool call before execution. Three outcomes:

| Variant | Effect |
|---|---|
| Block { reason } | Tool execution blocked, run terminates with the reason. |
| Suspend(SuspendTicket) | Tool execution deferred; run pauses awaiting external decision (HITL). |
| SetResult(ToolResult) | Tool execution short-circuited with a pre-built result. |

When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.

Used by: permission plugin (ask/deny policies).

use awaken_contract::contract::tool_intercept::{ToolInterceptAction, ToolInterceptPayload};

cmd.schedule_action::<ToolInterceptAction>(
    ToolInterceptPayload::Block { reason: "Tool is disabled".into() },
)?;

Deferred-Tools Actions (awaken-ext-deferred-tools)

Available when the deferred-tools plugin is active.

DeferToolAction

Key: deferred_tools.defer
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_ext_deferred_tools::state::DeferToolAction

Moves tools from eager to deferred mode. Deferred tools are excluded from the LLM tool set and made available via ToolSearch instead, reducing prompt token usage.

The handler updates DeferralState, setting each tool’s mode to Deferred.

cmd.schedule_action::<DeferToolAction>(vec!["rarely_used_tool".into()])?;

PromoteToolAction

Key: deferred_tools.promote
Phase: BeforeInference
Payload: Vec<String> (tool IDs)
Import: awaken_ext_deferred_tools::state::PromoteToolAction

Moves tools from deferred to eager mode. Promoted tools are included in the LLM tool set for subsequent steps.

The handler updates DeferralState, setting each tool’s mode to Eager.

Typically triggered automatically when ToolSearch returns results, but can be scheduled manually by any plugin or tool.

cmd.schedule_action::<PromoteToolAction>(vec!["needed_tool".into()])?;

Plugin Action Usage Matrix

Which plugins schedule which actions:

| Plugin | AddContext | SetOverride | Exclude | IncludeOnly | Intercept | Defer | Promote |
|---|---|---|---|---|---|---|---|
| permission | | | X | | X | | |
| skills | X | | | | | | |
| reminder | X | | | | | | |
| deferred-tools | X | | X | | | X | X |
| observability | | | | | | | |
| mcp | | | | | | | |
| generative-ui | | | | | | | |

Defining Custom Actions

Plugins can define their own actions by implementing ScheduledActionSpec and registering a handler via PluginRegistrar::register_scheduled_action.

use awaken_contract::model::{Phase, ScheduledActionSpec};

pub struct MyCustomAction;

impl ScheduledActionSpec for MyCustomAction {
    const KEY: &'static str = "my_plugin.custom_action";
    const PHASE: Phase = Phase::BeforeInference;
    type Payload = MyPayload;
}

Then in your plugin’s register():

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_scheduled_action::<MyCustomAction, _>(MyHandler)?;
    Ok(())
}

Other plugins and tools can then schedule your action:

cmd.schedule_action::<MyCustomAction>(my_payload)?;

Convergence and cascading

Scheduled actions execute within the phase convergence loop. After each round of action dispatch, the runtime checks whether new actions were produced. If so, the loop repeats to process them.

How the loop works

Phase EXECUTE stage:
  round 1: dispatch queued actions -> handlers return StateCommands
           commit state, collect newly scheduled actions
  round 2: dispatch new actions    -> handlers may schedule more
           ...
  round N: no new actions          -> phase converges, loop exits

An action handler can schedule new actions for the same phase, which causes another round. This enables cascading behaviors (e.g., a handler adds a context message, which triggers a filter action from another plugin).

Limits

The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If actions are still being produced after 16 rounds, the runtime returns a StateError::PhaseRunLoopExceeded error with the phase name and round count. This prevents infinite loops from misconfigured or recursive handlers.
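The bounded convergence loop can be modeled in a few lines. This is a self-contained sketch, not the runtime's code; `run_rounds`, `LoopError`, and the closure-based handler are illustrative stand-ins for the action-dispatch machinery:

```rust
// Hypothetical sketch of a bounded convergence loop: each round drains the
// queue, handlers may enqueue follow-up actions, and exceeding the round
// limit is an error (mirroring StateError::PhaseRunLoopExceeded).

#[derive(Debug, PartialEq)]
pub enum LoopError {
    RoundsExceeded { rounds: usize },
}

/// Dispatch actions round by round. `handle` returns follow-up actions.
pub fn run_rounds<A>(
    mut queue: Vec<A>,
    max_rounds: usize,
    mut handle: impl FnMut(A) -> Vec<A>,
) -> Result<usize, LoopError> {
    let mut rounds = 0;
    while !queue.is_empty() {
        if rounds == max_rounds {
            return Err(LoopError::RoundsExceeded { rounds });
        }
        rounds += 1;
        // Drain this round's queue; collect newly scheduled actions.
        let mut next = Vec::new();
        for action in queue.drain(..) {
            next.extend(handle(action));
        }
        queue = next;
    }
    Ok(rounds)
}

fn main() {
    // A cascade of depth 3 converges in 3 rounds.
    let rounds = run_rounds(vec![3u32], 16, |n| if n > 1 { vec![n - 1] } else { vec![] });
    println!("{rounds:?}");
}
```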

Failed actions

When an action handler returns an error, the action is not retried. Instead, it is recorded in the FailedScheduledActions state key, which holds a list of FailedScheduledAction entries (action key, payload, and error message). Plugins or tests can inspect this key to detect handler failures.

let failed = store.read::<FailedScheduledActions>().unwrap_or_default();
assert!(failed.is_empty(), "expected no failed actions");

See Plugin Internals for the full convergence loop description and phase execution model.

Effects

Effects are typed, fire-and-forget side-effect events. Unlike scheduled actions (which execute within a phase convergence loop and can cascade), effects are dispatched after commit and are terminal – handlers cannot produce new StateCommands, actions, or effects.

Typical use cases: audit logging, external webhook calls, metric emission, notification delivery.

EffectSpec trait

Every effect type implements EffectSpec:

pub trait EffectSpec: 'static + Send + Sync {
    /// Unique string identifier for this effect kind.
    const KEY: &'static str;

    /// The payload carried by the effect.
    /// Must be serializable so the runtime can store it as JSON internally.
    type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}

Crate path: awaken::model::EffectSpec (re-exported from awaken-contract)

KEY must be globally unique across all registered effects. Convention: "<plugin>.<effect_name>", e.g. "audit.record".

Emitting effects

Call StateCommand::emit::<E>(payload) from any hook or action handler:

use awaken::{StateCommand, StateError};

async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
    let mut cmd = StateCommand::new();
    cmd.emit::<AuditEffect>(AuditPayload {
        action: "user_login".into(),
        actor: "agent-1".into(),
    })?;
    Ok(cmd)
}

Effects are collected in StateCommand::effects and dispatched only after the command’s state mutations are committed. Tools can also emit effects by including them in the StateCommand returned alongside a ToolResult.

TypedEffect wrapper

TypedEffect is the runtime’s type-erased envelope for effects:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct TypedEffect {
    pub key: String,
    pub payload: JsonValue,
}

Two methods bridge between the typed and erased worlds:

  • TypedEffect::from_spec::<E>(payload) – serializes a typed payload into a TypedEffect. Called internally by StateCommand::emit.
  • TypedEffect::decode::<E>() – deserializes the JSON payload back into the concrete E::Payload type.

You rarely need to use TypedEffect directly; StateCommand::emit and the handler trait handle serialization transparently.

Registering effect handlers

Effect handlers implement TypedEffectHandler<E>:

#[async_trait]
pub trait TypedEffectHandler<E>: Send + Sync + 'static
where
    E: EffectSpec,
{
    async fn handle_typed(
        &self,
        payload: E::Payload,
        snapshot: &Snapshot,
    ) -> Result<(), String>;
}

Key points:

  • The handler receives the post-commit Snapshot, so it sees the state that includes the mutations from the command that emitted the effect.
  • The return type is Result<(), String> – not Result<(), StateError>. Handlers report errors as plain strings; the runtime logs them but does not propagate them.

Register a handler in your plugin’s register() method:

fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
    Ok(())
}

Duplicate registrations (same E::KEY) produce StateError::EffectHandlerAlreadyRegistered.

Dispatch lifecycle

  1. Collect – A hook, action handler, or tool calls cmd.emit::<E>(payload). The TypedEffect is appended to StateCommand::effects.

  2. Validate – When submit_command processes the command, every effect key is checked against the registered handlers. If any key has no registered handler, the command is rejected with StateError::UnknownEffectHandler before any state is committed. This is a fail-fast guarantee.

  3. Commit – State mutations (MutationBatch) are committed to the store.

  4. Dispatch – After a successful commit, each effect is dispatched to its handler via handle_typed(payload, snapshot). The snapshot reflects post-commit state.

  5. Error handling – Handler failures are logged and counted in EffectDispatchReport but do not roll back the commit or block subsequent effects. The runtime continues dispatching remaining effects.

Hook / Tool                         Runtime
    |                                 |
    |-- StateCommand (with effects) ->|
    |                                 |-- validate all effect keys
    |                                 |   (fail-fast if unknown)
    |                                 |-- commit state mutations
    |                                 |-- dispatch effects sequentially
    |                                 |      handler(payload, snapshot)
    |                                 |-- return SubmitCommandReport
    |<--------------------------------|
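The fail-fast ordering can be sketched in miniature. Everything here (`Command`, `submit`, the string-keyed store) is hypothetical; the sketch only illustrates that validation rejects the whole command before any mutation lands:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical model of the dispatch lifecycle: validate every effect key
// first (fail fast), commit mutations, then dispatch effects sequentially.
// Names (`Command`, `submit`) are illustrative, not the Awaken API.

pub struct Command {
    pub mutations: Vec<(String, String)>,
    pub effect_keys: Vec<String>,
}

pub fn submit(
    store: &mut HashMap<String, String>,
    handlers: &HashSet<String>,
    cmd: Command,
) -> Result<usize, String> {
    // 1. Validate: reject before any state is committed.
    for key in &cmd.effect_keys {
        if !handlers.contains(key) {
            return Err(format!("unknown effect handler: {key}"));
        }
    }
    // 2. Commit all mutations.
    for (k, v) in cmd.mutations {
        store.insert(k, v);
    }
    // 3. Dispatch: here we just report how many effects were delivered.
    Ok(cmd.effect_keys.len())
}

fn main() {
    let mut store = HashMap::new();
    let handlers: HashSet<String> = ["audit.record".to_string()].into();
    let cmd = Command {
        mutations: vec![("k".into(), "v".into())],
        effect_keys: vec!["audit.record".into()],
    };
    println!("{:?}", submit(&mut store, &handlers, cmd));
}
```

Note that a rejected command leaves the store untouched, matching the fail-fast guarantee described in step 2 above.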

Worked example

Define an effect:

use awaken::EffectSpec;
use serde::{Deserialize, Serialize};

/// Payload for audit log entries.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditPayload {
    pub action: String,
    pub actor: String,
}

/// Effect spec for audit logging.
pub struct AuditEffect;

impl EffectSpec for AuditEffect {
    const KEY: &'static str = "audit.record";
    type Payload = AuditPayload;
}

Emit it from a phase hook:

use async_trait::async_trait;
use awaken::{PhaseContext, PhaseHook, StateCommand, StateError};

pub struct AuditHook;

#[async_trait]
impl PhaseHook for AuditHook {
    async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
        let mut cmd = StateCommand::new();
        cmd.emit::<AuditEffect>(AuditPayload {
            action: "phase_entered".into(),
            actor: "system".into(),
        })?;
        Ok(cmd)
    }
}

Handle the effect:

use async_trait::async_trait;
use awaken::{Snapshot, TypedEffectHandler};

pub struct AuditEffectHandler;

#[async_trait]
impl TypedEffectHandler<AuditEffect> for AuditEffectHandler {
    async fn handle_typed(
        &self,
        payload: AuditPayload,
        _snapshot: &Snapshot,
    ) -> Result<(), String> {
        tracing::info!(
            action = %payload.action,
            actor = %payload.actor,
            "audit effect dispatched"
        );
        // In production: write to external audit store, send webhook, etc.
        Ok(())
    }
}

Wire it all together in a plugin:

use awaken::{Plugin, PluginDescriptor, PluginRegistrar, StateError};
use awaken::model::Phase;

pub struct AuditPlugin;

impl Plugin for AuditPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor::new("audit", "Audit logging via effects")
    }

    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
        r.register_phase_hook("audit", Phase::RunStart, AuditHook)?;
        Ok(())
    }
}

Effects vs Scheduled Actions

| | Effects | Scheduled Actions |
|---|---|---|
| Timing | Post-commit | Within phase convergence loop |
| Can cascade | No | Yes (handlers return StateCommand) |
| Can produce StateCommand | No | Yes |
| Failure handling | Logged, non-blocking | Error propagated to caller |
| State visibility | Post-commit snapshot | Pre-commit context |
| Use case | External I/O, logging, metrics | Internal control flow, state manipulation |

Choose effects when you need to trigger external side-effects that should not influence the agent’s state convergence. Choose scheduled actions when the handler needs to mutate state or schedule further work.

Events

The agent loop emits AgentEvent values as it executes. Events are streamed to clients via SSE and consumed by protocol encoders.

AgentEvent

All variants are tagged with event_type in their JSON serialization (#[serde(tag = "event_type", rename_all = "snake_case")]).

pub enum AgentEvent {
    RunStart {
        thread_id: String,
        run_id: String,
        parent_run_id: Option<String>,    // omitted when None
    },

    RunFinish {
        thread_id: String,
        run_id: String,
        result: Option<Value>,            // omitted when None
        termination: TerminationReason,
    },

    TextDelta { delta: String },

    ReasoningDelta { delta: String },

    ReasoningEncryptedValue { encrypted_value: String },

    ToolCallStart { id: String, name: String },

    ToolCallDelta { id: String, args_delta: String },

    ToolCallReady {
        id: String,
        name: String,
        arguments: Value,
    },

    ToolCallDone {
        id: String,
        message_id: String,
        result: ToolResult,
        outcome: ToolCallOutcome,
    },

    ToolCallStreamDelta {
        id: String,
        name: String,
        delta: String,
    },

    ToolCallResumed { target_id: String, result: Value },

    MessagesSnapshot { messages: Vec<Value> },

    ActivitySnapshot {
        message_id: String,
        activity_type: String,
        content: Value,
        replace: Option<bool>,            // omitted when None
    },

    ActivityDelta {
        message_id: String,
        activity_type: String,
        patch: Vec<Value>,
    },

    StepStart { message_id: String },

    StepEnd,

    InferenceComplete {
        model: String,
        usage: Option<TokenUsage>,        // omitted when None
        duration_ms: u64,
    },

    StateSnapshot { snapshot: Value },

    StateDelta { delta: Vec<Value> },

    Error {
        message: String,
        code: Option<String>,             // omitted when None
    },
}

Crate path: awaken::contract::event::AgentEvent
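Given the serde attributes above, a TextDelta event would serialize along these lines (illustrative values):

```json
{ "event_type": "text_delta", "delta": "Hello" }
```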

Helper

impl AgentEvent {
    /// Extract the response text from a RunFinish result value.
    pub fn extract_response(result: &Option<Value>) -> String
}

StreamEvent

Wire-format envelope that wraps an AgentEvent with sequencing metadata. Sent over SSE as JSON.

pub struct StreamEvent {
    /// Monotonically increasing sequence number within a run.
    pub seq: u64,
    /// ISO 8601 timestamp.
    pub timestamp: String,
    /// The wrapped agent event (flattened via #[serde(flatten)]).
    pub event: AgentEvent,
}

Constructor

fn new(seq: u64, timestamp: impl Into<String>, event: AgentEvent) -> Self

RunInput

Input to start or resume a run.

#[serde(tag = "type", rename_all = "snake_case")]
pub enum RunInput {
    /// A new user message to process.
    UserMessage { text: String },
    /// Resume a suspended run with a decision.
    ResumeDecision {
        tool_call_id: String,
        action: ResumeDecisionAction,
        payload: Value,       // omitted when null
    },
}

RunOutput

Type alias for the event stream returned by a run:

pub type RunOutput = futures::stream::BoxStream<'static, AgentEvent>;

TerminationReason

Why a run terminated. Serialized as { "type": "...", "value": ... }.

pub struct StoppedReason { pub code: String, pub detail: Option<String> }

pub enum TerminationReason {
    NaturalEnd,
    BehaviorRequested,
    Stopped(StoppedReason),
    Cancelled,
    Blocked(String),
    Suspended,
    Error(String),
}

ToolCallOutcome

pub enum ToolCallOutcome {
    Succeeded,
    Failed,
    Suspended,
}

TokenUsage

pub struct TokenUsage {
    pub prompt_tokens: Option<i32>,
    pub completion_tokens: Option<i32>,
    pub total_tokens: Option<i32>,
    pub cache_read_tokens: Option<i32>,
    pub cache_creation_tokens: Option<i32>,
    pub thinking_tokens: Option<i32>,
}

All fields are omitted from JSON when None. TokenUsage::default() produces all None values.

HTTP API

The awaken-server crate (feature flag server) exposes an HTTP API via Axum. Most responses are JSON. Streaming endpoints use Server-Sent Events (SSE).

This page mirrors the current route tree in crates/awaken-server/src/routes.rs and crates/awaken-server/src/config_routes.rs.

Health and metrics

| Method | Path | Description |
|---|---|---|
| GET | /health | Readiness probe. Checks store connectivity and returns 200 or 503 |
| GET | /health/live | Liveness probe. Always returns 200 OK |
| GET | /metrics | Prometheus scrape endpoint |

Threads

| Method | Path | Description |
|---|---|---|
| GET | /v1/threads | List thread IDs |
| POST | /v1/threads | Create a thread. Body: { "title": "..." } |
| GET | /v1/threads/summaries | List thread summaries |
| GET | /v1/threads/:id | Get a thread by ID |
| PATCH | /v1/threads/:id | Update thread metadata |
| DELETE | /v1/threads/:id | Delete a thread |
| POST | /v1/threads/:id/cancel | Cancel a specific queued or running job addressed by this thread ID. Returns cancel_requested |
| POST | /v1/threads/:id/decision | Submit a HITL decision for a waiting run on this thread |
| POST | /v1/threads/:id/interrupt | Interrupt the thread: bumps the thread generation, supersedes all pending queued jobs, and cancels the active run. Returns interrupt_requested with superseded_jobs count. Unlike /cancel, this performs a clean-slate interrupt via mailbox.interrupt() |
| PATCH | /v1/threads/:id/metadata | Alias for thread metadata updates |
| GET | /v1/threads/:id/messages | List thread messages |
| POST | /v1/threads/:id/messages | Submit messages as a background run on this thread |
| POST | /v1/threads/:id/mailbox | Push a message payload to the thread mailbox |
| GET | /v1/threads/:id/mailbox | List mailbox jobs for the thread |
| GET | /v1/threads/:id/runs | List runs for the thread |
| GET | /v1/threads/:id/runs/latest | Get the latest run for the thread |

Runs

| Method | Path | Description |
|---|---|---|
| GET | /v1/runs | List runs |
| POST | /v1/runs | Start a run and stream events over SSE |
| GET | /v1/runs/:id | Get a run record |
| POST | /v1/runs/:id/inputs | Submit follow-up input messages as a background run on the same thread |
| POST | /v1/runs/:id/cancel | Cancel a run by run ID |
| POST | /v1/runs/:id/decision | Submit a HITL decision by run ID |

Config and capabilities

These endpoints are exposed by config_routes(). Read and schema routes require AppState to be constructed with a config store. Mutation routes additionally require a config runtime manager so writes can validate and publish a new registry snapshot. Without the required config wiring, these routes return 400 with the message "config management API not enabled".

| Method | Path | Description |
|---|---|---|
| GET | /v1/capabilities | List registered agents, tools, plugins, models, providers, and config namespaces |
| GET | /v1/config/:namespace | List entries in a config namespace |
| POST | /v1/config/:namespace | Create an entry; the body must contain "id" |
| GET | /v1/config/:namespace/:id | Get one config entry |
| PUT | /v1/config/:namespace/:id | Replace a config entry |
| DELETE | /v1/config/:namespace/:id | Delete a config entry |
| GET | /v1/config/:namespace/$schema | Return the JSON Schema for a namespace |
| GET | /v1/agents | Convenience alias for /v1/config/agents |
| GET | /v1/agents/:id | Convenience alias for /v1/config/agents/:id |

Current built-in namespaces:

  • agents
  • models
  • providers
  • mcp-servers

AI SDK v6 routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/ai-sdk/chat | Start a chat run and stream protocol-encoded events |
| POST | /v1/ai-sdk/threads/:thread_id/runs | Start a thread-scoped AI SDK run |
| POST | /v1/ai-sdk/agents/:agent_id/runs | Start an agent-scoped AI SDK run |
| GET | /v1/ai-sdk/chat/:thread_id/stream | Resume an SSE stream by thread ID |
| GET | /v1/ai-sdk/threads/:thread_id/stream | Alias for stream resume by thread ID |
| GET | /v1/ai-sdk/threads/:thread_id/messages | List thread messages |
| POST | /v1/ai-sdk/threads/:thread_id/cancel | Cancel the active or queued run on a thread |
| POST | /v1/ai-sdk/threads/:thread_id/interrupt | Interrupt a thread (bump generation, supersede pending jobs, cancel active run) |

AG-UI routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/ag-ui/run | Start an AG-UI run and stream AG-UI events |
| POST | /v1/ag-ui/threads/:thread_id/runs | Start a thread-scoped AG-UI run |
| POST | /v1/ag-ui/agents/:agent_id/runs | Start an agent-scoped AG-UI run |
| POST | /v1/ag-ui/threads/:thread_id/interrupt | Interrupt a thread |
| GET | /v1/ag-ui/threads/:id/messages | List thread messages |

A2A routes

| Method | Path | Description |
|---|---|---|
| GET | /.well-known/agent-card.json | Get the public/default agent card |
| POST | /v1/a2a/message:send | Send a message to the public/default A2A agent |
| POST | /v1/a2a/message:stream | Streaming send over SSE |
| GET | /v1/a2a/tasks | List A2A tasks |
| GET | /v1/a2a/tasks/:task_id | Get task status |
| POST | /v1/a2a/tasks/:task_id:cancel | Cancel a task |
| POST | /v1/a2a/tasks/:task_id:subscribe | Subscribe to task updates over SSE |
| POST | /v1/a2a/tasks/:task_id/pushNotificationConfigs | Create a push notification config |
| GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs | List push notification configs |
| GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Get a push notification config |
| DELETE | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Delete a push notification config |
| GET | /v1/a2a/extendedAgentCard | Get the extended agent card; returns 501 unless enabled |
| POST | /v1/a2a/:tenant/message:send | Send a message to a tenant-scoped agent |
| POST | /v1/a2a/:tenant/message:stream | Tenant-scoped streaming send |
| GET | /v1/a2a/:tenant/tasks | List tasks for a tenant-scoped agent |
| GET | /v1/a2a/:tenant/tasks/:task_id | Get tenant-scoped task status |
| POST | /v1/a2a/:tenant/tasks/:task_id:cancel | Cancel a tenant-scoped task |
| POST | /v1/a2a/:tenant/tasks/:task_id:subscribe | Subscribe to tenant-scoped task updates |
| GET | /v1/a2a/:tenant/extendedAgentCard | Get the tenant-scoped extended agent card |

MCP HTTP routes

| Method | Path | Description |
|---|---|---|
| POST | /v1/mcp | MCP JSON-RPC request/response endpoint |
| GET | /v1/mcp | Reserved for MCP server-initiated SSE; currently returns 405 |

Common query parameters

  • offset — number of items to skip
  • limit — maximum items to return, clamped to 1..=200
  • cursor — message-history pagination cursor; when provided it takes precedence over offset, and history responses return next_cursor
  • status — run filter: running, waiting, or done
  • visibility — message filter: omit for external-only, set to all to include internal messages
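The clamping and precedence rules above are easy to sketch. The helper names here are hypothetical; only the 1..=200 clamp and the cursor-over-offset precedence come from this page:

```rust
// Illustrative sketch of the documented pagination rules: `limit` is
// clamped to 1..=200, and `cursor` takes precedence over `offset`.

pub fn effective_limit(requested: Option<usize>, default: usize) -> usize {
    requested.unwrap_or(default).clamp(1, 200)
}

pub fn page_query(offset: usize, limit: usize, cursor: Option<&str>) -> String {
    match cursor {
        // Cursor pagination wins when a cursor is supplied.
        Some(c) => format!("?cursor={c}&limit={limit}"),
        None => format!("?offset={offset}&limit={limit}"),
    }
}

fn main() {
    // A request for limit=500 is clamped down to 200.
    println!("{}", page_query(40, effective_limit(Some(500), 50), None));
}
```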

Error format

Most route groups return:

{ "error": "human-readable message" }

MCP routes return JSON-RPC error objects instead of the generic shape above.

Config

AgentSpec

The serializable agent definition. Can be loaded from JSON/YAML or constructed programmatically via builder methods.

pub struct AgentSpec {
    pub id: String,
    pub model: String,
    pub system_prompt: String,
    pub max_rounds: usize,                          // default: 16
    pub max_continuation_retries: usize,            // default: 2
    pub context_policy: Option<ContextWindowPolicy>,
    pub plugin_ids: Vec<String>,
    pub active_hook_filter: HashSet<String>,
    pub allowed_tools: Option<Vec<String>>,
    pub excluded_tools: Option<Vec<String>>,
    pub endpoint: Option<RemoteEndpoint>,
    pub delegates: Vec<String>,
    pub sections: HashMap<String, Value>,
    pub registry: Option<String>,
}

Crate path: awaken::registry_spec::AgentSpec (re-exported at awaken::AgentSpec)

Builder methods

AgentSpec::new(id) -> Self
    .with_model(model) -> Self
    .with_system_prompt(prompt) -> Self
    .with_max_rounds(n) -> Self
    .with_hook_filter(plugin_id) -> Self
    .with_config::<K>(config) -> Result<Self, StateError>
    .with_delegate(agent_id) -> Self
    .with_endpoint(endpoint) -> Self
    .with_section(key, value: Value) -> Self

Typed config access

/// Read a typed plugin config section. Returns default if missing.
fn config<K: PluginConfigKey>(&self) -> Result<K::Config, StateError>

/// Set a typed plugin config section.
fn set_config<K: PluginConfigKey>(&mut self, config: K::Config) -> Result<(), StateError>

ContextWindowPolicy

Controls context window management and auto-compaction.

pub struct ContextWindowPolicy {
    pub max_context_tokens: usize,          // default: 200_000
    pub max_output_tokens: usize,           // default: 16_384
    pub min_recent_messages: usize,         // default: 10
    pub enable_prompt_cache: bool,          // default: true
    pub autocompact_threshold: Option<usize>,  // default: None
    pub compaction_mode: ContextCompactionMode, // default: KeepRecentRawSuffix
    pub compaction_raw_suffix_messages: usize,  // default: 2
}
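As an illustration of how autocompact_threshold plausibly gates compaction. The trigger condition here is an assumption about the policy's semantics, not the engine's actual code:

```rust
// Illustrative stand-in for ContextWindowPolicy (not awaken's type).
// Assumption: compaction triggers only when a threshold is configured
// and the current context token count exceeds it.

pub struct Policy {
    pub max_context_tokens: usize,
    pub autocompact_threshold: Option<usize>,
}

pub fn should_compact(policy: &Policy, used_tokens: usize) -> bool {
    match policy.autocompact_threshold {
        Some(threshold) => used_tokens > threshold,
        // No threshold configured: never auto-compact.
        None => false,
    }
}

fn main() {
    let policy = Policy { max_context_tokens: 200_000, autocompact_threshold: Some(150_000) };
    println!("{}", should_compact(&policy, 180_000));
}
```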

ContextCompactionMode

pub enum ContextCompactionMode {
    KeepRecentRawSuffix,       // Keep N recent messages raw, compact the rest
    CompactToSafeFrontier,     // Compact everything up to the safe frontier
}

InferenceOverride

Per-inference parameter override. All fields are Option; None means “use agent-level default”. Multiple plugins can emit overrides; fields merge with last-wins semantics.

pub struct InferenceOverride {
    pub model: Option<String>,
    pub fallback_models: Option<Vec<String>>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
    pub top_p: Option<f64>,
    pub reasoning_effort: Option<ReasoningEffort>,
}

Methods

fn is_empty(&self) -> bool
fn merge(&mut self, other: InferenceOverride)
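The field-wise last-wins merge can be modeled with a simplified stand-in struct (illustrative, not awaken's definition):

```rust
// Self-contained model of last-wins merge semantics: any field set in
// `other` replaces the corresponding field in `self`; a `None` field in
// `other` leaves the existing value alone. (Simplified stand-in for
// awaken's InferenceOverride, for illustration only.)

#[derive(Debug, Default, Clone, PartialEq)]
pub struct Override {
    pub model: Option<String>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
}

impl Override {
    pub fn is_empty(&self) -> bool {
        self.model.is_none() && self.temperature.is_none() && self.max_tokens.is_none()
    }

    pub fn merge(&mut self, other: Override) {
        if other.model.is_some() { self.model = other.model; }
        if other.temperature.is_some() { self.temperature = other.temperature; }
        if other.max_tokens.is_some() { self.max_tokens = other.max_tokens; }
    }
}

fn main() {
    let mut base = Override { model: Some("a".into()), temperature: Some(0.2), ..Default::default() };
    // A later override of temperature wins; the untouched model survives.
    base.merge(Override { temperature: Some(0.7), ..Default::default() });
    println!("{base:?}");
}
```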

ReasoningEffort

pub enum ReasoningEffort {
    None,
    Low,
    Medium,
    High,
    Max,
    Budget(u32),
}

PluginConfigKey trait

Binds a string key to a typed configuration struct at compile time.

pub trait PluginConfigKey: 'static + Send + Sync {
    const KEY: &'static str;
    type Config: Default + Clone + Serialize + DeserializeOwned
        + schemars::JsonSchema + Send + Sync + 'static;
}

Implementations register typed sections in AgentSpec::sections. Plugins read their configuration via agent_spec.config::<MyConfigKey>().

RemoteEndpoint

Configuration for agents running on external backends. Today Awaken ships the "a2a" backend; backend-specific settings live under options.

pub struct RemoteEndpoint {
    pub backend: String,
    pub base_url: String,
    pub auth: Option<RemoteAuth>,
    pub target: Option<String>,
    pub timeout_ms: u64,               // default: 300_000
    pub options: BTreeMap<String, Value>,
}

pub struct RemoteAuth {
    pub r#type: String,
    // backend-specific auth fields, e.g. { "token": "..." } for bearer
}

ServerConfig

HTTP server configuration. Used when the server feature is enabled.

pub struct ServerConfig {
    pub address: String,                   // default: "0.0.0.0:3000"
    pub sse_buffer_size: usize,            // default: 64
    pub replay_buffer_capacity: usize,     // default: 1024
    pub shutdown: ShutdownConfig,
    pub max_concurrent_requests: usize,    // default: 100
    pub a2a_extended_card_bearer_token: Option<String>,
}

pub struct ShutdownConfig {
    pub timeout_secs: u64,                 // default: 30
}

Crate path: awaken_server::app::ServerConfig

| Field | Type | Default | Description |
|---|---|---|---|
| address | String | "0.0.0.0:3000" | Socket address the server binds to |
| sse_buffer_size | usize | 64 | Maximum SSE channel buffer size per connection |
| replay_buffer_capacity | usize | 1024 | Maximum SSE frames buffered per run for reconnection replay |
| max_concurrent_requests | usize | 100 | Maximum in-flight requests; excess requests receive 503 |
| a2a_extended_card_bearer_token | Option<String> | None | Enables authenticated GET /v1/a2a/extendedAgentCard when set |
| shutdown.timeout_secs | u64 | 30 | Seconds to wait for in-flight requests to drain before force-exiting |

MailboxConfig

Configuration for the persistent run queue (mailbox). Controls lease timing, sweep/GC intervals, and retry behavior for failed jobs.

pub struct MailboxConfig {
    pub lease_ms: u64,                          // default: 30_000
    pub suspended_lease_ms: u64,                // default: 600_000
    pub lease_renewal_interval: Duration,       // default: 10s
    pub sweep_interval: Duration,               // default: 30s
    pub gc_interval: Duration,                  // default: 60s
    pub gc_ttl: Duration,                       // default: 24h
    pub default_max_attempts: u32,              // default: 5
    pub default_retry_delay_ms: u64,            // default: 250
    pub max_retry_delay_ms: u64,                // default: 30_000
}

Crate path: awaken_server::mailbox::MailboxConfig

| Field | Type | Default | Description |
|---|---|---|---|
| lease_ms | u64 | 30_000 | Lease duration in milliseconds for active runs |
| suspended_lease_ms | u64 | 600_000 | Lease duration in milliseconds for suspended runs awaiting human input |
| lease_renewal_interval | Duration | 10s | How often the worker renews its lease on a running job |
| sweep_interval | Duration | 30s | How often to scan for expired leases and reclaim orphaned jobs |
| gc_interval | Duration | 60s | How often to run garbage collection for terminal (completed/failed) jobs |
| gc_ttl | Duration | 24h | How long terminal jobs are retained before purging |
| default_max_attempts | u32 | 5 | Maximum delivery attempts before a job is dead-lettered |
| default_retry_delay_ms | u64 | 250 | Base retry delay in milliseconds between attempts |
| max_retry_delay_ms | u64 | 30_000 | Maximum retry delay in milliseconds for exponential backoff |

LlmRetryPolicy

Policy for retrying failed LLM inference calls with exponential backoff and optional model fallback. Can be set per-agent via the "retry" section in AgentSpec.

pub struct LlmRetryPolicy {
    pub max_retries: u32,              // default: 2
    pub fallback_models: Vec<String>,  // default: []
    pub backoff_base_ms: u64,          // default: 500
}

Crate path: awaken_runtime::engine::retry::LlmRetryPolicy

| Field | Type | Default | Description |
|---|---|---|---|
| max_retries | u32 | 2 | Maximum retry attempts after the initial call (0 = no retry) |
| fallback_models | Vec<String> | [] | Model names to try in order after the primary model exhausts retries |
| backoff_base_ms | u64 | 500 | Base delay in milliseconds for exponential backoff; actual delay = min(base * 2^attempt, 8000ms). Set to 0 to disable backoff |
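The documented formula is easy to check numerically. `backoff_delay_ms` is an illustrative helper, with the 8000ms cap taken from the description above:

```rust
// The documented backoff: delay = min(base * 2^attempt, 8000) ms,
// with a base of 0 disabling backoff entirely.

pub fn backoff_delay_ms(base_ms: u64, attempt: u32) -> u64 {
    if base_ms == 0 {
        return 0;
    }
    base_ms.saturating_mul(2u64.saturating_pow(attempt)).min(8_000)
}

fn main() {
    // With the default base of 500ms: 500, 1000, 2000, 4000, 8000 (capped).
    for attempt in 0..5 {
        println!("attempt {attempt}: {} ms", backoff_delay_ms(500, attempt));
    }
}
```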

AgentSpec integration

Register via the RetryConfigKey plugin config key ("retry" section):

use awaken_runtime::engine::retry::RetryConfigKey;

let spec = AgentSpec::new("my-agent")
    .with_config::<RetryConfigKey>(LlmRetryPolicy {
        max_retries: 3,
        fallback_models: vec!["claude-sonnet-4-20250514".into()],
        backoff_base_ms: 1000,
    })?;

CircuitBreakerConfig

Per-model circuit breaker configuration. Prevents cascading failures by short-circuiting requests to models that have experienced repeated consecutive failures. After a cooldown the circuit transitions to half-open, allowing limited probe requests before fully closing on success.

pub struct CircuitBreakerConfig {
    pub failure_threshold: u32,    // default: 5
    pub cooldown: Duration,        // default: 30s
    pub half_open_max: u32,        // default: 1
}

Crate path: awaken_runtime::engine::circuit_breaker::CircuitBreakerConfig

| Field | Type | Default | Description |
|---|---|---|---|
| failure_threshold | u32 | 5 | Consecutive failures before the circuit opens and rejects requests |
| cooldown | Duration | 30s | How long the circuit stays open before transitioning to half-open |
| half_open_max | u32 | 1 | Maximum probe requests allowed in the half-open state before the circuit reopens on failure or closes on success |
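A minimal sketch of the described state machine, assuming the transitions named above (the time-based Open -> HalfOpen cooldown transition is omitted, and all names are illustrative, not awaken's API):

```rust
// Sketch of the closed -> open -> half-open cycle: the circuit opens after
// `failure_threshold` consecutive failures; a half-open probe closes it on
// success and reopens it on failure.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Circuit {
    Closed { consecutive_failures: u32 },
    Open,
    HalfOpen,
}

pub fn on_failure(state: Circuit, failure_threshold: u32) -> Circuit {
    match state {
        Circuit::Closed { consecutive_failures } => {
            let n = consecutive_failures + 1;
            if n >= failure_threshold {
                Circuit::Open
            } else {
                Circuit::Closed { consecutive_failures: n }
            }
        }
        // A failed half-open probe reopens the circuit.
        Circuit::HalfOpen | Circuit::Open => Circuit::Open,
    }
}

pub fn on_success(state: Circuit) -> Circuit {
    match state {
        // A successful half-open probe fully closes the circuit.
        Circuit::HalfOpen => Circuit::Closed { consecutive_failures: 0 },
        Circuit::Open => Circuit::Open,
        // Any success resets the consecutive-failure count.
        Circuit::Closed { .. } => Circuit::Closed { consecutive_failures: 0 },
    }
}

fn main() {
    let mut state = Circuit::Closed { consecutive_failures: 0 };
    for _ in 0..5 {
        state = on_failure(state, 5);
    }
    println!("{state:?}");
}
```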

Feature flags and their effects

| Flag | Runtime behavior |
|---|---|
| permission | Registers the permission plugin; tools can be gated with HITL approval |
| observability | Registers the observability plugin; emits traces and metrics |
| mcp | Enables MCP tool bridge; tools from MCP servers are auto-registered |
| skills | Enables the skills subsystem for reusable agent capabilities |
| server | Builds the HTTP server with SSE streaming and protocol adapters |
| generative-ui | Enables generative UI component streaming to frontends |

Custom plugin configuration

Plugins declare typed configuration sections using the PluginConfigKey trait, which binds a string key to a Rust struct at compile time:

pub trait PluginConfigKey: 'static + Send + Sync {
    const KEY: &'static str;               // section name in AgentSpec.sections
    type Config: Default + Clone + Serialize + DeserializeOwned
        + schemars::JsonSchema + Send + Sync + 'static;
}

Declaring schemas for validation

Plugins override config_schemas() to return JSON Schemas generated from their config structs. The resolve pipeline (Stage 2) validates every AgentSpec.sections entry against these schemas before any hook runs.

fn config_schemas(&self) -> Vec<ConfigSchema> {
    vec![ConfigSchema {
        key: RateLimitConfigKey::KEY,
        json_schema: schemars::schema_for!(RateLimitConfig),
    }]
}

Reading config at runtime

Plugins read their typed config via agent_spec.config::<K>(). If the section is absent, the Default impl is returned.

let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;

Worked example

use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use awaken::PluginConfigKey;

#[derive(Debug, Clone, Default, Serialize, Deserialize, JsonSchema)]
pub struct RateLimitConfig {
    pub max_calls_per_step: u32,   // default: 0 (unlimited)
    pub cooldown_ms: u64,          // default: 0
}

pub struct RateLimitConfigKey;

impl PluginConfigKey for RateLimitConfigKey {
    const KEY: &'static str = "rate_limit";
    type Config = RateLimitConfig;
}

// In plugin register():
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
    r.register_phase_hook("rate_limit", Phase::BeforeToolExecute, RateLimitHook)?;
    Ok(())
}

fn config_schemas(&self) -> Vec<ConfigSchema> {
    vec![ConfigSchema {
        key: RateLimitConfigKey::KEY,
        json_schema: schemars::schema_for!(RateLimitConfig),
    }]
}

// In a hook:
let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;
if cfg.max_calls_per_step > 0 { /* enforce limit */ }

Validation behavior

  • Section present but invalid: resolve fails with a schema validation error.
  • Section present but unclaimed: a warning is logged (possible typo or removed plugin).
  • Section absent: allowed; the plugin receives Config::default().

Errors

All error types use thiserror derives and implement std::error::Error + Display.

StateError

Errors from state management operations. Defined in awaken-contract.

pub enum StateError {
    RevisionConflict { expected: u64, actual: u64 },
    MutationBaseRevisionMismatch { left: u64, right: u64 },
    PluginAlreadyInstalled { name: String },
    PluginNotInstalled { type_name: &'static str },
    KeyAlreadyRegistered { key: String },
    UnknownKey { key: String },
    KeyDecode { key: String, message: String },
    KeyEncode { key: String, message: String },
    HandlerAlreadyRegistered { key: String },
    EffectHandlerAlreadyRegistered { key: String },
    PhaseRunLoopExceeded { phase: Phase, max_rounds: usize },
    UnknownScheduledActionHandler { key: String },
    UnknownEffectHandler { key: String },
    ParallelMergeConflict { key: String },
    ToolAlreadyRegistered { tool_id: String },
    Cancelled,
}

Crate path: awaken::StateError

StateError implements Clone and PartialEq.

ToolError

Errors returned from Tool::validate_args or Tool::execute. A ToolError aborts the tool call entirely (as opposed to ToolResult::error, which sends the failure back to the LLM).

pub enum ToolError {
    InvalidArguments(String),
    ExecutionFailed(String),
    Denied(String),
    NotFound(String),
    Internal(String),
}

Crate path: awaken::contract::tool::ToolError

BuildError

Errors from AgentRuntimeBuilder::build().

use awaken::StateError;

pub enum BuildError {
    State(StateError),
    AgentRegistryConflict(String),
    ToolRegistryConflict(String),
    ModelRegistryConflict(String),
    ProviderRegistryConflict(String),
    PluginRegistryConflict(String),
    ValidationFailed(String),
    DiscoveryFailed(DiscoveryError),     // requires feature "a2a"
}

Crate path: awaken::BuildError

BuildError converts from StateError via From.

RuntimeError

Errors from agent runtime operations (resolving agents, starting runs).

use awaken::StateError;

pub enum RuntimeError {
    State(StateError),
    ThreadAlreadyRunning { thread_id: String },
    AgentNotFound { agent_id: String },
    ResolveFailed { message: String },
}

Crate path: awaken::RuntimeError

RuntimeError converts from StateError via From. Implements Clone and PartialEq.

InferenceExecutionError

Errors from the LLM execution layer.

pub enum InferenceExecutionError {
    Provider(String),
    RateLimited(String),
    Timeout(String),
    Cancelled,
}

Crate path: awaken::contract::executor::InferenceExecutionError

StorageError

Errors returned by ThreadStore, RunStore, and ThreadRunStore operations.

pub enum StorageError {
    NotFound(String),
    AlreadyExists(String),
    VersionConflict { expected: u64, actual: u64 },
    Io(String),
    Serialization(String),
}

Crate path: awaken::contract::storage::StorageError

ResolveError

Errors from the agent resolution pipeline (resolving AgentSpec to a runnable ResolvedAgent).

use awaken::StateError;

pub enum ResolveError {
    AgentNotFound(String),
    ModelNotFound(String),
    ProviderNotFound(String),
    PluginNotFound(String),
    InvalidPluginConfig { plugin: String, key: String, message: String },
    RemoteAgentNotDirectlyRunnable(String),
    ToolIdConflict { tool_id: String, source_a: String, source_b: String },
    EnvBuild(StateError),
}

Crate path: awaken::registry::resolve::ResolveError

UnknownKeyPolicy

Controls behavior when encountering an unknown state key during deserialization.

pub enum UnknownKeyPolicy {
    Error,
    Skip,
}

Crate path: awaken::UnknownKeyPolicy
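A minimal sketch of how a decoder might honor the policy (the function name and key set are illustrative, not Awaken's actual decode path):

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum UnknownKeyPolicy {
    Error,
    Skip,
}

/// Filter serialized entries against the registered key set,
/// erroring or skipping unknown keys per the policy.
pub fn decode_keys(
    entries: &[(&str, &str)],
    registered: &[&str],
    policy: UnknownKeyPolicy,
) -> Result<Vec<String>, String> {
    let mut decoded = Vec::new();
    for (key, raw) in entries {
        if registered.contains(key) {
            decoded.push(format!("{key}={raw}"));
        } else if policy == UnknownKeyPolicy::Error {
            return Err(format!("unknown key: {key}"));
        }
        // UnknownKeyPolicy::Skip: silently drop the entry
    }
    Ok(decoded)
}
```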

AI SDK v6 Protocol

The AI SDK v6 adapter translates Awaken’s internal AgentEvent stream into the Vercel AI SDK v6 UI Message Stream format. This allows any AI SDK-compatible frontend (useChat, useAssistant) to consume agent output without custom parsing.

Endpoint

POST /v1/ai-sdk/chat

Request Body

{
  "messages": [{ "role": "user", "content": "Hello" }],
  "threadId": "optional-thread-id",
  "agentId": "optional-agent-id"
}

Field | Type | Required | Description
------|------|----------|------------
messages | AiSdkMessage[] | yes | Chat messages. Content may be a string or an array of content parts.
threadId | string | no | Existing thread to continue. Omit to create a new thread.
agentId | string | no | Target agent. Uses the default agent when omitted.

Response

SSE stream (text/event-stream). Each line is a JSON-encoded UIStreamEvent.

Auxiliary Routes

Route | Method | Description
------|--------|------------
/v1/ai-sdk/streams/:run_id | GET | Reconnect to an active run’s SSE stream.
/v1/ai-sdk/runs/:run_id/stream | GET | Alias for stream reconnect.
/v1/ai-sdk/threads/:id/messages | GET | Retrieve thread message history.

Event Mapping

The AiSdkEncoder is a stateful transcoder that converts AgentEvent variants into UIStreamEvent variants. It tracks open text blocks and reasoning blocks across tool-call boundaries.

AgentEvent | UIStreamEvent(s)
-----------|-----------------
RunStart | MessageStart + Data("run-info", ...)
TextDelta | TextStart (if block not open) + TextDelta
ReasoningDelta | ReasoningStart (if block not open) + ReasoningDelta
ReasoningEncryptedValue | ReasoningStart (if not open) + ReasoningDelta
ToolCallStart | Close open text/reasoning blocks, then ToolCallStart
ToolCallDelta | ToolCallDelta
ToolCallDone | ToolCallEnd
StepStart | (no direct mapping)
StepEnd | (no direct mapping)
InferenceComplete | Data("inference-complete", ...)
MessagesSnapshot | Data("messages-snapshot", ...)
StateSnapshot | Data("state-snapshot", ...)
StateDelta | Data("state-delta", ...)
ActivitySnapshot | Data("activity-snapshot", ...)
ActivityDelta | Data("activity-delta", ...)
RunFinish | Close open blocks, Data("finish", ...), Finish

UIStreamEvent Types

The wire format uses the type field as a discriminant, serialized in kebab-case:

  • start – message lifecycle start, carries optional messageId and messageMetadata
  • text-start, text-delta, text-end – text block lifecycle with content ID
  • reasoning-start, reasoning-delta, reasoning-end – reasoning block lifecycle
  • tool-call-start, tool-call-delta, tool-call-end – tool call lifecycle
  • data – arbitrary named data events (state snapshots, activity, inference metadata)
  • finish – terminal event with finish reason and usage summary

Text Block Lifecycle

The encoder automatically manages text block open/close boundaries:

  1. First TextDelta opens a text block (TextStart).
  2. Subsequent deltas append to the open block.
  3. When a ToolCallStart arrives, the encoder closes any open text or reasoning block before emitting tool events.
  4. After tool execution completes, new text deltas open a fresh block with an incremented ID.

This ensures the frontend receives well-formed block boundaries even though the runtime emits flat delta events.
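The open/close bookkeeping can be sketched as a small state machine; the struct below is illustrative, not the real AiSdkEncoder:

```rust
#[derive(Debug, PartialEq)]
pub enum Out {
    TextStart(u32),
    TextDelta(u32, String),
    TextEnd(u32),
    ToolCallStart,
}

#[derive(Default)]
pub struct BlockTracker {
    open: Option<u32>,
    next_id: u32,
}

impl BlockTracker {
    /// A flat text delta opens a block on first use.
    pub fn on_text_delta(&mut self, text: &str) -> Vec<Out> {
        let mut out = Vec::new();
        let id = match self.open {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.open = Some(id);
                out.push(Out::TextStart(id));
                id
            }
        };
        out.push(Out::TextDelta(id, text.to_string()));
        out
    }

    /// A tool call closes any open block before the tool event.
    pub fn on_tool_call_start(&mut self) -> Vec<Out> {
        let mut out = Vec::new();
        if let Some(id) = self.open.take() {
            out.push(Out::TextEnd(id));
        }
        out.push(Out::ToolCallStart);
        out
    }
}
```

After a tool call closes block 0, the next delta opens block 1, matching step 4 above.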

AG-UI Protocol

The AG-UI adapter translates Awaken’s AgentEvent stream into the AG-UI (CopilotKit) event format. This enables CopilotKit frontends to drive Awaken agents with no custom adapter code.

Endpoint

POST /v1/ag-ui/run

Request Body

{
  "threadId": "optional-thread-id",
  "agentId": "optional-agent-id",
  "messages": [{ "role": "user", "content": "Hello" }],
  "context": {}
}

Field | Type | Required | Description
------|------|----------|------------
messages | AgUiMessage[] | yes | Chat messages with role and content strings.
threadId | string | no | Existing thread. Omit to create a new thread.
agentId | string | no | Target agent. Uses the default agent when omitted.
context | object | no | CopilotKit context forwarding (reserved).

Response

SSE stream (text/event-stream). Each frame is a JSON-encoded AG-UI Event.

Auxiliary Routes

Route | Method | Description
------|--------|------------
/v1/ag-ui/threads/:id/messages | GET | Retrieve thread message history.

Event Mapping

The AgUiEncoder is a stateful transcoder that manages text message and step lifecycles.

AgentEvent | AG-UI Event(s)
-----------|---------------
RunStart | RUN_STARTED
TextDelta | TEXT_MESSAGE_START (if not open) + TEXT_MESSAGE_CONTENT
ReasoningDelta | REASONING_MESSAGE_START (if not open) + REASONING_MESSAGE_CONTENT
ToolCallStart | Close text/reasoning, STEP_STARTED, TOOL_CALL_START
ToolCallDelta | TOOL_CALL_ARGS
ToolCallDone | TOOL_CALL_END, STEP_FINISHED
StateSnapshot | STATE_SNAPSHOT
StateDelta | STATE_DELTA
RunFinish (success) | Close text/reasoning, RUN_FINISHED
RunFinish (error) | Close text/reasoning, RUN_ERROR

AG-UI Event Types

Events use an uppercase type discriminant:

  • RUN_STARTED / RUN_FINISHED / RUN_ERROR – run lifecycle
  • TEXT_MESSAGE_START / TEXT_MESSAGE_CONTENT / TEXT_MESSAGE_END – assistant text
  • REASONING_MESSAGE_START / REASONING_MESSAGE_CONTENT / REASONING_MESSAGE_END – reasoning trace
  • STEP_STARTED / STEP_FINISHED – step boundaries (wrapping tool calls)
  • TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END – tool call lifecycle
  • STATE_SNAPSHOT / STATE_DELTA – shared state synchronization
  • MESSAGES_SNAPSHOT – full thread message history

All events carry a BaseEvent with optional timestamp and rawEvent fields.

Roles

AG-UI messages use lowercase role strings: system, user, assistant, tool.

Text Message Lifecycle

  1. First TextDelta emits TEXT_MESSAGE_START followed by TEXT_MESSAGE_CONTENT.
  2. Subsequent deltas emit only TEXT_MESSAGE_CONTENT.
  3. A ToolCallStart or RunFinish closes the open message with TEXT_MESSAGE_END.

Reasoning messages follow the same pattern with REASONING_MESSAGE_* events.

A2A Protocol

The Agent-to-Agent (A2A) adapter implements the A2A protocol for remote agent discovery, task delegation, and inter-agent communication.

Feature gate: server

Endpoints

Route | Method | Description
------|--------|------------
/.well-known/agent-card.json | GET | Discovery endpoint for the public/default agent card.
/v1/a2a/message:send | POST | Send a message to the public/default A2A agent. Returns a task wrapper.
/v1/a2a/message:stream | POST | Streaming send over SSE.
/v1/a2a/tasks | GET | List A2A tasks.
/v1/a2a/tasks/:task_id | GET | Poll task status by ID.
/v1/a2a/tasks/:task_id:cancel | POST | Cancel a running task.
/v1/a2a/tasks/:task_id:subscribe | POST | Subscribe to task updates over SSE.
/v1/a2a/tasks/:task_id/pushNotificationConfigs | POST | Create a push notification config.
/v1/a2a/tasks/:task_id/pushNotificationConfigs | GET | List push notification configs.
/v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | GET / DELETE | Read or delete a push notification config.
/v1/a2a/extendedAgentCard | GET | Extended agent card. Returns 501 unless capabilities.extendedAgentCard=true.

Tenant-scoped variants mirror the same interface under /v1/a2a/:tenant/..., for example /v1/a2a/research/message:send and /v1/a2a/research/tasks/:task_id.

Agent Card

The discovery endpoint returns an AgentCard describing the exposed interface and capabilities:

{
  "name": "My Agent",
  "description": "A helpful assistant",
  "supportedInterfaces": [
    {
      "url": "https://example.com/v1/a2a",
      "protocolBinding": "HTTP+JSON",
      "protocolVersion": "1.0"
    }
  ],
  "version": "1.0.0",
  "capabilities": {
    "streaming": true,
    "pushNotifications": true,
    "stateTransitionHistory": false,
    "extendedAgentCard": false
  },
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["text/plain"],
  "skills": [
    {
      "id": "general",
      "name": "General Q&A",
      "description": "Answer general questions",
      "tags": ["qa"],
      "inputModes": ["text/plain"],
      "outputModes": ["text/plain"]
    }
  ]
}

Agent cards are derived from registered AgentSpec entries. The top-level legacy url/id fields are not emitted.

Message Send

{
  "message": {
    "taskId": "optional-client-provided-id",
    "contextId": "optional-client-provided-id",
    "messageId": "msg-123",
    "role": "ROLE_USER",
    "parts": [{ "text": "Summarize this document" }]
  },
  "configuration": {
    "returnImmediately": true
  }
}

The server maps A2A tasks to Awaken thread/mailbox execution. The response uses the v1 task wrapper shape:

{
  "task": {
    "id": "optional-client-provided-id",
    "contextId": "optional-client-provided-id",
    "status": {
      "state": "TASK_STATE_SUBMITTED"
    }
  }
}

If returnImmediately is omitted or false, the adapter waits for a terminal/interrupted task state before responding.

Task Status

GET /v1/a2a/tasks/:task_id returns a Task resource:

{
  "id": "abc-123",
  "contextId": "abc-123",
  "status": {
    "state": "TASK_STATE_COMPLETED",
    "message": {
      "messageId": "msg-response",
      "role": "ROLE_AGENT",
      "parts": [{ "text": "..." }]
    }
  },
  "history": []
}

Task states follow the v1 enum names such as TASK_STATE_SUBMITTED, TASK_STATE_WORKING, TASK_STATE_COMPLETED, TASK_STATE_FAILED, and TASK_STATE_CANCELED.

Optional capability defaults

Awaken currently enables these A2A capabilities by default:

  • streaming = true
  • pushNotifications = true

extendedAgentCard remains opt-in and is enabled only when ServerConfig.a2a_extended_card_bearer_token is configured. When disabled, the extended card endpoints return spec-shaped unsupported errors.

Remote Agent Delegation

Awaken agents can delegate to remote A2A agents via AgentTool::remote(). The A2aBackend sends a message:send request to the remote endpoint, reads the returned task.id, then polls /tasks/:task_id for completion. From the LLM’s perspective, this is a regular tool call — the A2A transport is transparent.
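The poll-until-terminal step can be sketched as follows; `fetch_state` stands in for the GET /tasks/:task_id call and is an assumption, not the real A2aBackend API:

```rust
// Terminal states follow the v1 enum names used by the A2A adapter.
pub fn is_terminal(state: &str) -> bool {
    matches!(
        state,
        "TASK_STATE_COMPLETED" | "TASK_STATE_FAILED" | "TASK_STATE_CANCELED"
    )
}

/// Poll until a terminal state or the attempt budget is exhausted.
pub fn poll_until_terminal(
    mut fetch_state: impl FnMut() -> String,
    max_attempts: usize,
) -> Result<String, String> {
    for _ in 0..max_attempts {
        let state = fetch_state();
        if is_terminal(&state) {
            return Ok(state);
        }
        // Real code would sleep for poll_interval_ms between attempts.
    }
    Err("timed out waiting for terminal task state".into())
}
```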

Configuration for remote agents is declared in AgentSpec. RemoteEndpoint is generic, and A2A uses backend: "a2a":

{
  "id": "remote-researcher",
  "endpoint": {
    "backend": "a2a",
    "base_url": "https://remote-agent.example.com/v1/a2a",
    "auth": { "type": "bearer", "token": "..." },
    "target": "researcher",
    "timeout_ms": 300000,
    "options": {
      "poll_interval_ms": 1000
    }
  }
}

Agents with an endpoint field are resolved as remote backend agents. Today the built-in delegate backend is A2A. Agents without endpoint run locally.

Cancellation

A cooperative cancellation token used to interrupt agent runs, streaming loops, and long-running operations.

CancellationToken

A cloneable handle backed by a shared AtomicBool and a tokio::sync::Notify. All clones share the same cancellation state – cancelling any clone cancels all of them.

use awaken::CancellationToken;

let token = CancellationToken::new();

Crate path: awaken_contract::cancellation::CancellationToken

Methods

/// Create a new uncancelled token.
pub fn new() -> Self

/// Signal cancellation. Wakes all async waiters immediately.
pub fn cancel(&self)

/// Check if cancellation has been requested (synchronous poll).
pub fn is_cancelled(&self) -> bool

/// Wait until cancellation is signalled. Resolves immediately if already cancelled.
pub async fn cancelled(&self)

Traits

  • Clone – clones share the same underlying state
  • Default – creates an uncancelled token (equivalent to new())

Synchronous polling

Use is_cancelled() to check cancellation from synchronous code or tight loops:

let token = CancellationToken::new();

while !token.is_cancelled() {
    // do work
}

Async waiting with tokio::select!

Use cancelled() in a tokio::select! branch to interrupt async operations without polling:

let token = CancellationToken::new();

tokio::select! {
    result = some_async_work() => {
        // work completed before cancellation
    }
    _ = token.cancelled() => {
        // cancellation was signalled
    }
}

The cancelled() future resolves immediately if the token is already cancelled, so there is no race between checking and waiting.

Cooperative semantics

Cancellation is cooperative. Calling cancel() sets a flag and wakes async waiters, but does not abort any running task. Code must check is_cancelled() or select! on cancelled() to observe and respond to cancellation.

Key properties:

  • Idempotent – calling cancel() multiple times is safe and has no additional effect.
  • Shared – all clones observe the same cancellation state. Cancelling from any clone is visible to all others.
  • Ordering – uses SeqCst ordering on the atomic flag, so cancellation is immediately visible across threads.
  • Immediate wake – cancel() calls Notify::notify_waiters(), waking all tasks blocked on cancelled() without waiting for the next poll.
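The shared-clone and idempotence properties can be sketched with std types alone (the real token additionally wraps a tokio::sync::Notify for async wakeups):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Illustrative sketch, not the real CancellationToken:
// all clones share one Arc'd flag.
#[derive(Clone, Default)]
pub struct TokenSketch {
    cancelled: Arc<AtomicBool>,
}

impl TokenSketch {
    pub fn cancel(&self) {
        // SeqCst matches the ordering guarantee described above.
        self.cancelled.store(true, Ordering::SeqCst);
    }

    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}
```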

Runtime usage

The runtime passes a CancellationToken into each agent run. It is used to:

  • Interrupt streaming inference mid-response when the caller requests cancellation.
  • Stop the agent loop between inference cycles.
  • Propagate cancellation from external transports (HTTP, SSE) into the runtime.

A typical streaming loop checks cancellation alongside each chunk:

let token = CancellationToken::new();
let clone = token.clone();

tokio::select! {
    _ = async {
        while let Some(chunk) = stream.next().await {
            // process chunk
        }
    } => {}
    _ = clone.cancelled() => {
        // stop processing, clean up
    }
}

Auto-cancellation on new message

When a new message is submitted to a thread that already has an active run (e.g. a run suspended while waiting for tool approval), the runtime automatically cancels the old run before starting the new one.

This prevents the ThreadAlreadyRunning error and avoids infinite retry loops. The sequence is:

  1. Mailbox::submit() detects an active run on the thread.
  2. Calls AgentRuntime::cancel_and_wait_by_thread() which signals the CancellationToken and waits (up to 5 seconds) for the RunSlotGuard to drop and free the thread slot.
  3. The old run emits RunFinish with TerminationReason::Cancelled.
  4. Before the new run starts inference, strip_unpaired_tool_calls() removes any orphaned tool calls from the message history that were left by the cancelled run (assistant messages with tool_calls that have no matching Tool role response).

This ensures clean handoff between runs without leaving dangling state that would confuse the LLM.
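Step 4 can be sketched over a simplified message shape (the Msg enum here is illustrative, not Awaken's message type):

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Msg {
    User(String),
    AssistantToolCall { call_id: String },
    ToolResponse { call_id: String },
}

/// Drop assistant tool calls that have no matching Tool-role response.
pub fn strip_unpaired_tool_calls(history: Vec<Msg>) -> Vec<Msg> {
    // Collect the call IDs that received a response.
    let answered: Vec<String> = history
        .iter()
        .filter_map(|m| match m {
            Msg::ToolResponse { call_id } => Some(call_id.clone()),
            _ => None,
        })
        .collect();
    // Keep everything except orphaned tool calls.
    history
        .into_iter()
        .filter(|m| match m {
            Msg::AssistantToolCall { call_id } => answered.contains(call_id),
            _ => true,
        })
        .collect()
}
```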

Key Files

  • crates/awaken-runtime/src/cancellation.rs – CancellationToken implementation
  • crates/awaken-runtime/src/runtime/agent_runtime/active_registry.rs – run tracking with completion notification
  • crates/awaken-runtime/src/runtime/agent_runtime/runner.rs – strip_unpaired_tool_calls()

Tool Execution Modes

ToolExecutionMode controls how the runtime executes tool calls that the LLM requests in a single inference step.

Crate path: awaken::contract::executor::ToolExecutionMode

ToolExecutionMode

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub enum ToolExecutionMode {
    #[default]
    Sequential,
    ParallelBatchApproval,
    ParallelStreaming,
}

The default is Sequential.

Modes

Sequential

Executes tool calls one at a time in the order the LLM returned them.

  • The state snapshot is refreshed between calls (requires_incremental_state returns true), so each tool sees the effects of the previous one.
  • Stops at the first suspension. If tool call 2 of 4 suspends, tool calls 3 and 4 are not executed. Failures do not stop execution.
  • Simplest mode. Use when tool calls have data dependencies or when ordering matters.
// SequentialToolExecutor runs calls in order, stopping at first suspension.
let executor = SequentialToolExecutor;
let results = executor.execute(&tools, &calls, &ctx).await?;

ParallelBatchApproval

Executes all tool calls concurrently. All tools see the same frozen state snapshot.

  • Suspension decisions are replayed using DecisionReplayPolicy::BatchAllSuspended: the runtime waits until every suspended call has a decision before replaying any of them.
  • Enforces parallel patch conflict checks (requires_conflict_check returns true).
  • Does not stop on suspension or failure; all calls run to completion.
  • Use when tool calls are independent and you want an all-or-nothing approval gate for HITL workflows.
let executor = ParallelToolExecutor::batch_approval();
// decision_replay_policy() == DecisionReplayPolicy::BatchAllSuspended

ParallelStreaming

Executes all tool calls concurrently with the same frozen state snapshot.

  • Suspension decisions are replayed using DecisionReplayPolicy::Immediate: each decision is replayed as soon as it arrives, without waiting for the others.
  • Enforces parallel patch conflict checks.
  • Does not stop on suspension or failure; all calls run to completion.
  • Use when tool calls are independent and you want the fastest end-to-end completion without batching approval.
let executor = ParallelToolExecutor::streaming();
// decision_replay_policy() == DecisionReplayPolicy::Immediate

Comparison

Behavior | Sequential | ParallelBatchApproval | ParallelStreaming
---------|------------|-----------------------|------------------
Execution order | One at a time | All concurrently | All concurrently
State freshness | Refreshed between calls | Frozen snapshot | Frozen snapshot
Stops on suspension | Yes (first suspension) | No | No
Stops on failure | No | No | No
Decision replay | N/A | Batch (all at once) | Immediate (one by one)
Conflict checks | No | Yes | Yes

Executor trait

Both SequentialToolExecutor and ParallelToolExecutor implement the ToolExecutor trait:

#[async_trait]
pub trait ToolExecutor: Send + Sync {
    /// Execute tool calls and return results.
    async fn execute(
        &self,
        tools: &HashMap<String, Arc<dyn Tool>>,
        calls: &[ToolCall],
        base_ctx: &ToolCallContext,
    ) -> Result<Vec<ToolExecutionResult>, ToolExecutorError>;

    /// Strategy name for logging.
    fn name(&self) -> &'static str;

    /// Whether the executor needs state refreshed between individual tool calls.
    fn requires_incremental_state(&self) -> bool { false }
}

The name() values are "sequential", "parallel_batch_approval", and "parallel_streaming".

DecisionReplayPolicy

Controls when resume decisions for suspended tool calls are replayed into the execution pipeline. Only relevant for parallel modes.

pub enum DecisionReplayPolicy {
    /// Replay each resolved suspended call as soon as its decision arrives.
    Immediate,
    /// Replay only when all currently suspended calls have decisions.
    BatchAllSuspended,
}
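The gating difference between the two policies can be sketched as a pure function over call IDs (illustrative only; the enum is restated locally so the sketch is self-contained):

```rust
#[derive(Clone, Copy, PartialEq)]
pub enum DecisionReplayPolicy {
    Immediate,
    BatchAllSuspended,
}

/// Given suspended call IDs and the IDs that already have decisions,
/// which calls may be replayed now?
pub fn ready_to_replay<'a>(
    policy: DecisionReplayPolicy,
    suspended: &'a [&'a str],
    decided: &'a [&'a str],
) -> Vec<&'a str> {
    match policy {
        // Replay each decided call as soon as its decision arrives.
        DecisionReplayPolicy::Immediate => suspended
            .iter()
            .copied()
            .filter(|id| decided.contains(id))
            .collect(),
        // Hold everything until every suspended call has a decision.
        DecisionReplayPolicy::BatchAllSuspended => {
            if suspended.iter().all(|id| decided.contains(id)) {
                suspended.to_vec()
            } else {
                Vec::new()
            }
        }
    }
}
```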

Key Files

  • crates/awaken-contract/src/contract/executor.rs – ToolExecutionMode enum
  • crates/awaken-runtime/src/execution/executor.rs – SequentialToolExecutor, ParallelToolExecutor, ToolExecutor trait

Architecture

Awaken is organized around one runtime core plus three surrounding surfaces: contract types, server/storage adapters, and optional extensions. The important distinction is not just crate boundaries, but where decisions are made.

Application assembly
  register tools / models / providers / plugins / AgentSpec
        |
        v
AgentRuntime
  resolve AgentSpec -> ResolvedAgent
  build ExecutionEnv from plugins
  run the phase loop
  expose cancel / decision control for active runs
        |
        v
Server + storage surfaces
  HTTP routes, mailbox, SSE replay, protocol adapters,
  thread/run persistence, profile storage

Contract layer – awaken-contract defines the shared types used everywhere: AgentSpec, ModelSpec, ProviderSpec, Tool, AgentEvent, transport traits, and the typed state model. This is the vocabulary that the rest of the system speaks.

Runtime core – awaken-runtime is the orchestration layer. It resolves agent IDs to fully wired configurations (ResolvedAgent), builds an ExecutionEnv from plugins, manages active runs, and delegates execution to the loop runner plus phase engine.

Server and persistence surfaces – awaken-server turns the runtime into HTTP and SSE endpoints, mailbox-backed background execution, and protocol adapters. awaken-stores provides concrete persistence backends for threads and runs. awaken-ext-* crates extend the runtime at phase and tool boundaries without changing the core loop.

Request Sequence

The following diagram shows a representative request flowing through the system:

sequenceDiagram
    participant Client
    participant Server
    participant Runtime
    participant LLM
    participant Tool

    Client->>Server: POST /v1/ai-sdk/chat
    Server->>Runtime: RunRequest (agent_id, thread_id, messages)
    Runtime->>Runtime: Resolve agent (AgentSpec -> ResolvedAgent)
    Runtime->>Runtime: Load thread history
    Runtime->>Runtime: RunStart phase
    loop Step loop
        Runtime->>Runtime: StepStart phase
        Runtime->>Runtime: BeforeInference phase
        Runtime->>LLM: Inference request (messages + tools)
        LLM-->>Runtime: Response (text + tool_calls)
        Runtime->>Runtime: AfterInference phase
        opt Tool calls present
            Runtime->>Runtime: BeforeToolExecute phase
            Runtime->>Tool: execute(args, ctx)
            Tool-->>Runtime: ToolResult
            Runtime->>Runtime: AfterToolExecute phase
        end
        Runtime->>Runtime: StepEnd phase (checkpoint)
    end
    Runtime->>Runtime: RunEnd phase
    Runtime-->>Server: AgentEvent stream
    Server-->>Client: SSE events (protocol-specific encoding)

Phase-Driven Execution Loop

Every run proceeds through a fixed sequence of phases. Plugins register hooks that run at each phase boundary, giving them control over inference parameters, tool execution, state mutations, and termination logic.

RunStart -> [StepStart -> BeforeInference -> AfterInference
             -> BeforeToolExecute -> AfterToolExecute -> StepEnd]* -> RunEnd

The step loop repeats until one of these conditions fires:

  • The LLM returns a response with no tool calls (NaturalEnd).
  • A plugin or stop condition requests termination (Stopped, BehaviorRequested).
  • A tool call suspends waiting for external input (Suspended).
  • The run is cancelled externally (Cancelled).
  • An error occurs (Error).

At each phase boundary, the loop checks the cancellation token and the run lifecycle state before proceeding.
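The stop conditions above can be sketched as a pure decision function; the check ordering here is an assumption for illustration, not the runtime's exact precedence:

```rust
// Variant names follow the termination conditions listed above.
#[derive(Debug, PartialEq)]
pub enum TerminationReason {
    NaturalEnd,
    Stopped,
    Suspended,
    Cancelled,
}

/// Decide whether the step loop should terminate, and why.
pub fn check_termination(
    cancelled: bool,
    suspended: bool,
    stop_requested: bool,
    has_tool_calls: bool,
) -> Option<TerminationReason> {
    if cancelled {
        Some(TerminationReason::Cancelled)
    } else if suspended {
        Some(TerminationReason::Suspended)
    } else if stop_requested {
        Some(TerminationReason::Stopped)
    } else if !has_tool_calls {
        // A response with no tool calls ends the run naturally.
        Some(TerminationReason::NaturalEnd)
    } else {
        None // keep looping
    }
}
```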

Repository Map

awaken
├─ awaken-contract
│  ├─ registry specs
│  ├─ tool / executor / event / transport contracts
│  └─ state model
├─ awaken-runtime
│  ├─ builder + registries + resolve pipeline
│  ├─ AgentRuntime control plane
│  ├─ loop_runner + phase engine
│  ├─ execution / context / policies / profile
│  └─ runtime extensions (handoff, local A2A, background)
├─ awaken-server
│  ├─ routes + config API + mailbox + services
│  ├─ protocols: ai_sdk_v6 / ag_ui / a2a / mcp / acp-stdio
│  └─ transport: SSE relay / replay buffer / transcoder
├─ awaken-stores
└─ awaken-ext-*

Design Intent

Three principles guide the architecture:

Snapshot isolation – Phase hooks never see partially applied state. They read from an immutable snapshot and write to a MutationBatch. The batch is applied atomically after all hooks for a phase have converged. This eliminates data races between concurrent hooks and makes hook execution order irrelevant for correctness.
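A minimal sketch of this discipline, with a plain HashMap standing in for the typed state store (types are illustrative, not the real MutationBatch):

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct MutationBatchSketch {
    writes: Vec<(String, i64)>,
}

impl MutationBatchSketch {
    pub fn set(&mut self, key: &str, value: i64) {
        self.writes.push((key.to_string(), value));
    }
}

/// Run hooks against an immutable snapshot, then apply all writes atomically.
pub fn run_phase(
    state: &mut HashMap<String, i64>,
    hooks: &[fn(&HashMap<String, i64>, &mut MutationBatchSketch)],
) {
    let snapshot = state.clone(); // hooks never see partial writes
    let mut batch = MutationBatchSketch::default();
    for hook in hooks {
        hook(&snapshot, &mut batch);
    }
    for (key, value) in batch.writes {
        state.insert(key, value);
    }
}
```

Note that the second hook below reads the snapshot, not the first hook's pending write, which is what makes hook order irrelevant.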

Append-style persistence – Thread messages are append-only. State is checkpointed at step boundaries. This makes it possible to replay a run from any checkpoint and produces a deterministic audit trail.

Transport independence – The runtime emits AgentEvent values through an EventSink trait. Protocol adapters (AiSdkEncoder, AgUiEncoder) transcode these events into wire formats. The runtime has no knowledge of HTTP, SSE, or any specific protocol. Adding a new protocol means implementing a new encoder – the runtime does not change.

Agent Resolution

When runtime.run(request) is called, the agent_id in the request must be resolved into a fully wired ResolvedAgent – a struct that holds live references to an LLM executor, tools, plugins, and an execution environment. This resolution happens on every resolve() call; nothing is cached or shared between runs. This page describes the three-stage resolution pipeline and the builder that feeds it.

Pipeline Overview

Resolution is a pure function: (RegistrySet, agent_id) -> ResolvedAgent. It proceeds through three sequential stages:

flowchart LR
    subgraph Stage1["Stage 1: Lookup"]
        A1[Fetch AgentSpec] --> A2[Resolve ModelSpec]
        A2 --> A3[Get LlmExecutor]
        A3 --> A4{Retry config?}
        A4 -- yes --> A5[Wrap in RetryingExecutor]
        A4 -- no --> A6[Use executor directly]
    end

    subgraph Stage2["Stage 2: Plugin Pipeline"]
        B1[Resolve plugin IDs] --> B2[Inject default plugins]
        B2 --> B3{context_policy set?}
        B3 -- yes --> B4[Add CompactionPlugin + ContextTransformPlugin]
        B3 -- no --> B5[Skip]
        B4 --> B6[Validate config sections]
        B5 --> B6
        B6 --> B7[Build ExecutionEnv]
    end

    subgraph Stage3["Stage 3: Tool Pipeline"]
        C1[Collect global tools] --> C2[Merge delegate agent tools]
        C2 --> C3[Merge plugin-registered tools]
        C3 --> C4[Check for ID conflicts]
        C4 --> C5[Apply allowed/excluded filters]
    end

    Stage1 --> Stage2 --> Stage3

Any failure at any stage produces a ResolveError and aborts. The pipeline never returns a partial result.

Stage 1: Lookup

The first stage fetches the raw data from registries:

  1. AgentSpec – looked up from AgentSpecRegistry by agent_id. If the spec has an endpoint field (remote backend agent), resolution fails with RemoteAgentNotDirectlyRunnable – remote agents can only be used as delegates, not run directly.

  2. ModelSpec – the spec’s model field (a string ID like "gpt-4") is resolved through ModelRegistry into a ModelSpec, which maps it to a provider ID and an actual API model name (for example, provider "openai", model "gpt-4o").

  3. LlmExecutor – the provider ID from the model entry is resolved through ProviderRegistry to get a live LlmExecutor instance.

  4. Retry decoration – if the agent spec contains a RetryConfigKey section with max_retries > 0 or non-empty fallback_models, the executor is wrapped in a RetryingExecutor decorator.

Stage 2: Plugin Pipeline

The second stage assembles the plugin chain and builds the execution environment.

Plugin resolution

Plugins listed in AgentSpec.plugin_ids are resolved by ID from PluginSource. A missing plugin produces ResolveError::PluginNotFound.

Default plugin injection

After resolving user-declared plugins, the pipeline injects runtime-required default plugins. These are always present regardless of agent configuration:

  • LoopActionHandlersPlugin – registers the core action handlers that the runtime loop uses to process tool calls, emit events, and manage step transitions. Without this plugin, the loop cannot function.

  • MaxRoundsPlugin – enforces the max_rounds stop condition configured on the agent spec. Injected with the spec’s max_rounds value. Prevents runaway loops.

Conditional plugins

These plugins are added only when AgentSpec.context_policy is set:

  • CompactionPlugin – manages context window compaction (summarization of old messages when the context grows too large). Created with the CompactionConfigKey section from the spec, falling back to defaults if absent.

  • ContextTransformPlugin – applies context window policy transforms (token counting, truncation, prompt caching) before each inference request. Created with the context_policy value.

Building ExecutionEnv

After the plugin list is finalized, ExecutionEnv::from_plugins() calls each plugin’s register() method with a PluginRegistrar. Plugins use the registrar to declare:

  • Phase hooks (per-phase callbacks)
  • Scheduled action handlers
  • Effect handlers
  • Request transforms
  • State key registrations
  • Tools

The result is an ExecutionEnv – see ExecutionEnv below.

Config validation

Plugins can declare config_schemas(), returning a list of ConfigSchema entries. Each entry associates a section key with a JSON Schema. During resolution, every declared schema is validated against the corresponding entry in AgentSpec.sections:

  • Section present – validated against the JSON Schema. Failure produces ResolveError::InvalidPluginConfig.
  • Section absent – allowed. Plugins are expected to use sensible defaults.
  • Section present but unclaimed – no plugin declared a schema for it. The pipeline logs a warning (possible typo in configuration).

Stage 3: Tool Pipeline

The third stage collects tools from all sources and produces the final tool set.

Tool sources

Tools are merged in this order:

  1. Global tools – all tools registered in ToolRegistry via the builder (e.g., builder.with_tool("search", search_tool)).

  2. Delegate agent tools – for each agent ID in AgentSpec.delegates, the pipeline creates an AgentTool. If the delegate has an endpoint (remote), the pipeline selects the configured remote backend. Today the built-in remote delegate backend is A2A; local delegates still use a resolver-backed local tool. Delegate tools require the a2a feature flag; without it, delegates are skipped with a logged warning.

  3. Plugin-registered tools – tools declared by plugins during register(), stored in ExecutionEnv.tools.

Conflict detection

If a plugin-registered tool has the same ID as a global tool, resolution fails with ResolveError::ToolIdConflict. This is intentional – silent overwriting would be a source of hard-to-debug issues.

Filtering

After merging, the spec’s allowed_tools and excluded_tools fields are applied:

  • allowed_tools = None – all tools are kept.
  • allowed_tools = Some(list) – only tools whose ID appears in the list are kept. Everything else is dropped.
  • excluded_tools – any tool whose ID appears in this list is removed, even if it was in the allow list.
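The filtering rules above reduce to a plain set operation over tool IDs. A minimal sketch (parameter names mirror the spec fields; this is not the runtime's actual code):

```rust
/// Apply allowed_tools / excluded_tools to a merged tool ID list.
/// The exclusion list wins over the allow list.
fn filter_tools(
    tools: Vec<String>,
    allowed: Option<Vec<String>>,
    excluded: Vec<String>,
) -> Vec<String> {
    tools
        .into_iter()
        // allowed = None keeps everything; Some(list) keeps only listed IDs.
        .filter(|t| allowed.as_ref().map_or(true, |a| a.contains(t)))
        // Exclusion applies even to allow-listed tools.
        .filter(|t| !excluded.contains(t))
        .collect()
}

fn main() {
    let tools = vec!["search".to_string(), "write".to_string(), "delete".to_string()];
    let kept = filter_tools(
        tools,
        Some(vec!["search".to_string(), "delete".to_string()]),
        vec!["delete".to_string()],
    );
    // "delete" was allow-listed but excluded; "write" was never allowed.
    assert_eq!(kept, vec!["search".to_string()]);
}
```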

ExecutionEnv

ExecutionEnv is the per-resolve product of the plugin pipeline. It is not global or shared – each resolve() call builds a fresh one. Its contents:

| Field | Type | Purpose |
|---|---|---|
| phase_hooks | HashMap<Phase, Vec<TaggedPhaseHook>> | Hooks invoked at each phase boundary |
| scheduled_action_handlers | HashMap<String, ScheduledActionHandlerArc> | Named handlers for scheduled/deferred actions |
| effect_handlers | HashMap<String, EffectHandlerArc> | Named handlers for side effects |
| request_transforms | Vec<TaggedRequestTransform> | Transforms applied to inference requests before the LLM call |
| key_registrations | Vec<KeyRegistration> | State keys to install into the state store at run start |
| tools | HashMap<String, Arc<dyn Tool>> | Plugin-provided tools (merged into the main tool set in Stage 3) |
| plugins | Vec<Arc<dyn Plugin>> | Plugin references for lifecycle hooks (on_activate/on_deactivate) |

Each TaggedPhaseHook and TaggedRequestTransform carries its owning plugin ID for diagnostics and filtering.

AgentRuntimeBuilder

The builder (AgentRuntimeBuilder) is the standard way to construct an AgentRuntime. It accumulates five registries:

| Registry | Builder method | Purpose |
|---|---|---|
| MapAgentSpecRegistry | with_agent_spec() / with_agent_specs() | Agent definitions |
| MapToolRegistry | with_tool() | Global tools |
| MapModelRegistry | with_model() | Model ID to provider + model name mappings |
| MapProviderRegistry | with_provider() | LLM executor instances |
| MapPluginSource | with_plugin() | Plugin instances |

Error handling

The builder uses deferred error collection. Each with_* call that detects a conflict (duplicate ID) pushes a BuildError onto an internal error list instead of returning Result. The first collected error surfaces when build() or build_unchecked() is called.
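The deferred-collection pattern looks roughly like this. This is a standalone sketch of the pattern only: BuildError variants, fields, and method bodies are simplified stand-ins for the real builder.

```rust
#[derive(Debug, PartialEq)]
enum BuildError {
    DuplicateToolId(String),
}

#[derive(Default)]
struct BuilderSketch {
    tool_ids: Vec<String>,
    errors: Vec<BuildError>,
}

impl BuilderSketch {
    /// with_* methods never return Result; conflicts are recorded for later.
    fn with_tool(mut self, id: &str) -> Self {
        if self.tool_ids.iter().any(|t| t.as_str() == id) {
            self.errors.push(BuildError::DuplicateToolId(id.to_string()));
        } else {
            self.tool_ids.push(id.to_string());
        }
        self
    }

    /// build() surfaces the first collected error, if any.
    fn build(mut self) -> Result<Vec<String>, BuildError> {
        if self.errors.is_empty() {
            Ok(self.tool_ids)
        } else {
            Err(self.errors.remove(0))
        }
    }
}

fn main() {
    let result = BuilderSketch::default()
        .with_tool("search")
        .with_tool("search") // duplicate -- recorded, not returned here
        .build();
    assert_eq!(result, Err(BuildError::DuplicateToolId("search".to_string())));
}
```

The payoff of the pattern is an uninterrupted fluent chain: callers write a single expression and handle errors once, at build().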

Validation

build() performs a dry-run resolve for every registered agent spec after constructing the runtime. If any agent fails to resolve (missing model, missing provider, missing plugin), the error is collected and returned as BuildError::ValidationFailed. This catches configuration errors at startup rather than at first request.

build_unchecked() skips this validation. Use it only when you need lazy resolution or when agents will be added dynamically after construction.

Remote agents (A2A)

When the a2a feature is enabled, the builder supports with_remote_agents() to register remote A2A endpoints. These are wrapped in a CompositeAgentSpecRegistry that combines local and remote agent sources. Remote agents are discovered asynchronously via build_and_discover().

See Also

State Management

Awaken provides four layers of state management, each designed for a different combination of scope, access pattern, and lifecycle. This page explains when and how to use each layer.

Overview

| Layer | Trait | Scope | Access | Lifecycle | Primary Use Case |
|---|---|---|---|---|---|
| Run State | StateKey (KeyScope::Run) | Current run only | Sync (snapshot) | Cleared at run start | Transient counters, flags, step state |
| Thread State | StateKey (KeyScope::Thread) | Same thread, cross-run | Sync (snapshot) | Auto-exported/restored across runs | Tool call state, active agent, permissions |
| Shared State | ProfileKey + StateScope | Dynamic (global, parent thread, agent type, custom) | Async (ProfileAccess) | Persistent in ProfileStore | Cross-boundary sharing, global config |
| Profile State | ProfileKey + key: &str | Per-key (agent/system) | Async (ProfileAccess) | Persistent in ProfileStore | User/agent preferences, locale |

Run State

Run state is the most transient layer. It lives entirely in memory, is accessed synchronously through a Snapshot, and is cleared to its default value when a new run begins.

Writes happen through MutationBatch, which collects updates produced by phase hooks. When multiple hooks run in parallel, the runtime uses MergeStrategy to determine whether concurrent writes to the same key can be safely merged (Commutative) or must be serialized (Exclusive). This makes run state the only layer that participates in the transactional merge protocol.

Typical examples include RunLifecycle (tracking the current run phase), PendingWorkKey (counting outstanding async work), and ContextThrottleState (rate-limiting context injection).

When to use

  • Per-inference temporary state that does not need to survive beyond the current run
  • State that must participate in parallel merge (MutationBatch with MergeStrategy)
  • Counters, flags, and step-tracking metadata

Example

struct StepCounter;
impl StateKey for StepCounter {
    const KEY: &'static str = "step_counter";
    type Value = usize;
    type Update = usize;
    fn apply(value: &mut Self::Value, update: Self::Update) {
        *value += update;
    }
}

// Register in plugin
r.register_key::<StepCounter>(StateKeyOptions::default())?;

// Read via snapshot
let count = ctx.snapshot.get::<StepCounter>().copied().unwrap_or(0);

// Write via MutationBatch
cmd.update::<StepCounter>(1);

Thread State

Thread state shares the same access model as run state – sync reads via Snapshot, transactional writes via MutationBatch, merge-safe through MergeStrategy. The difference is lifecycle: thread-scoped keys persist across runs on the same thread.

The runtime handles this transparently. At the end of a run, thread-scoped keys are exported (serialized). When the next run starts on the same thread, they are restored to their previous values instead of being reset to defaults. From the hook author’s perspective, the key simply “remembers” its value between runs.

Typical examples include ToolCallStates (for resuming suspended tool calls across runs) and ActiveAgentKey (for persisting agent handoff state).

When to use

  • State that must survive across runs on the same thread
  • State that needs sync access and transactional merge guarantees within each run
  • State whose lifecycle should be managed automatically by the runtime

Example

r.register_key::<ToolCallStates>(StateKeyOptions {
    scope: KeyScope::Thread,
    persistent: true,
    ..StateKeyOptions::default()
})?;

Shared State

Shared state is a persistent, async layer built on the ProfileStore backend. It is designed for data that must cross thread and agent boundaries – something neither run state nor thread state can do.

Shared state uses ProfileKey to bind a compile-time namespace to a value type, and a key: &str parameter to identify the runtime instance. Together, (ProfileKey::KEY, key) uniquely identifies a shared state entry. Different agents and threads can read and write the same entry if they use the same key string. The same ProfileAccess methods (read, write, delete) serve both shared and profile state — they all take key: &str.

Because shared state bypasses the snapshot/mutation-batch workflow, it does not participate in transactional merge. Concurrent writes follow last-write-wins semantics. Access is async through ProfileAccess, available in PhaseContext.

StateScope – convenience key builder

| Constructor | Key String | Scenario |
|---|---|---|
| StateScope::global() | "global" | All agents share one instance |
| StateScope::parent_thread(id) | "parent_thread::{id}" | Parent and child agents share within a delegation tree |
| StateScope::agent_type(name) | "agent_type::{name}" | All instances of an agent type share |
| StateScope::thread(id) | "thread::{id}" | Thread-local persistent state |
| StateScope::new(s) | "{s}" | Arbitrary grouping (tenant, region, etc.) |

The key is a plain &str – fully extensible without code changes. StateScope is an optional convenience; any raw string works.
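The key-string convention is simple enough to sketch in full. Constructor names follow the table above; the real type lives in awaken-contract and this minimal model only illustrates the string format:

```rust
/// Minimal model of StateScope: a newtype over the raw key string.
struct StateScope(String);

impl StateScope {
    fn global() -> Self { StateScope("global".to_string()) }
    fn parent_thread(id: &str) -> Self { StateScope(format!("parent_thread::{id}")) }
    fn agent_type(name: &str) -> Self { StateScope(format!("agent_type::{name}")) }
    fn thread(id: &str) -> Self { StateScope(format!("thread::{id}")) }
    fn new(s: &str) -> Self { StateScope(s.to_string()) }
    fn as_str(&self) -> &str { &self.0 }
}

fn main() {
    assert_eq!(StateScope::global().as_str(), "global");
    assert_eq!(StateScope::parent_thread("t-42").as_str(), "parent_thread::t-42");
    // Any raw string works too -- e.g. a tenant-scoped grouping.
    assert_eq!(StateScope::new("tenant::acme").as_str(), "tenant::acme");
}
```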

When to use

  • State shared across thread boundaries
  • State shared across agent boundaries
  • Dynamic scoping that cannot be determined at compile time
  • Data that serves as database-like indexed storage

Example

use awaken_contract::ProfileKey;

struct TeamContextKey;
impl ProfileKey for TeamContextKey {
    const KEY: &'static str = "team_context";
    type Value = TeamData;
}

// In a hook -- share context with child agents
let scope = StateScope::parent_thread(ctx.run_identity.parent_thread_id.as_ref().unwrap());
let mut team = access.read::<TeamContextKey>(scope.as_str()).await?;
team.goals.push("new goal".into());
access.write::<TeamContextKey>(scope.as_str(), &team).await?;

Profile State

Profile state is a persistent, async layer for per-entity preferences. Like shared state, it uses the ProfileStore backend and is accessed through ProfileAccess. The difference is the key convention: instead of a StateScope string, profile state typically uses an agent name or "system" as the key.

A ProfileKey binds a static namespace string to a value type. The key parameter identifies which agent or system entity the data belongs to.

When to use

  • Per-agent persistent preferences (locale, display name, custom settings)
  • System-level configuration shared across all agents
  • Data belonging to a specific agent identity rather than a dynamic group

Example

struct Locale;
impl ProfileKey for Locale {
    const KEY: &'static str = "locale";
    type Value = String;
}

let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".into()).await?;

Decision Guide

Need state during a single run?
  +-- Yes, sync + transactional --> Run State (StateKey, KeyScope::Run)
  +-- No, needs to persist
       +-- Same thread only, sync + transactional --> Thread State (StateKey, KeyScope::Thread)
       +-- Cross-boundary, dynamic key --> Shared State (ProfileKey + StateScope)
       +-- Per-agent/user preference --> Profile State (ProfileKey + agent/system key)

Comparison: Shared State vs Thread State

Both persist data across runs. The key differences:

| Aspect | Thread State | Shared State |
|---|---|---|
| Access | Sync (snapshot) | Async (ProfileAccess) |
| Scope | Fixed to current thread | Dynamic (any string) |
| Merge safety | MutationBatch + strategy | Last-write-wins |
| Cross-boundary | No | Yes |
| Lifecycle | Auto export/restore | Always persistent |

Use Thread State when you need sync access and transactional guarantees within a run. Use Shared State when you need cross-boundary sharing or dynamic scoping.

See Also

State and Snapshot Model

Awaken uses a typed state engine with snapshot isolation. This page explains the state primitives, scoping rules, merge strategies, and the mutation lifecycle.

StateKey Trait

Every piece of runtime state is declared as a type implementing StateKey:

pub trait StateKey: 'static + Send + Sync {
    const KEY: &'static str;
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    const SCOPE: KeyScope = KeyScope::Run;

    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
    type Update: Send + 'static;

    fn apply(value: &mut Self::Value, update: Self::Update);
}

A StateKey is a zero-sized type that associates a string key, a value type, an update type, and a merge function. The apply method defines how an update transforms the current value. Serialization is handled through encode/decode with JSON as the interchange format.

Plugins register their state keys through PluginRegistrar::register_key::<K>() during the Plugin::register callback.

KeyScope

pub enum KeyScope {
    Run,     // Cleared at run start (default)
    Thread,  // Persists across runs on the same thread
}

Run-scoped keys are reset to their default value when a new run begins. Use this for per-run counters, step state, and transient execution metadata.

Thread-scoped keys survive across runs on the same thread. Use this for conversation memory, accumulated context, and persistent agent preferences.

MergeStrategy

pub enum MergeStrategy {
    Exclusive,    // Conflict on concurrent writes (default)
    Commutative,  // Order-independent updates, safe to merge
}

When multiple hooks run in the same phase and produce MutationBatch values that touch the same key:

  • Exclusive – the batches cannot be merged. The runtime detects the conflict and falls back to sequential execution. This is the safe default for keys where write order matters.

  • Commutative – the update operations can be applied in any order and produce the same result. The runtime concatenates the operations from both batches. Use this for append-only logs, counters, and set unions.
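The commutative guarantee can be checked directly: for a counter-style key, applying the same updates in any order converges to the same value. A standalone sketch of the apply contract (not the runtime's merge code):

```rust
/// apply() for a counter key: updates are additions, so order is irrelevant.
fn apply(value: &mut i64, update: i64) {
    *value += update;
}

/// Apply a batch of updates in sequence, as the runtime does after
/// concatenating ops from commutative MutationBatches.
fn apply_all(mut value: i64, updates: &[i64]) -> i64 {
    for &u in updates {
        apply(&mut value, u);
    }
    value
}

fn main() {
    // Two hooks produced batches [2, 3] and [5] in the same phase.
    let order_a = apply_all(0, &[2, 3, 5]);
    let order_b = apply_all(0, &[5, 2, 3]);
    // Commutative: any interleaving converges to the same value.
    assert_eq!(order_a, order_b);
    assert_eq!(order_a, 10);
}
```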

Snapshot

A Snapshot is an immutable view of the state at a point in time. Phase hooks receive a snapshot reference and can read any registered key’s value. They cannot mutate the snapshot directly.

Phase hook receives: &Snapshot
Phase hook writes to: MutationBatch

This separation guarantees that hooks within the same phase see identical state regardless of execution order.

MutationBatch

A MutationBatch collects state updates produced by a single hook invocation:

pub struct MutationBatch {
    base_revision: Option<u64>,
    ops: Vec<Box<dyn MutationOp>>,
    touched_keys: Vec<String>,
}

Each operation in ops is a type-erased KeyPatch<K> that carries the K::Update value. When the batch is applied, each operation calls K::apply(value, update) on the target key in the state map.

The touched_keys list enables conflict detection for Exclusive keys during parallel merge.
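Conflict detection over touched_keys reduces to a set-intersection check. A sketch of the overlap rule (the real batch carries typed ops, but the decision logic is the same shape):

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq)]
enum MergeStrategy { Exclusive, Commutative }

/// Decide whether two batches can be merged, given each touched key's
/// strategy. Returns false (conflict -> sequential fallback) if any
/// overlapping key is Exclusive.
fn can_merge(a: &[(&str, MergeStrategy)], b: &[(&str, MergeStrategy)]) -> bool {
    let b_keys: HashSet<&str> = b.iter().map(|(k, _)| *k).collect();
    a.iter()
        .filter(|(k, _)| b_keys.contains(k))
        .all(|(_, strategy)| *strategy == MergeStrategy::Commutative)
}

fn main() {
    use MergeStrategy::*;
    // Overlap on a commutative counter: ops are concatenated.
    assert!(can_merge(&[("steps", Commutative)], &[("steps", Commutative)]));
    // Overlap on an exclusive key: conflict.
    assert!(!can_merge(&[("active_agent", Exclusive)], &[("active_agent", Exclusive)]));
    // No overlap at all: always mergeable.
    assert!(can_merge(&[("steps", Commutative)], &[("active_agent", Exclusive)]));
}
```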

Mutation Lifecycle

1. Phase starts
2. Runtime takes a Snapshot (immutable clone)
3. Each hook reads from the Snapshot, produces a MutationBatch
4. All hooks complete (phase convergence)
5. Runtime checks MutationBatch key overlap:
   - Exclusive keys overlap -> conflict (sequential fallback)
   - Commutative keys overlap -> ops concatenated
   - No overlap -> batches merged
6. Merged batch applied atomically to the live state
7. New Snapshot taken for the next phase

StateMap

StateMap is the runtime’s typed state container. It uses typedmap::TypedMap internally, keyed by zero-sized TypedKey<K> wrappers that hash and compare by TypeId. The get::<K>() and get_or_insert_default::<K>() methods retrieve the concrete K::Value type directly without downcasting.

State keys registered with persistent: true in StateKeyOptions are serialized during checkpoint and restored on thread load. Non-persistent keys exist only in memory for the duration of the run.

StateStore

StateStore wraps the StateMap and provides:

  • Snapshot creation (cheap Arc clone of the inner map)
  • Batch application with revision tracking
  • Commit hooks (CommitHook) that fire after each successful state mutation
  • StateCommand processing for programmatic state operations

Shared State

For state that must be shared across thread and agent boundaries, awaken provides a separate persistent layer built on the same ProfileStore backend used by profile data.

ProfileKey for shared state

Shared state uses the same ProfileKey trait as profile data. The KEY constant serves as the namespace, and the key: &str parameter passed to ProfileAccess methods identifies the runtime instance:

pub trait ProfileKey: 'static + Send + Sync {
    const KEY: &'static str;
    type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
}

Unlike StateKey, shared state does not participate in the MutationBatch / snapshot workflow – it is accessed asynchronously through ProfileAccess with a key: &str parameter.

StateScope

pub struct StateScope(String);

An optional convenience builder for common key string patterns. Constructors: global(), parent_thread(id), agent_type(name), thread(id), new(arbitrary). Call .as_str() to get the key string for ProfileAccess methods. Users can also pass any raw &str directly.

State Management Overview

| Layer | Scope | Access | Lifecycle | Use Case |
|---|---|---|---|---|
| Run State | Current run | Sync snapshot | Cleared at run start | Transient flags, counters |
| Thread State | Same thread | Sync snapshot | Export/restore across runs | Tool call state, active agent |
| Shared State | ProfileKey + StateScope | Async | Persistent | Cross-boundary sharing |
| Profile State | ProfileKey + key: &str | Async | Persistent | User/agent preferences |

Choose shared state when you need dynamic scoping or cross-boundary access. Choose StateKey when you need sync access or transactional merge during parallel tool execution.

See Also

Run Lifecycle and Phases

This page describes the state machines that govern run execution and tool call processing, the phase enum, termination conditions, and checkpoint triggers.

RunStatus

A run’s coarse lifecycle is captured by RunStatus:

Running --+--> Waiting --+--> Running (resume)
          |              |
          +--> Done      +--> Done

pub enum RunStatus {
    Running,  // Actively executing (default)
    Waiting,  // Paused, waiting for external decisions
    Done,     // Terminal -- cannot transition further
}

  • Running -> Waiting: a tool call suspends, the run pauses for external input.
  • Waiting -> Running: decisions arrive, the run resumes.
  • Running -> Done or Waiting -> Done: terminal transition on completion, cancellation, or error.
  • Done -> *: not allowed. Terminal state.
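These rules fit in a small predicate (a sketch; the runtime's actual guard may live elsewhere):

```rust
#[derive(Clone, Copy, PartialEq)]
enum RunStatus { Running, Waiting, Done }

/// Validate a RunStatus transition: Running <-> Waiting, either may
/// finish, and Done is terminal.
fn is_valid_transition(from: RunStatus, to: RunStatus) -> bool {
    use RunStatus::*;
    match (from, to) {
        (Running, Waiting) | (Waiting, Running) => true,
        (Running, Done) | (Waiting, Done) => true,
        _ => false,
    }
}

fn main() {
    use RunStatus::*;
    assert!(is_valid_transition(Running, Waiting));
    assert!(is_valid_transition(Waiting, Done));
    // Done is terminal: no way back.
    assert!(!is_valid_transition(Done, Running));
}
```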

ToolCallStatus

Each tool call in a run has its own lifecycle:

New --> Running --+--> Succeeded (terminal)
                  +--> Failed (terminal)
                  +--> Cancelled (terminal)
                  +--> Suspended --> Resuming --+--> Running
                                                +--> Suspended (re-suspend)
                                                +--> Succeeded/Failed/Cancelled

pub enum ToolCallStatus {
    New,        // Created, not yet executing
    Running,    // Currently executing
    Suspended,  // Waiting for external decision
    Resuming,   // Decision received, about to re-execute
    Succeeded,  // Completed successfully (terminal)
    Failed,     // Completed with error (terminal)
    Cancelled,  // Cancelled externally (terminal)
}

Key transitions:

  • Suspended can only move to Resuming or Cancelled – it cannot jump directly to Running or a success/failure state.
  • Resuming has wide transitions: it can re-enter Running, re-suspend, or reach any terminal state.
  • Terminal states (Succeeded, Failed, Cancelled) cannot transition to any non-self state.
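These rules can likewise be written as a transition predicate (a sketch mirroring the bullet points, not runtime code):

```rust
#[derive(Clone, Copy, PartialEq)]
enum ToolCallStatus { New, Running, Suspended, Resuming, Succeeded, Failed, Cancelled }

/// Transition rules from the diagram: Suspended only goes to Resuming or
/// Cancelled; Resuming is wide; terminal states never move.
fn is_valid_transition(from: ToolCallStatus, to: ToolCallStatus) -> bool {
    use ToolCallStatus::*;
    match from {
        New => to == Running,
        Running => matches!(to, Succeeded | Failed | Cancelled | Suspended),
        Suspended => matches!(to, Resuming | Cancelled),
        Resuming => matches!(to, Running | Suspended | Succeeded | Failed | Cancelled),
        // Terminal states never transition.
        Succeeded | Failed | Cancelled => false,
    }
}

fn main() {
    use ToolCallStatus::*;
    assert!(is_valid_transition(Suspended, Resuming));
    // Suspended cannot jump straight back to Running.
    assert!(!is_valid_transition(Suspended, Running));
    assert!(!is_valid_transition(Succeeded, Running));
}
```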

Phase Enum

The Phase enum defines the eight execution phases in order:

pub enum Phase {
    RunStart,
    StepStart,
    BeforeInference,
    AfterInference,
    BeforeToolExecute,
    AfterToolExecute,
    StepEnd,
    RunEnd,
}

RunStart – fires once at the beginning of a run. Plugins initialize run-scoped state.

StepStart – fires at the beginning of each inference round. Step counter increments.

BeforeInference – last chance to modify the inference request (system prompt, tools, parameters). Plugins can skip inference by setting a behavior flag.

AfterInference – fires after the LLM response arrives. Plugins can inspect the response, modify tool call lists, or request termination.

BeforeToolExecute – fires before each tool call batch. Permission checks, interception, and suspension happen here.

AfterToolExecute – fires after tool results are available. Plugins can inspect results and trigger side effects.

StepEnd – fires at the end of each inference round. Checkpoint persistence happens here. Stop conditions (max rounds, token budget, loop detection) are evaluated.

RunEnd – fires once when the run terminates, regardless of reason. Cleanup and final state persistence.

TerminationReason

When a run ends, the TerminationReason records why:

pub enum TerminationReason {
    NaturalEnd,           // LLM returned no tool calls
    BehaviorRequested,    // A plugin requested inference skip
    Stopped(StoppedReason), // A stop condition fired (code + optional detail)
    Cancelled,            // External cancellation signal
    Blocked(String),      // Permission checker blocked the run
    Suspended,            // Waiting for external tool-call resolution
    Error(String),        // Error path
}

TerminationReason::to_run_status() maps each variant to the appropriate RunStatus:

  • Suspended maps to RunStatus::Waiting (the run can resume).
  • All other variants map to RunStatus::Done.
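The mapping is a one-line match. A sketch of to_run_status() (enum payloads are elided where they do not affect the mapping):

```rust
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum RunStatus { Running, Waiting, Done }

#[allow(dead_code)]
enum TerminationReason {
    NaturalEnd,
    BehaviorRequested,
    Stopped,          // StoppedReason payload elided
    Cancelled,
    Blocked(String),
    Suspended,
    Error(String),
}

impl TerminationReason {
    /// Suspended runs can resume -> Waiting; everything else is terminal.
    fn to_run_status(&self) -> RunStatus {
        match self {
            TerminationReason::Suspended => RunStatus::Waiting,
            _ => RunStatus::Done,
        }
    }
}

fn main() {
    assert_eq!(TerminationReason::Suspended.to_run_status(), RunStatus::Waiting);
    assert_eq!(TerminationReason::NaturalEnd.to_run_status(), RunStatus::Done);
    assert_eq!(TerminationReason::Error("boom".into()).to_run_status(), RunStatus::Done);
}
```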

Stop Conditions

Declarative stop conditions are configured per agent via StopConditionSpec:

| Variant | Trigger |
|---|---|
| MaxRounds { rounds } | Step count exceeds limit |
| Timeout { seconds } | Wall-clock time exceeds limit |
| TokenBudget { max_total } | Cumulative token usage exceeds budget |
| ConsecutiveErrors { max } | Sequential tool errors exceed threshold |
| StopOnTool { tool_name } | A specific tool is called |
| ContentMatch { pattern } | LLM output matches a regex pattern |
| LoopDetection { window } | Repeated identical tool calls within a sliding window |

Stop conditions are evaluated at StepEnd. When one fires, the run terminates with TerminationReason::Stopped.
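Evaluation at StepEnd amounts to checking each configured condition against the run's accumulated metrics. A sketch with two representative variants (the field and variant names follow the table above; the metrics struct is a stand-in for whatever the runtime actually tracks):

```rust
/// Two representative stop-condition variants.
enum StopConditionSpec {
    MaxRounds { rounds: u32 },
    TokenBudget { max_total: u64 },
}

/// Metrics the runtime has accumulated by StepEnd.
struct RunMetrics {
    step_count: u32,
    total_tokens: u64,
}

/// Returns a stop code for the first condition that fires, or None.
fn check_stop(conditions: &[StopConditionSpec], m: &RunMetrics) -> Option<&'static str> {
    for c in conditions {
        match c {
            StopConditionSpec::MaxRounds { rounds } if m.step_count > *rounds => {
                return Some("max_rounds");
            }
            StopConditionSpec::TokenBudget { max_total } if m.total_tokens > *max_total => {
                return Some("token_budget");
            }
            _ => {}
        }
    }
    None
}

fn main() {
    let conditions = vec![
        StopConditionSpec::MaxRounds { rounds: 10 },
        StopConditionSpec::TokenBudget { max_total: 50_000 },
    ];
    let m = RunMetrics { step_count: 11, total_tokens: 1_200 };
    assert_eq!(check_stop(&conditions, &m), Some("max_rounds"));
    let m = RunMetrics { step_count: 3, total_tokens: 1_200 };
    assert_eq!(check_stop(&conditions, &m), None);
}
```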

Checkpoint Triggers

State is persisted at StepEnd after each inference round. The checkpoint includes:

  • Thread messages (append-only)
  • Run lifecycle state (RunStatus, step count, termination reason)
  • Persistent state keys (those registered with persistent: true)
  • Tool call states for suspended calls

Checkpoints enable resume from the last completed step after a crash or intentional suspension.

RunStatus Derived from ToolCall States

A run’s status is a projection of all its tool call states. Each tool call has an independent lifecycle; the run status is the aggregate:

fn derive_run_status(calls: &HashMap<String, ToolCallState>) -> RunStatus {
    let mut has_suspended = false;
    for state in calls.values() {
        match state.status {
            // Any Running or Resuming call → Run is still executing
            ToolCallStatus::Running | ToolCallStatus::Resuming => {
                return RunStatus::Running;
            }
            ToolCallStatus::Suspended => {
                has_suspended = true;
            }
            // Succeeded / Failed / Cancelled are terminal — keep checking
            _ => {}
        }
    }
    if has_suspended {
        RunStatus::Waiting   // No executing calls, but some await decisions
    } else {
        RunStatus::Done      // All calls in terminal state → step complete
    }
}

Decision table:

| Any Running/Resuming? | Any Suspended? | Run Status | Meaning |
|---|---|---|---|
| Yes | (any) | Running | Tools are actively executing |
| No | Yes | Waiting | All execution done, awaiting external decisions |
| No | No | Done | All calls terminal → proceed to next step |

Parallel tool call state timeline

When an LLM returns multiple tool calls (e.g. [tool_A, tool_B, tool_C]), their states evolve independently:

Time  tool_A (approval)  tool_B (approval)  tool_C (normal)  Run Status  Note
──────────────────────────────────────────────────────────────────────────────
t0    New                New                New              Running     Step starts
t1    Suspended          New                Running          Running     tool_A intercepted
t2    Suspended          Suspended          Running          Running     tool_B intercepted, tool_C executing
t3    Suspended          Suspended          Succeeded        Waiting     tool_C done, no Running calls
t4    Resuming           Suspended          Succeeded        Running     tool_A decision arrives
t5    Succeeded          Suspended          Succeeded        Waiting     tool_A replay done
t6    Succeeded          Resuming           Succeeded        Running     tool_B decision arrives
t7    Succeeded          Succeeded          Succeeded        Done        All terminal → next step

At every transition the run status is re-derived from the aggregate of all call states. This means a single decision arriving does not end the wait — the run stays in Waiting until all suspended calls are resolved.

Suspension Bridges Run and Tool-Call Layers

Current execution model (serial phases)

Tool execution is split into two serial phases inside execute_tools_with_interception:

Phase 1 — Intercept (serial, per-call):
  for each call:
    BeforeToolExecute hooks → check for intercept actions
    Suspend?  → mark Suspended, set suspended=true, continue
    Block?    → mark Failed, return immediately
    SetResult → mark with provided result, continue
    None      → add to allowed_calls

Phase 2 — Execute (allowed_calls only):
  Sequential mode: one by one, break on first suspension
  Parallel mode:   batch execute, collect all results

After both phases, if suspended == true, the step returns StepOutcome::Suspended. The orchestrator then:

  1. Persists checkpoint (messages, tool call states)
  2. Emits RunFinish(Suspended) to protocol encoders
  3. Enters wait_for_resume_or_cancel loop

wait_for_resume_or_cancel loop

loop {
    let decisions = decision_rx.next().await;  // block until decisions arrive
    emit_decision_events_and_messages(decisions);
    prepare_resume(decisions);       // Suspended → Resuming
    detect_and_replay_resume();      // re-execute Resuming calls
    if !has_suspended_calls() {
        return WaitOutcome::Resumed; // all resolved → exit wait
    }
    // Some calls still Suspended → continue waiting
}

Key properties:

  • The loop handles partial resume: if only tool_A’s decision arrives but tool_B is still suspended, tool_A is replayed immediately and the loop continues waiting for tool_B.
  • Decisions can arrive in batches or one at a time.
  • On WaitOutcome::Resumed, the orchestrator re-enters the step loop for the next LLM inference round.

Resume replay

detect_and_replay_resume scans ToolCallStates for calls with status == Resuming and re-executes them through the standard tool pipeline. The arguments field already reflects the resume mode (set by prepare_resume):

| Resume Mode | Arguments on Replay | Behavior |
|---|---|---|
| ReplayToolCall | Original arguments | Full re-execution |
| UseDecisionAsToolResult | Decision result | FrontendToolPlugin intercepts in BeforeToolExecute, returns SetResult |
| PassDecisionToTool | Decision result | Tool receives decision as arguments |

Already-completed calls (Succeeded, Failed, Cancelled) are skipped.

Limitation: decisions during execution

In the current serial model, decisions that arrive while Phase 2 tools are still executing sit in the channel buffer. They are only consumed when the step finishes and the orchestrator enters wait_for_resume_or_cancel.

This means:

  • tool_A’s approval arrives at t2 (while tool_C is executing)
  • tool_A is not replayed until t3 (after tool_C finishes)
  • The delay equals the remaining execution time of Phase 2 tools

Concurrent Execution Model (future)

The ideal model executes suspended-tool waits and allowed-tool execution in parallel, so a decision for tool_A can trigger immediate replay even while tool_C is still running.

Architecture

Phase 1 — Intercept (same as current)

Phase 2 — Concurrent execution:
  ├─ task: execute(tool_C)
  ├─ task: execute(tool_D)
  ├─ task: wait_decision(tool_A) ──> replay(tool_A)
  ├─ task: wait_decision(tool_B) ──> replay(tool_B)
  └─ barrier: all tasks reach terminal state

Per-call decision routing

The shared decision_rx channel carries batches of decisions for multiple tool calls. A dispatcher task demuxes decisions to per-call notification channels:

struct ToolCallWaiter {
    waiters: HashMap<String, oneshot::Sender<ToolCallResume>>,
}

impl ToolCallWaiter {
    async fn dispatch_loop(&mut self, decision_rx: &mut UnboundedReceiver<DecisionBatch>) {
        while let Some(batch) = decision_rx.recv().await {
            for (call_id, resume) in batch {
                // Route the decision to the task waiting on this call ID.
                if let Some(tx) = self.waiters.remove(&call_id) {
                    let _ = tx.send(resume);
                }
            }
            // All suspended calls have been dispatched -- stop the loop.
            if self.waiters.is_empty() { break; }
        }
    }
}

Each suspended tool call gets a oneshot::Receiver. When its decision arrives, the receiver wakes the task, which runs the replay immediately — concurrently with any still-executing allowed tools.

State transition timing

With the concurrent model, state transitions happen as events occur rather than in batches:

t0: tool_C starts executing          → RunStatus: Running
t1: tool_A decision arrives, replay  → RunStatus: Running (tool_A Resuming)
t2: tool_A replay completes          → RunStatus: Running (tool_C still Running)
t3: tool_C completes                 → RunStatus: Waiting (tool_B still Suspended)
t4: tool_B decision arrives, replay  → RunStatus: Running (tool_B Resuming)
t5: tool_B replay completes          → RunStatus: Done (all terminal)

No artificial delay — each tool call progresses as fast as its external dependency allows.

Protocol Adapter: SSE Reconnection

A backend run may span multiple frontend SSE connections. This is especially relevant for the AI SDK v6 protocol, where each HTTP request corresponds to one “turn” and produces one SSE stream.

Problem

Turn 1 (user message):
  HTTP POST → SSE stream 1 → events flow → tool suspends
  → RunFinish(Suspended) → finish event → SSE stream 1 closes

  The run is still alive in wait_for_resume_or_cancel.
  But the event_tx channel from SSE stream 1 has been dropped.

Turn 2 (tool output / approval):
  HTTP POST → SSE stream 2 → decision delivered to orchestrator
  → orchestrator resumes → emits events
  → events go to... the dropped event_tx? Lost.

Solution: ReconnectableEventSink

Replace the fixed ChannelEventSink with a reconnectable wrapper that allows swapping the underlying channel sender:

struct ReconnectableEventSink {
    inner: Arc<tokio::sync::Mutex<mpsc::UnboundedSender<AgentEvent>>>,
}

impl ReconnectableEventSink {
    fn new(tx: mpsc::UnboundedSender<AgentEvent>) -> Self {
        Self { inner: Arc::new(tokio::sync::Mutex::new(tx)) }
    }

    /// Replace the underlying channel. Called when a new SSE connection
    /// arrives for an existing suspended run.
    async fn reconnect(&self, new_tx: mpsc::UnboundedSender<AgentEvent>) {
        *self.inner.lock().await = new_tx;
    }
}

#[async_trait]
impl EventSink for ReconnectableEventSink {
    async fn emit(&self, event: AgentEvent) {
        let _ = self.inner.lock().await.send(event);
    }
}

Reconnection flow

Turn 1:
  submit() → create (event_tx1, event_rx1)
           → ReconnectableEventSink(event_tx1)
           → spawn_execution (run starts)
  events → event_tx1 → event_rx1 → SSE stream 1
  tool suspends → finish(tool-calls) → SSE stream 1 closes
  event_tx1 still held by ReconnectableEventSink (sends fail silently)
  run alive in wait_for_resume_or_cancel

Turn 2:
  new HTTP request with decisions arrives
  create (event_tx2, event_rx2)
  sink.reconnect(event_tx2)          ← swap channel
  send_decision → decision channel → orchestrator resumes
  events → ReconnectableEventSink → event_tx2 → event_rx2 → SSE stream 2
  run completes → RunFinish(NaturalEnd) → SSE stream 2 closes

No events are lost between SSE connections because:

  • During suspend, the orchestrator is blocked in wait_for_resume_or_cancel and emits no events.
  • reconnect() completes before send_decision(), so the first resume event (RunStart) goes to the new channel.

Protocol-specific behavior

| Protocol | Suspend Signal | Resume Mechanism |
|---|---|---|
| AI SDK v6 | finish(finishReason: "tool-calls") | New HTTP POST → reconnect → send_decision |
| AG-UI | RUN_FINISHED(outcome: "interrupt") | New HTTP POST → reconnect → send_decision |
| CopilotKit | renderAndWaitForResponse UI | Same SSE or new request via AG-UI |

See Also

HITL and Mailbox

This page explains how Awaken handles human-in-the-loop (HITL) interactions through tool call suspension and the mailbox queue that manages run requests.

SuspendTicket

When a tool call needs external approval or input, it produces a SuspendTicket:

pub struct SuspendTicket {
    pub suspension: Suspension,
    pub pending: PendingToolCall,
    pub resume_mode: ToolCallResumeMode,
}

suspension – the external-facing payload describing what input is needed:

pub struct Suspension {
    pub id: String,             // Unique suspension ID
    pub action: String,         // Action identifier (e.g., "confirm", "approve")
    pub message: String,        // Human-readable prompt
    pub parameters: Value,      // Action-specific parameters
    pub response_schema: Option<Value>,  // JSON Schema for expected response
}

pending – the tool call projection emitted to the event stream so the frontend knows which call is waiting:

pub struct PendingToolCall {
    pub id: String,        // Tool call ID
    pub name: String,      // Tool name
    pub arguments: Value,  // Original arguments
}

resume_mode – how the agent loop should handle the decision when it arrives.

ToolCallResumeMode

pub enum ToolCallResumeMode {
    ReplayToolCall,          // Re-execute the original tool call
    UseDecisionAsToolResult, // Use the decision payload as the tool result directly
    PassDecisionToTool,      // Pass the decision payload into the tool as new arguments
}

ReplayToolCall is the default. The original tool call is re-executed after the decision arrives. Use this when the decision unlocks execution (e.g., permission granted, now run the tool).

UseDecisionAsToolResult skips re-execution entirely. The external decision payload becomes the tool result. Use this when a human provides the answer directly (e.g., “the correct value is X”).

PassDecisionToTool re-executes the tool but injects the decision payload into the arguments. Use this when the decision modifies how the tool should run (e.g., “use this alternative path instead”).
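
The three modes can be sketched as a dispatch in the agent loop. This is an illustrative stand-in, not the runtime's actual code: payloads are simplified to strings (the real types carry serde_json::Value), and `resume_plan` is a hypothetical helper name.

```rust
// Simplified stand-in for illustration; real payloads are serde_json::Value.
#[derive(Clone, Copy)]
pub enum ToolCallResumeMode {
    ReplayToolCall,
    UseDecisionAsToolResult,
    PassDecisionToTool,
}

/// Decide what to do with a pending call once a decision arrives.
/// Returns (arguments-or-result payload, whether the tool is re-executed).
pub fn resume_plan(mode: ToolCallResumeMode, original_args: &str, decision: &str) -> (String, bool) {
    match mode {
        // Re-run the call with its original arguments (default; e.g. permission granted).
        ToolCallResumeMode::ReplayToolCall => (original_args.to_string(), true),
        // The decision payload *is* the tool result; skip execution.
        ToolCallResumeMode::UseDecisionAsToolResult => (decision.to_string(), false),
        // Re-run the call, but with the decision payload as the new arguments.
        ToolCallResumeMode::PassDecisionToTool => (decision.to_string(), true),
    }
}

fn main() {
    let (payload, rerun) = resume_plan(ToolCallResumeMode::ReplayToolCall, "{\"path\":\"/tmp\"}", "approved");
    println!("payload = {payload}, re-execute = {rerun}");
}
```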

ResumeDecisionAction

pub enum ResumeDecisionAction {
    Resume,  // Proceed with the tool call
    Cancel,  // Cancel the tool call
}

Each decision carries an action. Resume continues execution according to the ToolCallResumeMode. Cancel transitions the tool call to ToolCallStatus::Cancelled.

ToolCallResume

The full resume payload:

pub struct ToolCallResume {
    pub decision_id: String,         // Idempotency key
    pub action: ResumeDecisionAction,
    pub result: Value,               // Decision payload
    pub reason: Option<String>,      // Human-readable reason
    pub updated_at: u64,             // Unix millis timestamp
}

Permission Plugin and Ask-Mode

The awaken-ext-permission plugin uses suspension to implement ask-mode approvals:

  1. A tool call matches a permission rule with behavior: ask.
  2. The permission checker creates a SuspendTicket with ToolCallResumeMode::ReplayToolCall.
  3. The suspension payload describes the tool and its arguments.
  4. The tool call transitions to ToolCallStatus::Suspended.
  5. The run transitions to RunStatus::Waiting.
  6. A frontend presents the approval prompt to the user.
  7. The user submits a ToolCallResume with ResumeDecisionAction::Resume or Cancel.
  8. On Resume, the tool call replays and executes normally.
  9. On Cancel, the tool call is cancelled and the run may continue with remaining calls.

Mailbox Architecture

The mailbox provides a persistent queue for all run requests. Every run – streaming, background, A2A, internal – enters the system as a MailboxJob.

MailboxJob

pub struct MailboxJob {
    // Identity
    pub job_id: String,          // UUID v7
    pub mailbox_id: String,      // Thread ID (routing anchor)

    // Request payload
    pub agent_id: String,
    pub messages: Vec<Message>,
    pub origin: MailboxJobOrigin,
    pub sender_id: Option<String>,
    pub parent_run_id: Option<String>,
    pub request_extras: Option<Value>,

    // Queue semantics
    pub priority: u8,            // 0 = highest, 255 = lowest, default 128
    pub dedupe_key: Option<String>,
    pub generation: u64,

    // Lifecycle
    pub status: MailboxJobStatus,
    pub available_at: u64,
    pub attempt_count: u32,
    pub max_attempts: u32,
    pub last_error: Option<String>,

    // Lease
    pub claim_token: Option<String>,
    pub claimed_by: Option<String>,
    pub lease_until: Option<u64>,

    // Timestamps
    pub created_at: u64,
    pub updated_at: u64,
}

MailboxJobStatus

Queued --claim--> Claimed --ack--> Accepted (terminal)
  |                  |
  |               nack(retry) --> Queued (attempt_count++, available_at = retry_at)
  |                  |
  |               nack(permanent) --> DeadLetter (terminal)
  |
  |-- cancel --> Cancelled (terminal)
  +-- interrupt(generation bump) --> Superseded (terminal)

pub enum MailboxJobStatus {
    Queued,      // Waiting to be claimed
    Claimed,     // Claimed by a consumer, executing
    Accepted,    // Successfully processed (terminal)
    Cancelled,   // Cancelled by caller (terminal)
    Superseded,  // Replaced by a newer generation (terminal)
    DeadLetter,  // Permanently failed (terminal)
}
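
The lifecycle diagram above can be expressed as a transition table. This is a sketch with a hypothetical `transition` helper; the real store enforces these transitions atomically with claim-token checks.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MailboxJobStatus { Queued, Claimed, Accepted, Cancelled, Superseded, DeadLetter }

#[derive(Debug, Clone, Copy)]
pub enum MailboxOp { Claim, Ack, NackRetry, NackPermanent, Cancel, Interrupt }

/// Valid transitions from the lifecycle diagram; anything else is rejected.
pub fn transition(from: MailboxJobStatus, op: MailboxOp) -> Option<MailboxJobStatus> {
    use MailboxJobStatus::*;
    use MailboxOp::*;
    match (from, op) {
        (Queued, Claim) => Some(Claimed),
        (Claimed, Ack) => Some(Accepted),
        (Claimed, NackRetry) => Some(Queued),        // attempt_count++, available_at = retry_at
        (Claimed, NackPermanent) => Some(DeadLetter),
        (Queued, Cancel) => Some(Cancelled),
        (Queued, Interrupt) => Some(Superseded),     // generation bump supersedes queued work
        _ => None,                                   // terminal states accept no further ops
    }
}

fn main() {
    println!("{:?}", transition(MailboxJobStatus::Queued, MailboxOp::Claim));
}
```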

MailboxJobOrigin

pub enum MailboxJobOrigin {
    User,      // HTTP API, SDK
    A2A,       // Agent-to-Agent protocol
    Internal,  // Child run notification, handoff
}

MailboxStore Trait

MailboxStore defines the persistent queue interface:

  • enqueue – persist a job, auto-assign generation, reject duplicate dedupe_key
  • claim – atomically claim up to N Queued jobs for a mailbox (lease-based)
  • claim_job – claim a specific job by ID (for inline streaming)
  • ack – mark a job as Accepted (validates claim token)
  • nack – return a job to Queued for retry, or DeadLetter if max attempts exceeded
  • cancel – cancel a Queued job
  • extend_lease – heartbeat to extend an active claim
  • interrupt – atomically bump generation, supersede stale Queued jobs, return the active Claimed job for cancellation

Implementations must guarantee: durable enqueue, atomic claim (exactly one winner), claim token validation on ack/nack, and atomic interrupt with generation bump.

MailboxInterrupt

When a new high-priority request arrives for a thread that already has queued or running work:

pub struct MailboxInterrupt {
    pub new_generation: u64,       // Generation after bump
    pub active_job: Option<MailboxJob>,  // Currently running job to cancel
    pub superseded_count: usize,   // Queued jobs superseded
}

The caller cancels the active_job’s runtime run if present, ensuring the new request takes priority.
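
A toy, in-memory version of the interrupt semantics illustrates the three outputs. The `Job`/`Status` types here are simplified stand-ins for `MailboxJob`, and a real implementation would perform this atomically in the store.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Status { Queued, Claimed, Superseded }

#[derive(Debug, Clone)]
struct Job { job_id: String, status: Status }

struct Interrupt { new_generation: u64, active_job: Option<String>, superseded_count: usize }

/// Toy interrupt for a single mailbox: bump the generation, supersede stale
/// Queued jobs, and hand back the Claimed job so the caller can cancel its run.
fn interrupt(jobs: &mut Vec<Job>, current_generation: u64) -> Interrupt {
    let new_generation = current_generation + 1;
    let mut superseded_count = 0;
    let mut active_job = None;
    for job in jobs.iter_mut() {
        match job.status {
            Status::Queued => { job.status = Status::Superseded; superseded_count += 1; }
            Status::Claimed => active_job = Some(job.job_id.clone()),
            _ => {}
        }
    }
    Interrupt { new_generation, active_job, superseded_count }
}

fn main() {
    let mut jobs = vec![Job { job_id: "j1".into(), status: Status::Claimed }];
    let out = interrupt(&mut jobs, 7);
    println!("gen {} active {:?} superseded {}", out.new_generation, out.active_job, out.superseded_count);
}
```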

See Also

Multi-Agent Patterns

Awaken supports multiple patterns for composing agents. This page describes delegation, remote agents, sub-agent execution, and handoff.

Agent Delegation via AgentSpec.delegates

An agent can declare sub-agents it is allowed to delegate to:

{
  "id": "orchestrator",
  "model": "gpt-4o",
  "system_prompt": "You coordinate tasks across specialized agents.",
  "delegates": ["researcher", "writer", "reviewer"]
}

Each ID in delegates must be a registered agent in the AgentSpecRegistry. During resolution, the runtime creates an AgentTool for each delegate. From the LLM’s perspective, each sub-agent appears as a regular tool named agent_run_{delegate_id}.

When the LLM calls a delegate tool, the AgentTool dispatches to the appropriate backend:

  • Local agents (no endpoint field) use LocalBackend, which resolves and executes the sub-agent inline within the same runtime.
  • Remote agents (with endpoint field) use A2aBackend, which sends an A2A message:send request and polls the resulting task for completion.

Remote Agents via A2A

Remote agents are declared with an endpoint in AgentSpec:

{
  "id": "remote-analyst",
  "model": "unused-for-remote",
  "system_prompt": "",
  "endpoint": {
    "backend": "a2a",
    "base_url": "https://analyst.example.com/v1/a2a",
    "auth": { "type": "bearer", "token": "token-abc" },
    "target": "analyst",
    "timeout_ms": 300000,
    "options": {
      "poll_interval_ms": 1000
    }
  }
}

The A2aBackend handles the A2A protocol lifecycle:

  1. Sends a message:send request with the user message.
  2. Receives a task wrapper, extracts task.id, and polls /tasks/:task_id at the configured interval.
  3. Returns the completed response as a DelegateRunResult.
  4. The result is formatted as a ToolResult and returned to the parent agent’s LLM context.

If the remote agent times out or fails, the DelegateRunStatus reflects the failure and the parent agent receives an error tool result.

Sub-Agent Patterns

Sequential Delegation

The orchestrator calls sub-agents one at a time, using each result to decide the next step:

Orchestrator -> researcher (tool call) -> result
             -> writer (tool call, using researcher output) -> result
             -> reviewer (tool call, using writer output) -> result

Each delegation is a tool call within the orchestrator’s step loop. The orchestrator sees tool results and decides whether to delegate further or respond directly.

Parallel Delegation

When the LLM emits multiple tool calls in a single inference response, sub-agent delegations can execute concurrently. The runtime processes tool calls in parallel by default – if the LLM calls both agent_run_researcher and agent_run_writer in the same response, both execute simultaneously.

Nested Delegation

Sub-agents can themselves have delegates, creating hierarchies:

orchestrator
  -> team_lead (delegates: [dev_a, dev_b])
       -> dev_a
       -> dev_b

Each level resolves independently through the AgentResolver. There is no hard depth limit, but each level adds latency and token cost.

Agent Handoff

Handoff transfers control from one agent to another mid-run without stopping the loop. The mechanism:

  1. A plugin (or the handoff extension) writes a new agent ID to the ActiveAgentKey state key.
  2. At the next step boundary, the loop runner detects the changed key.
  3. The loop re-resolves the agent from the AgentResolver – new config, new model, new tools, new system prompt.
  4. Execution continues in the same run with the new agent’s configuration.

Handoff is a re-resolve, not a loop restart. Thread history is preserved. The new agent sees all prior messages and can continue the conversation seamlessly.
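
The step-boundary check can be sketched as follows. `Agent`, `maybe_handoff`, and the HashMap-based resolver are simplified stand-ins for `ResolvedAgent` and the `AgentResolver` trait.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Agent { id: String, system_prompt: String }

/// At each step boundary, compare the ActiveAgentKey value in state against
/// the agent currently running; on change, re-resolve (new config, new model,
/// new tools, new prompt) and continue in the same run.
fn maybe_handoff<'a>(
    current: &'a Agent,
    active_agent_key: &str,
    resolver: &'a HashMap<String, Agent>,
) -> &'a Agent {
    if current.id != active_agent_key {
        // Re-resolve; fall back to the current agent if the ID is unknown.
        resolver.get(active_agent_key).unwrap_or(current)
    } else {
        current
    }
}

fn main() {
    let mut resolver = HashMap::new();
    resolver.insert("triage".to_string(), Agent { id: "triage".into(), system_prompt: "Route issues.".into() });
    resolver.insert("billing".to_string(), Agent { id: "billing".into(), system_prompt: "Handle invoices.".into() });
    let current = resolver.get("triage").unwrap().clone();
    println!("next agent: {}", maybe_handoff(&current, "billing", &resolver).id);
}
```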

Handoff vs Delegation

| Aspect            | Delegation                                       | Handoff                                 |
|-------------------|--------------------------------------------------|-----------------------------------------|
| Control flow      | Parent calls sub-agent as tool, gets result back | Control transfers entirely to new agent |
| Thread continuity | Sub-agent may use a separate thread context      | Same thread, same message history       |
| Return path       | Result flows back to parent LLM                  | No return – new agent owns the run      |
| Use case          | Task decomposition, specialized subtasks         | Role switching, escalation, routing     |

AgentBackend Trait

Both local and remote delegation use the AgentBackend trait:

pub trait AgentBackend: Send + Sync {
    async fn execute(
        &self,
        agent_id: &str,
        messages: Vec<Message>,
        event_sink: Arc<dyn EventSink>,
        parent_run_id: Option<String>,
        parent_tool_call_id: Option<String>,
    ) -> Result<DelegateRunResult, AgentBackendError>;
}

DelegateRunResult carries the agent ID, terminal status, optional response text, and step count. DelegateRunStatus variants: Completed, Failed(String), Cancelled, Timeout.

This trait is the extension point for custom delegation backends beyond local and A2A.

See Also

Tool and Plugin Boundary

Awaken separates user-facing capabilities (tools) from system-level lifecycle logic (plugins). This page explains the design boundary, when to use each, and how they interact.

Tools

A tool is a capability exposed to the LLM. The LLM decides when to call it based on the tool’s descriptor.

pub trait Tool: Send + Sync {
    fn descriptor(&self) -> ToolDescriptor;
    fn validate_args(&self, args: &Value) -> Result<(), ToolError> { Ok(()) }
    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError>;
}

Key properties:

  • Registered by ID – each tool has a unique string identifier from ToolDescriptor.
  • LLM-visible – the descriptor (name, description, JSON Schema for parameters) is included in inference requests.
  • Executed in tool rounds – tools run during the BeforeToolExecute / AfterToolExecute phase window.
  • Isolated – a tool receives only its arguments and a ToolCallContext. It cannot directly access other tools, the plugin system, or the phase runtime.
  • User-defined – application code creates tools for domain-specific actions (file operations, API calls, database queries).

Plugins

A plugin is a system-level extension that hooks into the execution lifecycle. Plugins do not appear in the LLM’s tool list.

pub trait Plugin: Send + Sync + 'static {
    fn descriptor(&self) -> PluginDescriptor;
    fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError>;
    fn config_schemas(&self) -> Vec<ConfigSchema> { vec![] }
}

Key properties:

  • Registered by ID – each plugin has a unique identifier from PluginDescriptor.
  • LLM-invisible – the LLM does not know plugins exist.
  • Run in phases – plugin hooks fire at phase boundaries (RunStart, BeforeInference, AfterToolExecute, etc.).
  • System-wide access – hooks receive PhaseContext with access to the state snapshot, event sink, and run metadata.
  • Declarative registration – plugins declare their capabilities through PluginRegistrar.

PluginRegistrar

When Plugin::register is called, the plugin uses PluginRegistrar to declare everything it needs:

| Registration Method           | Purpose                                                |
|-------------------------------|--------------------------------------------------------|
| register_key::<K>()           | Declare a StateKey for the plugin’s state              |
| register_phase_hook()         | Add a hook that runs at a specific phase               |
| register_tool()               | Inject a tool into the runtime (plugin-provided tools) |
| register_effect_handler()     | Handle named effects emitted by hooks                  |
| register_scheduled_action()   | Handle named actions scheduled by hooks                |
| register_request_transform()  | Transform inference requests before they reach the LLM |

Plugins can register tools. This is how delegation tools (AgentTool) and MCP tools enter the runtime – they are tools owned by plugins, not directly registered by user code. The LLM sees these tools identically to user-registered tools.

When to Use a Tool

Use a tool when:

  • The LLM should be able to invoke the capability by name.
  • The capability performs a domain-specific action (search, compute, file I/O, API call).
  • The capability takes structured input and returns a result the LLM can reason about.
  • The logic does not need access to runtime internals (state, phases, other tools).

When to Use a Plugin

Use a plugin when:

  • The logic should run automatically at phase boundaries without LLM involvement.
  • The logic needs to modify inference requests (system prompt injection, tool filtering).
  • The logic needs to inspect or transform tool results before the LLM sees them.
  • The logic manages cross-cutting concerns (permissions, observability, reminders, state synchronization).
  • The logic needs to register state keys, effect handlers, or request transforms.

Interaction Between Tools and Plugins

Plugins can influence tool execution without the tool knowing:

  • Permission plugin – intercepts tool calls at BeforeToolExecute, blocks or suspends them based on policy rules. The tool itself has no permission logic.
  • Observability plugin – wraps tool execution with OpenTelemetry spans. The tool does not emit traces.
  • Reminder plugin – injects context messages after specific tools execute. The tool does not know about reminders.
  • Interception pipeline – plugins can modify tool arguments, replace tool results, or skip execution entirely through the tool interception mechanism.

This separation means tools remain simple and focused on their domain logic. Cross-cutting concerns are handled uniformly by plugins, applied consistently across all tools without per-tool opt-in.

Plugin-Provided Tools vs User Tools

| Aspect        | User Tool                          | Plugin Tool                                    |
|---------------|------------------------------------|------------------------------------------------|
| Registration  | AgentRuntimeBuilder::with_tool()   | PluginRegistrar::register_tool()               |
| Lifecycle     | Exists for the runtime’s lifetime  | Exists while the plugin is active              |
| Configuration | Direct construction                | Derived from plugin config or agent spec       |
| Examples      | Custom business logic tools        | AgentTool (delegation), MCP tools, skill tools |

Both appear identically to the LLM. The distinction is purely about ownership and lifecycle management.

See Also

Plugin System Internals

This page covers the internal mechanics of the plugin system: how plugins are registered and activated, how hooks execute and resolve conflicts, how the phase convergence loop works, and how request transforms, effects, inference overrides, and tool intercepts behave at runtime.

For the high-level boundary between tools and plugins, see Tool and Plugin Boundary. For the phase lifecycle, see Run Lifecycle and Phases.

Plugin Registration vs Activation

When a plugin is loaded, its register() method is called with a PluginRegistrar. The plugin declares two categories of components:

Structural components are always available regardless of activation state:

  • State keys (register_key::<K>())
  • Scheduled action handlers (register_scheduled_action::<A, H>())
  • Effect handlers (register_effect::<E, H>())

Behavioral components are only active when the plugin passes the activation filter:

  • Phase hooks (register_phase_hook())
  • Tools (register_tool())
  • Request transforms (register_request_transform())

Activation is controlled by AgentSpec.active_hook_filter:

| active_hook_filter value | Behavior                                                           |
|--------------------------|--------------------------------------------------------------------|
| Empty (default)          | All plugins’ behavioral components are active                      |
| Non-empty set            | Only plugins whose ID is in the set contribute behavioral components |

This separation enables infrastructure plugins (state management, action handlers, effect handlers) to exist without impacting execution flow. A logging plugin that only registers an effect handler, for example, never needs to appear in active_hook_filter – its handler fires whenever any plugin emits the corresponding effect.

The filtering happens in PhaseRuntime::filter_hooks():

fn filter_hooks<'a>(env: &'a ExecutionEnv, ctx: &PhaseContext) -> Vec<&'a TaggedPhaseHook> {
    let hooks = env.hooks_for_phase(ctx.phase);
    let active_hook_filter = &ctx.agent_spec.active_hook_filter;
    hooks
        .iter()
        .filter(|tagged| {
            active_hook_filter.is_empty()
                || active_hook_filter.contains(&tagged.plugin_id)
        })
        .collect()
}

Hook Ordering and Conflict Resolution

Multiple plugins can register hooks for the same Phase. The engine runs them through a two-stage process implemented in gather_and_commit_hooks() (crates/awaken-runtime/src/phase/engine.rs).

Fast path: parallel execution, single commit

All hooks for a phase run in parallel against a frozen snapshot. Each hook receives the same snapshot and produces a StateCommand. If no hook writes to a key with MergeStrategy::Exclusive that another hook also writes to, all commands are merged and committed in a single batch.

flowchart LR
    S[Frozen Snapshot] --> H1[Hook A]
    S --> H2[Hook B]
    S --> H3[Hook C]
    H1 --> M[Merge All]
    H2 --> M
    H3 --> M
    M --> C[Single Commit]

Conflict fallback: partition and serial retry

If two or more hooks write to the same Exclusive key, the engine detects the conflict and falls back:

  1. Partition – Walk commands in registration order. Greedily add each command to a “compatible batch” if its Exclusive keys do not overlap with keys already in the batch. Otherwise, defer the hook.
  2. Commit the batch – The compatible batch is merged and committed.
  3. Serial re-execution – Deferred hooks are re-run one at a time, each against a fresh snapshot that includes the results of all prior commits.

flowchart TD
    PAR[Run all hooks in parallel] --> CHECK{Exclusive key overlap?}
    CHECK -- No --> FAST[Merge all, commit once]
    CHECK -- Yes --> PART[Partition: batch + deferred]
    PART --> CBATCH[Commit compatible batch]
    CBATCH --> SERIAL[Re-run deferred hooks serially]
    SERIAL --> DONE[All hooks committed]

Because hooks are pure functions (frozen snapshot in, StateCommand out, no side effects), re-execution on conflict is always safe. The deferred hooks see the updated state from the batch commit, so they produce correct results even when the original parallel execution would have conflicted.
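
The partition step can be sketched as a greedy pass over commands in registration order. `Command` here is reduced to just the set of Exclusive keys each hook writes; the real StateCommand also carries mutations, actions, and effects.

```rust
use std::collections::HashSet;

/// A hook's command, reduced to the set of Exclusive keys it writes.
struct Command { hook: &'static str, exclusive_keys: HashSet<&'static str> }

/// Greedy partition in registration order: a command joins the compatible
/// batch only if its Exclusive keys don't overlap keys already batched;
/// otherwise its hook is deferred for serial re-execution.
fn partition(commands: &[Command]) -> (Vec<&'static str>, Vec<&'static str>) {
    let mut batched_keys: HashSet<&'static str> = HashSet::new();
    let (mut batch, mut deferred) = (Vec::new(), Vec::new());
    for cmd in commands {
        if cmd.exclusive_keys.is_disjoint(&batched_keys) {
            batched_keys.extend(&cmd.exclusive_keys);
            batch.push(cmd.hook);
        } else {
            deferred.push(cmd.hook);
        }
    }
    (batch, deferred)
}

fn main() {
    let cmds = vec![
        Command { hook: "permission", exclusive_keys: HashSet::from(["pending_tool"]) },
        Command { hook: "observability", exclusive_keys: HashSet::new() },
        Command { hook: "reminder", exclusive_keys: HashSet::from(["pending_tool"]) },
    ];
    let (batch, deferred) = partition(&cmds);
    println!("batch = {batch:?}, deferred = {deferred:?}");
}
```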

See State and Snapshot Model for details on MergeStrategy and snapshot isolation.

Phase Convergence Loop

Each phase runs a GATHER then EXECUTE loop that converges when no new work remains.

GATHER stage

Run all active hooks in parallel (with conflict resolution as described above). Hooks produce StateCommand values that may contain:

  • State mutations (key updates)
  • Scheduled actions (to be processed in the EXECUTE stage)
  • Effects (dispatched immediately after commit)

EXECUTE stage

Process pending scheduled actions whose Phase matches the current phase and whose action key has a registered handler. Each action handler runs against a fresh snapshot and produces its own StateCommand, which may schedule new actions for the same phase.

If new matching actions appear after processing, the loop repeats:

flowchart TD
    START[Phase Entry] --> GATHER[GATHER: run hooks, commit]
    GATHER --> EXEC[EXECUTE: process matching actions]
    EXEC --> CHECK{New actions for this phase?}
    CHECK -- Yes --> EXEC
    CHECK -- No --> DONE[Phase complete]

The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If the action count does not converge within this limit, the engine returns StateError::PhaseRunLoopExceeded. This prevents infinite reactive chains while allowing legitimate multi-step action cascades.

This convergence design enables reactive patterns: a permission check action can schedule a suspend action in the same BeforeToolExecute phase, and both are processed before the phase completes.

Request Transform Hooks

Plugins can register InferenceRequestTransform implementations via registrar.register_request_transform(). Transforms modify the InferenceRequest before it reaches the LLM executor.

Use cases:

  • System prompt injection – append context, instructions, or reminders to the system message
  • Tool list modification – filter, reorder, or augment the tool descriptors sent to the LLM
  • Parameter overrides – adjust temperature, max tokens, or other inference parameters

Transforms run in registration order and are composable: each transform receives the request as modified by the previous transform.

Only active plugins’ transforms are applied. If active_hook_filter is non-empty, transforms from plugins not in the filter are skipped.
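
The composition rule can be sketched as a fold over boxed closures. `InferenceRequest` here is a two-field stand-in for the real request type, and `apply_transforms` is a hypothetical helper.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct InferenceRequest {
    pub system_prompt: String,
    pub tools: Vec<String>,
}

pub type Transform = Box<dyn Fn(InferenceRequest) -> InferenceRequest>;

/// Transforms run in registration order; each receives the request as
/// modified by the previous one.
pub fn apply_transforms(mut req: InferenceRequest, transforms: &[Transform]) -> InferenceRequest {
    for t in transforms {
        req = t(req);
    }
    req
}

fn main() {
    let transforms: Vec<Transform> = vec![
        // System prompt injection.
        Box::new(|mut r: InferenceRequest| { r.system_prompt.push_str(" Always cite sources."); r }),
        // Tool list filtering.
        Box::new(|mut r: InferenceRequest| { r.tools.retain(|t| t != "shell_exec"); r }),
    ];
    let req = InferenceRequest {
        system_prompt: "You are helpful.".into(),
        tools: vec!["search".into(), "shell_exec".into()],
    };
    println!("{:?}", apply_transforms(req, &transforms));
}
```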

Effect Handlers

Effects are typed events defined via the EffectSpec trait:

pub trait EffectSpec: 'static + Send + Sync {
    const KEY: &'static str;
    type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}

Hooks and action handlers emit effects by calling command.emit::<E>(payload) on their StateCommand. Unlike scheduled actions (which execute within a specific phase’s convergence loop), effects are dispatched after the command is committed to the store.

Plugins register effect handlers via registrar.register_effect::<E, H>(handler). When an effect is dispatched, the engine calls the handler with the effect payload and the current snapshot.

Key properties:

  • Fire-and-forget – handler failures are logged but do not block execution or roll back the commit.
  • Post-commit – effects see the state after the command that emitted them has been applied.
  • Validated at submit time – if a command emits an effect with no registered handler, submit_command returns StateError::UnknownEffectHandler immediately, preventing silent drops.

Use cases: audit logging, metric emission, cross-plugin notification, external system synchronization.
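
The submit-time validation can be sketched with a handler registry keyed by `EffectSpec::KEY`. This is a simplified stand-in: payloads are plain strings rather than serde types, and `EffectRegistry` is a hypothetical name for the engine's dispatch table.

```rust
use std::collections::HashMap;

/// Mirror of the EffectSpec pattern, with the payload simplified to &str.
pub trait EffectSpec {
    const KEY: &'static str;
}

pub struct AuditLogged;
impl EffectSpec for AuditLogged {
    const KEY: &'static str = "audit.logged";
}

pub fn print_audit(payload: &str) {
    println!("audit: {payload}");
}

pub struct EffectRegistry {
    pub handlers: HashMap<&'static str, fn(&str)>,
}

impl EffectRegistry {
    /// Submit-time validation mirroring StateError::UnknownEffectHandler:
    /// an effect with no registered handler is rejected immediately.
    /// A found handler runs fire-and-forget after the commit.
    pub fn submit(&self, effect_key: &str, payload: &str) -> Result<(), String> {
        match self.handlers.get(effect_key) {
            Some(handler) => {
                handler(payload);
                Ok(())
            }
            None => Err(format!("unknown effect handler: {effect_key}")),
        }
    }
}

fn main() {
    let mut handlers: HashMap<&'static str, fn(&str)> = HashMap::new();
    handlers.insert(AuditLogged::KEY, print_audit);
    let registry = EffectRegistry { handlers };
    registry.submit(AuditLogged::KEY, "tool=read_file").unwrap();
}
```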

InferenceOverride Merging

Multiple plugins can influence inference parameters by scheduling SetInferenceOverride actions. The InferenceOverride struct uses last-wins-per-field merge semantics:

pub struct InferenceOverride {
    pub model: Option<String>,
    pub fallback_models: Option<Vec<String>>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
    pub top_p: Option<f64>,
    pub reasoning_effort: Option<ReasoningEffort>,
}

When two overrides are merged, each field independently takes the last non-None value:

pub fn merge(&mut self, other: InferenceOverride) {
    if other.temperature.is_some() {
        self.temperature = other.temperature;
    }
    if other.max_tokens.is_some() {
        self.max_tokens = other.max_tokens;
    }
    // ... same for all fields
}

This allows plugins to override specific parameters without affecting others. A cost-control plugin can set max_tokens while a quality plugin sets temperature, and neither interferes with the other. If both set the same field, the last merge wins.
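
The cost-control/quality example works out like this (using a field subset of the struct above):

```rust
#[derive(Debug, Default, Clone, PartialEq)]
pub struct InferenceOverride {
    pub model: Option<String>,
    pub temperature: Option<f64>,
    pub max_tokens: Option<u32>,
}

impl InferenceOverride {
    /// Last-wins-per-field: each field independently takes the later non-None value.
    pub fn merge(&mut self, other: InferenceOverride) {
        if other.model.is_some() { self.model = other.model; }
        if other.temperature.is_some() { self.temperature = other.temperature; }
        if other.max_tokens.is_some() { self.max_tokens = other.max_tokens; }
    }
}

fn main() {
    // A cost-control plugin caps tokens; a quality plugin tunes temperature.
    let mut merged = InferenceOverride { max_tokens: Some(512), ..Default::default() };
    merged.merge(InferenceOverride { temperature: Some(0.2), ..Default::default() });
    println!("{merged:?}"); // both fields survive; neither plugin clobbers the other
}
```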

Tool Intercept Priority

During BeforeToolExecute, plugins can schedule ToolInterceptAction to control tool execution flow. The action payload is one of three variants:

pub enum ToolInterceptPayload {
    Block { reason: String },
    Suspend(SuspendTicket),
    SetResult(ToolResult),
}

When multiple intercepts are scheduled for the same tool call, they are resolved by implicit priority:

| Priority    | Variant   | Behavior                                    |
|-------------|-----------|---------------------------------------------|
| 3 (highest) | Block     | Terminate tool execution, fail the call     |
| 2           | Suspend   | Pause execution, wait for external decision |
| 1 (lowest)  | SetResult | Short-circuit with a predefined result      |

The highest-priority intercept wins. If two intercepts have the same priority (e.g., two plugins both schedule Block), the first one processed takes effect and the conflict is logged as an error.

If no intercept is scheduled, the tool executes normally (implicit proceed).

flowchart TD
    BTE[BeforeToolExecute hooks run] --> INT{Any intercepts scheduled?}
    INT -- No --> EXEC[Execute tool normally]
    INT -- Yes --> PRI[Resolve by priority]
    PRI --> B[Block: fail the call]
    PRI --> S[Suspend: wait for decision]
    PRI --> R[SetResult: return predefined result]
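
The resolution rule can be sketched as a single pass over the scheduled intercepts. Payloads are simplified to strings, and `resolve` is a hypothetical helper name.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ToolInterceptPayload {
    Block(String),     // payloads simplified to strings for this sketch
    Suspend(String),
    SetResult(String),
}

fn priority(p: &ToolInterceptPayload) -> u8 {
    match p {
        ToolInterceptPayload::Block(_) => 3,
        ToolInterceptPayload::Suspend(_) => 2,
        ToolInterceptPayload::SetResult(_) => 1,
    }
}

/// Highest priority wins; the strict `>` keeps the first-scheduled intercept
/// on a tie (the runtime also logs tied conflicts). None = implicit proceed.
pub fn resolve(intercepts: &[ToolInterceptPayload]) -> Option<&ToolInterceptPayload> {
    let mut winner: Option<&ToolInterceptPayload> = None;
    for intercept in intercepts {
        match winner {
            None => winner = Some(intercept),
            Some(w) if priority(intercept) > priority(w) => winner = Some(intercept),
            _ => {}
        }
    }
    winner
}

fn main() {
    let scheduled = vec![
        ToolInterceptPayload::SetResult("cached".into()),
        ToolInterceptPayload::Block("policy violation".into()),
    ];
    println!("{:?}", resolve(&scheduled)); // Block outranks SetResult
}
```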

See Also

Design Tradeoffs

This page documents key architectural decisions in Awaken and the tradeoffs they entail.

Snapshot Isolation vs Mutable State

Decision: Phase hooks read from an immutable Snapshot and write to a MutationBatch. Mutations are applied atomically after all hooks converge.

Alternative: Hooks mutate shared state directly (protected by locks or sequential execution).

| Aspect        | Snapshot Isolation                                                  | Mutable State                                       |
|---------------|---------------------------------------------------------------------|-----------------------------------------------------|
| Correctness   | Hooks see consistent state regardless of execution order            | Result depends on hook ordering and lock granularity |
| Concurrency   | Hooks can run in parallel without data races                        | Requires careful lock management or forced sequencing |
| Complexity    | Requires MutationBatch machinery, conflict detection, merge strategies | Simpler implementation, direct reads and writes   |
| Debuggability | Each phase boundary is a clean state transition                     | State changes interleaved, harder to trace          |
| Cost          | Extra Arc clone per phase for snapshot creation                     | No snapshot overhead                                |

Why snapshot isolation: Hook execution order should not affect correctness. When multiple plugins touch state in the same phase, mutable state creates implicit ordering dependencies that are difficult to test and reason about. The snapshot approach makes each phase a pure function from state to mutations, which is easier to verify and replay.

Phase-Based Execution vs Event-Driven

Decision: Execution proceeds through a fixed sequence of phases (RunStart through RunEnd). Plugins register hooks at specific phases.

Alternative: Fully event-driven architecture where plugins subscribe to events and react asynchronously.

| Aspect             | Phase-Based                                          | Event-Driven                                            |
|--------------------|------------------------------------------------------|---------------------------------------------------------|
| Predictability     | Deterministic execution order within each phase      | Non-deterministic ordering, race conditions possible    |
| Plugin composition | Plugins interact at well-defined boundaries          | Plugins interact through shared event bus, implicit coupling |
| Testability        | Phase sequences are easy to unit test                | Requires simulating async event flows                   |
| Flexibility        | Adding behavior between phases requires new phases   | New events can be added freely                          |
| Performance        | Sequential phase execution adds overhead             | Concurrent processing possible                          |

Why phase-based: Agent execution has a natural sequential structure (infer, execute tools, check termination). Phases formalize this structure and give plugins guaranteed execution points. Event-driven systems are more flexible but make it harder to reason about the order in which plugins see state and harder to implement features like “modify the inference request before it’s sent.”

Typed State Keys vs Dynamic State

Decision: State keys are Rust types implementing the StateKey trait with associated Value and Update types.

Alternative: Untyped key-value store (HashMap<String, Value>).

| Aspect          | Typed Keys                                         | Dynamic State                                |
|-----------------|----------------------------------------------------|----------------------------------------------|
| Type safety     | Compile-time guarantees on value and update types  | Runtime type errors                          |
| Merge semantics | MergeStrategy declared per key at compile time     | Merge logic must be external or convention-based |
| Discoverability | Keys are types – IDE navigation, documentation     | Keys are strings – grep-based discovery      |
| Boilerplate     | Each key requires a type definition                | Just use a string                            |
| Extensibility   | New keys require code changes and recompilation    | New keys can be added dynamically at runtime |

Why typed keys: State correctness is critical in an agent runtime. A mistyped key or wrong value type causes subtle bugs that surface only during execution. Typed keys catch these at compile time. The apply function makes update semantics explicit – there is no ambiguity about how a counter is incremented or how a list is appended to.
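
A minimal sketch of the typed-key pattern shows what the compiler checks. `StateKey`, `StepCount`, and `StateMap` here are simplified stand-ins for the framework's versions (no scope or merge-strategy fields), but the core idea is the same: the key type alone fixes the value and update types.

```rust
use std::any::Any;
use std::collections::HashMap;

/// The key is a Rust type; its value and update types are associated items,
/// and `apply` spells out the merge semantics explicitly.
pub trait StateKey: 'static {
    type Value: Default + Clone + 'static;
    type Update;
    const NAME: &'static str;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

pub struct StepCount;
impl StateKey for StepCount {
    type Value = u64;
    type Update = u64; // increment amount
    const NAME: &'static str = "step_count";
    fn apply(value: &mut u64, delta: u64) {
        *value += delta; // commutative counter: order-independent under parallel hooks
    }
}

/// Type-safe heterogeneous map: a mismatched read is a compile error,
/// not a runtime surprise.
#[derive(Default)]
pub struct StateMap {
    entries: HashMap<&'static str, Box<dyn Any>>,
}

impl StateMap {
    pub fn update<K: StateKey>(&mut self, update: K::Update) {
        let slot = self
            .entries
            .entry(K::NAME)
            .or_insert_with(|| Box::new(K::Value::default()) as Box<dyn Any>);
        K::apply(slot.downcast_mut::<K::Value>().expect("key/value types match"), update);
    }

    pub fn get<K: StateKey>(&self) -> Option<&K::Value> {
        self.entries.get(K::NAME)?.downcast_ref::<K::Value>()
    }
}

fn main() {
    let mut state = StateMap::default();
    state.update::<StepCount>(1);
    state.update::<StepCount>(2);
    println!("steps = {:?}", state.get::<StepCount>());
}
```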

Plugin System vs Middleware Chain

Decision: Plugins register through PluginRegistrar and declare hooks, state keys, tools, and effect handlers as separate registrations.

Alternative: Middleware chain where each layer wraps the next (like tower middleware or HTTP middleware stacks).

| Aspect                 | Plugin System                                                               | Middleware Chain                                    |
|------------------------|-----------------------------------------------------------------------------|-----------------------------------------------------|
| Granularity            | Hooks at specific phases, tools, state keys, effects – each registered independently | Each middleware wraps the entire execution   |
| Composition            | Multiple plugins contribute hooks to the same phase                         | Middleware ordering determines behavior             |
| Selective activation   | active_hook_filter can enable/disable specific plugins per agent            | Must restructure the chain to skip middleware       |
| Complexity             | More registration ceremony                                                  | Simpler mental model (wrap and delegate)            |
| Cross-cutting concerns | Natural fit – each plugin handles one concern                               | Each middleware handles one concern but sees all traffic |

Why plugin system: Agent execution has many extension points that don’t nest cleanly. A permission check happens at BeforeToolExecute, observability spans wrap tool execution, reminders inject messages at AfterToolExecute. These are independent concerns at different phases. A middleware chain would require each middleware to understand the full lifecycle and decide when to act. The plugin system lets each plugin declare exactly which phases it cares about.

Multi-Protocol Server vs Single Protocol

Decision: The server exposes AI SDK v6, AG-UI, A2A, and MCP over HTTP, while ACP exists as a separate stdio protocol module. Each protocol adapter translates AgentEvent into its wire format.

Alternative: Support a single canonical protocol and require clients to adapt.

| Aspect                 | Multi-Protocol                                                                   | Single Protocol                           |
|------------------------|----------------------------------------------------------------------------------|-------------------------------------------|
| Frontend compatibility | Works with AI SDK, CopilotKit, A2A clients, and MCP HTTP clients out of the box  | Requires custom adapter on the client side |
| Maintenance            | Each protocol adapter must be kept in sync with AgentEvent changes               | One adapter to maintain                   |
| Testing                | Protocol parity tests ensure all adapters handle all events                      | Less test surface                         |
| Complexity             | Multiple route sets, encoder types, and event mappings                           | One route set, one encoder                |
| Runtime coupling       | Runtime is protocol-independent – only emits AgentEvent                          | Runtime could be coupled to the protocol  |

Why multi-protocol: The AI agent ecosystem has not converged on a single protocol. AI SDK, AG-UI, and A2A serve different use cases (chat frontends, copilot UIs, agent-to-agent communication). Supporting multiple protocols at the server layer avoids forcing protocol choices on users. The Transcoder trait and stateful encoders keep the runtime decoupled – adding a new protocol means implementing one encoder, not modifying the execution engine.

See Also

Glossary

| Term | 中文 | Description |
|---|---|---|
| Thread | 会话线程 | Persisted conversation + state history. |
| Run | 运行 | One execution attempt over a thread. |
| Phase | 阶段 | Named step in the execution loop (RunStart, StepStart, BeforeInference, AfterInference, BeforeToolExecute, AfterToolExecute, StepEnd, RunEnd). |
| Snapshot | 快照 | Immutable state snapshot (struct Snapshot { revision: u64, ext: Arc<StateMap> }) seen by phase hooks. |
| StateKey | 状态键 | Typed key with scope, merge strategy, and value type. |
| MutationBatch | 变更批次 | Collected state mutations applied atomically after phase convergence. |
| AgentRuntime | 智能体运行时 | Orchestration layer for agent resolution and run execution. |
| AgentSpec | 智能体规约 | Serializable agent definition with model, prompt, tools, plugins. |
| AgentEvent | 智能体事件 | Canonical runtime stream event. |
| Plugin | 插件 | System-level lifecycle hook registered via PluginRegistrar. |
| PluginRegistrar | 插件注册器 | Registration API for phase hooks, tools, state keys, handlers. |
| PhaseHook | 阶段钩子 | Async hook executed during a specific phase. |
| PhaseContext | 阶段上下文 | Immutable context passed to phase hooks. |
| StateCommand | 状态命令 | Result of a phase hook: mutations + scheduled actions + effects. |
| Tool | 工具 | User-facing capability with descriptor, validation, and execution. |
| ToolDescriptor | 工具描述符 | Tool metadata: id, name, description, parameters schema. |
| ToolResult | 工具结果 | Tool execution result: success, error, or suspended. |
| ToolCallContext | 工具调用上下文 | Execution context with state access, identity, and activity reporting. |
| TerminationReason | 终止原因 | Why a run ended (NaturalEnd, Stopped, Error, etc.). |
| SuspendTicket | 挂起票据 | Suspension payload with pending call and resume mode. |
| MailboxJob | 邮箱任务 | Durable job entry for async/HITL workflows. |
| RunRequest | 运行请求 | Input to start a run: messages, thread ID, agent ID. |
| MergeStrategy | 合并策略 | How parallel state mutations are reconciled: Exclusive (conflict = error) or Commutative (order-independent). |
| KeyScope | 键作用域 | Lifetime of a state key: Run (per-execution) or Thread (persisted across runs). |
| StateMap | 状态映射 | Type-safe heterogeneous map backing Snapshot. |
| RunStatus | 运行状态 | Coarse run status: Running, Waiting, Done. |
| ToolCallStatus | 工具调用状态 | Per-call status: New, Running, Suspended, Resuming, Succeeded, Failed, Cancelled. |
| ResolvedAgent | 已解析智能体 | Agent fully resolved from registries with config, tools, plugins, and executor. |
| AgentResolver | 智能体解析器 | Trait that resolves an agent spec ID into a ResolvedAgent. |
| BuildError | 构建错误 | Error from AgentRuntimeBuilder when validation or wiring fails. |
| RuntimeError | 运行时错误 | Error during agent loop execution. |
| InferenceOverride | 推理覆盖 | Per-request override for model, temperature, max_tokens, reasoning_effort. |
| ContextWindowPolicy | 上下文窗口策略 | Token budget and compaction rules for context management. |
| StreamEvent | 流事件 | Envelope wrapping AgentEvent with sequence number and timestamp. |
| TokenUsage | 令牌用量 | Token consumption report from LLM inference. |
| ExecutionEnv | 执行环境 | Assembled runtime environment holding hooks, handlers, tools, and plugins. |
| CommitHook | 提交钩子 | Hook invoked when state is committed to storage. |

FAQ

Which LLM providers are supported?

Any provider compatible with genai. This includes OpenAI, Anthropic, DeepSeek, Google Gemini, Ollama, and others. Configure the provider via model ID string in AgentSpec or AgentConfig. The GenaiExecutor handles provider routing based on the model prefix.

How do I add a new storage backend?

Implement the ThreadRunStore trait from awaken-contract. The trait requires methods for loading and saving threads, runs, and checkpoints. See InMemoryStore and FileStore in awaken-stores for reference implementations. Pass your store to AgentRuntime::new().with_thread_run_store(store) or AgentRuntimeBuilder::new().with_thread_run_store(store).
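The real ThreadRunStore method set is defined in awaken-contract; the trait below is a deliberately simplified stand-in that shows the shape of a custom backend without reproducing the actual signatures.

```rust
use std::collections::HashMap;

// Simplified stand-in for ThreadRunStore (see awaken-contract for the real
// method set); illustrates the shape of a custom storage backend.
trait SimpleThreadStore {
    fn save_thread(&mut self, id: &str, payload: String);
    fn load_thread(&self, id: &str) -> Option<&String>;
}

/// An in-memory backend, analogous to InMemoryStore in awaken-stores.
#[derive(Default)]
struct MemStore {
    threads: HashMap<String, String>,
}

impl SimpleThreadStore for MemStore {
    fn save_thread(&mut self, id: &str, payload: String) {
        self.threads.insert(id.to_string(), payload);
    }

    fn load_thread(&self, id: &str) -> Option<&String> {
        self.threads.get(id)
    }
}

fn main() {
    let mut store = MemStore::default();
    store.save_thread("t1", "serialized thread state".into());
    assert_eq!(store.load_thread("t1").unwrap(), "serialized thread state");
    assert!(store.load_thread("missing").is_none());
}
```

A PostgreSQL or file backend implements the same trait over its own persistence layer; the runtime only sees the trait.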

Can I use awaken without the server?

Yes. AgentRuntime is a standalone library type. Create a runtime, build a RunRequest, and call runtime.run(request, sink) directly. The server crate (awaken-server) is an optional HTTP/SSE gateway layered on top.

How do I run multiple agents?

Two approaches:

  • Delegates: Define delegate agent IDs in the parent AgentSpec. The runtime handles handoff via ActiveAgentIdKey at step boundaries.
  • A2A protocol: Register remote agents via AgentRuntimeBuilder::with_remote_agents(). Remote agents are discovered and invoked over HTTP using the Agent-to-Agent protocol.

What is the difference between Run scope and Thread scope?

  • Run scope: State exists only for the duration of a single run. Cleared when the run ends. Use for transient data like step counters, token budgets, and per-run configuration.
  • Thread scope: State persists across runs within the same thread. Use for conversation memory, user preferences, and accumulated context.

Scope is declared when defining a StateKey.
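The sketch below mirrors the StateKey trait shown in the State Model section, extended with a hypothetical SCOPE associated constant to illustrate where the declaration lives; the real awaken API may declare scope differently (e.g. via StateKeyOptions at registration).

```rust
// Self-contained mock of the StateKey trait from the State Model section.
// The SCOPE constant here is a hypothetical addition for illustration.

#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyScope {
    Run,    // cleared when the run ends
    Thread, // persists across runs in the same thread
}

#[derive(Debug, Clone, Copy)]
enum MergeStrategy {
    Exclusive,
    Commutative,
}

trait StateKey {
    const KEY: &'static str;
    const SCOPE: KeyScope;
    const MERGE: MergeStrategy;
    type Value;
    type Update;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

// Run scope: transient per-run data, e.g. a step counter.
struct StepCounter;
impl StateKey for StepCounter {
    const KEY: &'static str = "demo.step_counter";
    const SCOPE: KeyScope = KeyScope::Run;
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut u32, update: u32) {
        *value += update; // increments commute
    }
}

// Thread scope: persisted data, e.g. user preferences.
struct UserPreferences;
impl StateKey for UserPreferences {
    const KEY: &'static str = "demo.user_prefs";
    const SCOPE: KeyScope = KeyScope::Thread;
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    type Value = String;
    type Update = String;
    fn apply(value: &mut String, update: String) {
        *value = update; // last write wins; conflicts are errors
    }
}

fn main() {
    let mut count = 0u32;
    StepCounter::apply(&mut count, 1);
    StepCounter::apply(&mut count, 1);
    assert_eq!(count, 2);
    assert_eq!(UserPreferences::SCOPE, KeyScope::Thread);
}
```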

How do I handle tool errors?

Return ToolResult::error(tool_name, message) from your tool’s execute method. The runtime writes the error result back to the conversation as a tool response message and continues the inference loop. The LLM sees the error and can retry or adjust its approach. For fatal errors that should stop the run, return a ToolError instead.
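A toy model of the recoverable-vs-fatal split: the real ToolResult and ToolError types live in awaken-contract and carry more fields, but the control-flow distinction is the same — an Ok(error result) feeds back into the conversation, an Err stops the run.

```rust
// Toy model of recoverable vs fatal tool failures. Real awaken types differ;
// only the Ok-vs-Err distinction is the point.

#[derive(Debug, PartialEq)]
enum ToolResult {
    Success(String),
    Error { tool: String, message: String },
}

#[derive(Debug)]
struct ToolError(#[allow(dead_code)] String); // fatal: stops the run

fn divide(tool: &str, a: f64, b: f64) -> Result<ToolResult, ToolError> {
    if b == 0.0 {
        // Recoverable: the LLM sees this message and can retry with new args.
        return Ok(ToolResult::Error {
            tool: tool.to_string(),
            message: "division by zero; pass a non-zero divisor".to_string(),
        });
    }
    if a.is_nan() {
        // Fatal: the run itself is broken, so abort.
        return Err(ToolError("corrupt arguments".to_string()));
    }
    Ok(ToolResult::Success((a / b).to_string()))
}

fn main() {
    assert_eq!(
        divide("divide", 6.0, 3.0).unwrap(),
        ToolResult::Success("2".to_string())
    );
    assert!(matches!(
        divide("divide", 1.0, 0.0).unwrap(),
        ToolResult::Error { .. }
    ));
    assert!(divide("divide", f64::NAN, 1.0).is_err());
}
```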

Can tools run in parallel?

Yes. Configure ToolExecutionMode in the agent spec. When set to parallel mode, the runtime executes independent tool calls concurrently. Results are collected and merged before proceeding to the next inference step.

How do I debug a run that is stuck?

Check RunStatus in state (__runtime.run_lifecycle key). If Waiting, look at __runtime.tool_call_states for pending decisions. If Running, check if max_rounds or timeout was hit. Enable observability plugin to get per-phase tracing.

How do I test without a real LLM?

Implement LlmExecutor with canned responses. See Testing Strategy for patterns.

What happens when parallel tools write to the same state key?

If the key uses MergeStrategy::Exclusive, the merge fails and hooks fall back to serial execution. Use MergeStrategy::Commutative for keys that support concurrent writes. See State and Snapshot Model.
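Why commutative keys tolerate parallel writes can be shown in a few lines: applying the same set of updates in any order yields the same value, so no ordering decision is needed. Exclusive keys have no such guarantee, which is why concurrent writes to them are rejected.

```rust
// Order-independence is the defining property of a Commutative merge:
// the same updates applied in any order produce the same final value.

fn apply_commutative(value: &mut u64, update: u64) {
    *value += update; // addition commutes
}

fn main() {
    let updates = [3u64, 5, 7];

    // Apply in forward order.
    let mut a = 0;
    for u in updates {
        apply_commutative(&mut a, u);
    }

    // Apply in reverse order, as parallel tools might effectively do.
    let mut b = 0;
    for u in updates.iter().rev() {
        apply_commutative(&mut b, *u);
    }

    assert_eq!(a, b); // same result either way
}
```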

How do request transforms work?

Plugins register InferenceRequestTransform via the registrar. Transforms modify the inference request (system prompt, tools, parameters) before it reaches the LLM. Only active plugins’ transforms apply. See Plugin Internals.

Can I write a custom storage backend?

Yes. Implement ThreadRunStore (for state persistence) and optionally MailboxStore (for HITL). See trait definitions in awaken-stores. File and PostgreSQL implementations serve as references.

How does context compaction work?

When autocompact_threshold: Option<usize> is set in ContextWindowPolicy, the CompactionPlugin monitors token usage. When the context exceeds that threshold, it finds a safe compaction boundary (where all tool call/result pairs are complete), summarizes older messages via LLM, and replaces them with a <conversation-summary> message. See Optimize Context Window.
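The "safe compaction boundary" step can be sketched as a prefix scan: walk the message list and pick the longest prefix in which every tool call has its matching result, so summarization never splits a call/result pair. This is a hypothetical reconstruction of the idea, not the actual CompactionPlugin code.

```rust
// Hypothetical sketch of finding a safe compaction boundary: the longest
// prefix with no dangling tool calls. The real CompactionPlugin may differ.
use std::collections::HashSet;

#[derive(Debug, Clone)]
enum Msg {
    User,
    Assistant,
    ToolCall(u32),   // carries a call id
    ToolResult(u32), // matches a call id
}

fn safe_boundary(msgs: &[Msg]) -> usize {
    let mut pending: HashSet<u32> = HashSet::new();
    let mut boundary = 0;
    for (i, m) in msgs.iter().enumerate() {
        match m {
            Msg::ToolCall(id) => {
                pending.insert(*id);
            }
            Msg::ToolResult(id) => {
                pending.remove(id);
            }
            _ => {}
        }
        if pending.is_empty() {
            boundary = i + 1; // prefix [0, boundary) is safe to summarize
        }
    }
    boundary
}

fn main() {
    let msgs = [
        Msg::User,
        Msg::ToolCall(1),
        Msg::ToolResult(1),
        Msg::Assistant,
        Msg::ToolCall(2), // result for call 2 not yet seen
    ];
    // Boundary stops before the open call, keeping the pair intact.
    assert_eq!(safe_boundary(&msgs), 4);
}
```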

How do I choose between AI SDK v6, AG-UI, and A2A protocols?

  • AI SDK v6: Best for React frontends using Vercel AI SDK. Supports text streaming, tool calls, and state snapshots.
  • AG-UI: Best for CopilotKit frontends. Supports generative UI components and agent collaboration.
  • A2A: Best for agent-to-agent communication. Used for delegate agents and inter-service orchestration.

All three run over HTTP/SSE. Choose based on your frontend framework.

Migrating from Tirea

Awaken is a ground-up rewrite of tirea, not an incremental upgrade. This guide maps tirea 0.5 concepts to their awaken equivalents for developers porting existing agents.

The tirea 0.5 source is archived on the tirea-0.5 branch.

Crate Mapping

| tirea 0.5 | awaken | Notes |
|---|---|---|
| tirea-state | awaken-contract | State engine merged into contract crate |
| tirea-state-derive | — | Removed; use StateKey trait directly |
| tirea-contract | awaken-contract | Merged types + traits into one crate |
| tirea-agentos | awaken-runtime | Renamed; same role (execution engine) |
| tirea-store-adapters | awaken-stores | Renamed |
| tirea-agentos-server | awaken-server | Renamed; protocols now inline |
| tirea-protocol-ai-sdk-v6 | awaken-server::protocols::ai_sdk_v6 | Merged into server crate |
| tirea-protocol-ag-ui | awaken-server::protocols::ag_ui | Merged into server crate |
| tirea-protocol-acp | awaken-server::protocols::acp | Merged into server crate |
| tirea-extension-permission | awaken-ext-permission | Renamed |
| tirea-extension-observability | awaken-ext-observability | Renamed |
| tirea-extension-skills | awaken-ext-skills | Renamed |
| tirea-extension-mcp | awaken-ext-mcp | Renamed |
| tirea-extension-handoff | awaken-runtime::extensions::handoff | Merged into runtime |
| tirea-extension-a2ui | awaken-ext-generative-ui | Renamed |
| tirea (facade) | awaken | Renamed |

Key structural change: tirea had 16 crates; awaken has 13. Protocol crates were merged into awaken-server, the state derive macro was removed, and tirea-state + tirea-contract were unified into awaken-contract.

State Model

tirea used a derive-macro-based state system. Awaken uses a trait-based approach.

```rust
// tirea 0.5
#[derive(StateSlot)]
#[state(key = "my_plugin.counter", scope = "run")]
struct Counter(u32);

// awaken
use awaken::prelude::*;

pub struct Counter;

impl StateKey for Counter {
    const KEY: &'static str = "my_plugin.counter";
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    type Value = u32;
    type Update = u32;

    fn apply(value: &mut u32, update: u32) {
        *value = update;
    }
}
```
| tirea concept | awaken equivalent |
|---|---|
| StateSlot derive | StateKey trait impl |
| ScopeDomain::Run | KeyScope::Run (default) |
| ScopeDomain::Thread | KeyScope::Thread |
| ScopeDomain::Global | Removed (use KeyScope::Thread + ProfileStore) |
| state.get::<T>() | snapshot.get::<T>() (via ctx.state::<T>() in hooks) |
| state.set(value) | cmd.update::<T>(value) (via StateCommand) |
| Mutable state access | Immutable snapshots + StateCommand returns |

Key change: State is never mutated directly. Hooks read from immutable snapshots and return StateCommand with updates. The runtime applies all updates atomically after the gather phase.

Action System

tirea used typed action enums per phase. Awaken uses ScheduledActionSpec with handlers.

```rust
// tirea 0.5
BeforeInferenceAction::AddContextMessage(
    ContextMessage::system("key", "content")
)

// awaken
use awaken::prelude::*;

let mut cmd = StateCommand::new();
cmd.schedule_action::<AddContextMessage>(
    ContextMessage::system("key", "content"),
)?;
```
| tirea action | awaken action | Notes |
|---|---|---|
| BeforeInferenceAction::AddContextMessage | AddContextMessage | Same semantics |
| BeforeInferenceAction::SetInferenceOverride | SetInferenceOverride | Same semantics |
| BeforeInferenceAction::ExcludeTool | ExcludeTool | Same semantics |
| BeforeToolExecuteAction::BlockTool | InterceptAction with Block payload | Unified intercept |
| BeforeToolExecuteAction::SuspendTool | InterceptAction with Suspend payload | Unified intercept |
| AfterToolExecuteAction::AddMessage | AddContextMessage | Generalized |

See Scheduled Actions for the full list.

Plugin Trait

```rust
// tirea 0.5
impl Extension for MyPlugin {
    fn name(&self) -> &str { "my_plugin" }
    fn register(&self, ctx: &mut ExtensionContext) -> Result<()> {
        ctx.add_hook(Phase::BeforeInference, MyHook);
        Ok(())
    }
}

// awaken
impl Plugin for MyPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "my_plugin" }
    }
    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_phase_hook("my_plugin", Phase::BeforeInference, MyHook)?;
        r.register_key::<MyState>(StateKeyOptions::default())?;
        Ok(())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
| Extension trait | Plugin trait | Renamed |
| ExtensionContext | PluginRegistrar | More structured; explicit key/tool/hook registration |
| ctx.add_hook() | r.register_phase_hook() | Requires plugin_id parameter |
| — | r.register_key::<T>() | New: state keys must be declared at registration |
| — | r.register_tool() | New: plugin-scoped tools |
| — | r.register_scheduled_action::<A, _>(handler) | New: custom action handlers |

Tool Trait

```rust
// tirea 0.5
#[async_trait]
impl TypedTool for MyTool {
    type Args = MyArgs;
    async fn call(&self, args: Self::Args, ctx: ToolContext) -> ToolOutput {
        ToolOutput::success(json!({"result": 42}))
    }
}

// awaken
#[async_trait]
impl Tool for MyTool {
    fn descriptor(&self) -> ToolDescriptor { ... }
    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("my_tool", json!({"result": 42})).into())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
| TypedTool with type Args | Tool with Value args | No generic Args; validate in execute() |
| ToolOutput (direct return) | ToolOutput (result + optional StateCommand) | New: tools can schedule actions |
| ToolContext | ToolCallContext | Renamed; adds state(), profile_access() |

Runtime Builder

```rust
// tirea 0.5
let runtime = AgentOsBuilder::new()
    .agent(agent_spec)
    .tool("search", SearchTool)
    .extension(PermissionExtension::new(rules))
    .build()?;

// awaken
let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("search", Arc::new(SearchTool))
    .with_plugin("permissions", Arc::new(PermissionPlugin::new(rules)))
    .build()?;
```
| tirea | awaken | Notes |
|---|---|---|
| AgentOsBuilder | AgentRuntimeBuilder | Renamed |
| .agent(spec) | .with_agent_spec(spec) | Consistent with_ prefix |
| .tool(id, impl) | .with_tool(id, Arc<dyn Tool>) | Explicit Arc wrapping |
| .extension(ext) | .with_plugin(id, Arc<dyn Plugin>) | Renamed; requires ID |
| .build() | .build() | Same |

Concepts Removed

The following tirea concepts have no awaken equivalent:

| tirea concept | Reason removed |
|---|---|
| StateSlot derive macro | Trait impl is simpler and doesn’t require proc-macro |
| Global scope | Thread scope + ProfileStore covers this |
| RuntimeEffect | Replaced by StateCommand effects |
| EffectLog / ScheduledActionLog | Replaced by tracing |
| ConfigStore / ConfigSlot | Replaced by AgentSpec sections |
| AgentProfile | Merged into AgentSpec |
| ExtensionContext live activation | Replaced by static plugin_ids on AgentSpec |

Concepts Added

| awaken concept | Description |
|---|---|
| PluginRegistrar | Structured registration (keys, tools, hooks, actions) |
| ToolOutput with StateCommand | Tools can schedule actions as side-effects |
| ToolInterceptAction | Unified Block/Suspend/SetResult pipeline |
| CircuitBreaker | Per-model LLM failure protection |
| Mailbox | Durable job queue with lease-based claim |
| EventReplayBuffer | SSE reconnection with frame replay |
| DeferredToolsPlugin | Lazy tool loading with probability model |
| ProfileStore | Cross-session persistent state |

Quick Checklist

  • Replace tirea dependency with awaken in Cargo.toml
  • Replace use tirea::* with use awaken::prelude::*
  • Convert #[derive(StateSlot)] to impl StateKey for ...
  • Convert Extension impls to Plugin impls with PluginRegistrar
  • Convert TypedTool impls to Tool impls
  • Replace action enum variants with cmd.schedule_action::<ActionType>(...)
  • Replace AgentOsBuilder with AgentRuntimeBuilder
  • Update store imports: tirea_store_adapters → awaken_stores
  • Update server imports: protocol crates → awaken_server::protocols::*