Introduction
Awaken is a modular AI agent runtime framework built in Rust. It provides phase-based execution with snapshot isolation and deterministic replay; a typed state engine with key scoping (thread / run) and merge strategies (exclusive / commutative); a plugin lifecycle system for extensibility; and a multi-protocol server surface supporting AI SDK v6, AG-UI, A2A, and MCP over HTTP and stdio, plus ACP over stdio.
Crate Overview
| Crate | Description |
|---|---|
| `awaken-contract` | Core contracts: types, traits, state model, agent specs |
| `awaken-runtime` | Execution engine: phase loop, plugin system, agent loop, builder |
| `awaken-server` | HTTP/SSE gateway with protocol adapters |
| `awaken-stores` | Storage backends: memory, file, postgres |
| `awaken-tool-pattern` | Glob/regex tool matching for permission and reminder rules |
| `awaken-ext-permission` | Permission plugin with allow/deny/ask policies |
| `awaken-ext-observability` | OpenTelemetry-based LLM and tool call tracing |
| `awaken-ext-mcp` | Model Context Protocol client integration |
| `awaken-ext-skills` | Skill package discovery and activation |
| `awaken-ext-reminder` | Declarative reminder rules triggered after tool execution |
| `awaken-ext-generative-ui` | Declarative UI components (A2UI protocol) |
| `awaken-ext-deferred-tools` | Deferred tool loading with probabilistic promotion |
| `awaken` | Facade crate that re-exports core modules |
Architecture
Application code
registers tools / models / providers / plugins / agent specs
|
v
AgentRuntime
resolves AgentSpec -> ResolvedAgent
builds ExecutionEnv from plugins
runs the phase loop and exposes cancel / decision control
|
v
Server + storage surfaces
HTTP routes, SSE replay, mailbox, protocol adapters, thread/run persistence
Core Principle
All state access follows snapshot isolation. Phase hooks see an immutable snapshot; mutations are collected in a MutationBatch and applied atomically after convergence.
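As an illustration, the principle can be reduced to a few lines of plain Rust. The type names below are stand-ins, not awaken's actual API: hooks read an immutable snapshot, emit mutations into a batch, and the batch is applied in one atomic step after convergence.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the runtime's snapshot and mutation types;
// the real ones live in awaken-contract and are richer than this.
type Snapshot = HashMap<String, i64>;
type Mutation = (String, i64); // (key, increment)

// A phase hook reads an immutable snapshot and returns mutations;
// it never writes shared state directly.
fn hook_increment(snapshot: &Snapshot, key: &str) -> Vec<Mutation> {
    // Reads observe the snapshot taken at phase start, not in-flight writes.
    let _seen = snapshot.get(key).copied().unwrap_or(0);
    vec![(key.to_string(), 1)]
}

// After the hooks converge, the collected batch is applied atomically.
fn apply_batch(state: &mut Snapshot, batch: Vec<Mutation>) {
    for (key, delta) in batch {
        *state.entry(key).or_insert(0) += delta;
    }
}

fn main() {
    let mut state = Snapshot::new();
    let snapshot = state.clone(); // immutable view handed to hooks
    let mut batch = Vec::new();
    batch.extend(hook_increment(&snapshot, "greet_count"));
    batch.extend(hook_increment(&snapshot, "greet_count"));
    apply_batch(&mut state, batch); // one atomic apply after convergence
    assert_eq!(state["greet_count"], 2);
}
```

Because hooks only see the snapshot, two hooks running in the same phase can never observe each other's half-applied writes.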
What’s in This Book
- Get Started — build a working mental model with the smallest runnable flows
- Build Agents — add tools, plugins, MCP, skills, reminders, handoff, and UI capabilities
- Serve & Integrate — expose HTTP endpoints and wire AI SDK or CopilotKit frontends
- State & Storage — choose persistence, context shaping, and state lookup patterns
- Operate — harden runtime behavior with observability, permissions, progress reporting, and tests
- Reference — API, protocol, config, and schema lookup pages
- Architecture — runtime layering, phase execution, and design tradeoffs
Recommended Reading Path
If you are new to the repository, use this order:
- Start with Get Started and complete First Agent.
- Move to Build Agents when you are ready to add tools and plugins.
- Use Serve & Integrate when the runtime needs to talk to HTTP clients or frontends.
- Use State & Storage and Operate as you move from demos to production behavior.
- Keep Reference Overview and Architecture open when you need exact contracts or runtime internals.
Repository Map
These paths matter most when you move from docs into code:
| Path | Purpose |
|---|---|
| `crates/awaken-contract/` | Core contracts: tools, events, state interfaces |
| `crates/awaken-runtime/` | Agent runtime: execution engine, plugins, builder |
| `crates/awaken-server/` | HTTP/SSE server surfaces |
| `crates/awaken-stores/` | Storage backends |
| `crates/awaken/examples/` | Small runtime examples |
| `examples/src/` | Full-stack server examples |
| `docs/book/src/` | This documentation source |
Get Started
Use this path if you are new to Awaken and want a working mental model before wiring production features.
Read in order
- First Agent for the smallest runnable runtime.
- First Tool to understand tool schemas, execution, and state writes.
- Build an Agent when you want a reusable project baseline.
- Tool Trait before writing production tools.
Leave this path when
- You need more agent capabilities: go to Build Agents.
- You need HTTP or frontend integration: go to Serve & Integrate.
- You need persistence or operational controls: go to State & Storage or Operate.
First Agent
Goal
Run one agent end-to-end and confirm you receive a complete event stream.
Prerequisites
[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
Set one model provider key before running:
# OpenAI-compatible models (for gpt-4o-mini)
export OPENAI_API_KEY=<your-key>
# Or DeepSeek models
export DEEPSEEK_API_KEY=<your-key>
1. Create src/main.rs
use std::sync::Arc;
use serde_json::{json, Value};
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::contract::message::Message;
use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::AgentSpec;
use awaken::registry::ModelEntry;
use awaken::{AgentRuntimeBuilder, RunRequest};
struct EchoTool;
#[async_trait]
impl Tool for EchoTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("echo", "Echo", "Echo input back to the caller")
.with_parameters(json!({
"type": "object",
"properties": { "text": { "type": "string" } },
"required": ["text"]
}))
}
async fn execute(
&self,
args: Value,
_ctx: &ToolCallContext,
) -> Result<ToolOutput, ToolError> {
let text = args["text"].as_str().unwrap_or_default();
Ok(ToolResult::success("echo", json!({ "echoed": text })).into())
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_spec = AgentSpec::new("assistant")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant. Use the echo tool when asked.")
.with_max_rounds(5);
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(agent_spec)
.with_tool("echo", Arc::new(EchoTool))
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model("gpt-4o-mini", ModelEntry {
provider: "openai".into(),
model_name: "gpt-4o-mini".into(),
})
.build()?;
let request = RunRequest::new(
"thread-1",
vec![Message::user("Say hello using the echo tool")],
)
.with_agent_id("assistant");
let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
let events = sink.take();
println!("events: {}", events.len());
let finished = events
.iter()
.any(|e| matches!(e, AgentEvent::RunFinish { .. }));
println!("run_finish_seen: {}", finished);
Ok(())
}
2. Run
cargo run
3. Verify
Expected output includes:
- `events: <n>` where `n > 0`
- `run_finish_seen: true`
The event stream will contain at least RunStart, one or more TextDelta or ToolCallStart/ToolCallDone events, and a final RunFinish.
What You Created
This example creates an in-process AgentRuntime and runs one request immediately.
The core object is:
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(agent_spec)
.with_tool("echo", Arc::new(EchoTool))
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model("gpt-4o-mini", ModelEntry {
provider: "openai".into(),
model_name: "gpt-4o-mini".into(),
})
.build()?;
After that, the normal entry point is:
let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
let events = sink.take();
Common usage patterns:
- one-shot CLI program: construct `RunRequest`, collect events via `VecEventSink`, print result
- application service: wrap `runtime.run(...)` inside your own app logic
- HTTP server: store `Arc<AgentRuntime>` in app state and expose protocol routes
Which Doc To Read Next
Use the next page based on what you want:
- add typed state and stateful tools: First Tool
- learn how events map to the agent loop: Events
- expose the agent over HTTP: Expose HTTP SSE
Common Errors
- Model/provider mismatch: `gpt-4o-mini` requires a compatible OpenAI-style provider setup.
- Missing key: set `OPENAI_API_KEY` or `DEEPSEEK_API_KEY` before `cargo run`.
- Tool not selected: ensure the prompt explicitly asks to use `echo`.
- No `RunFinish` event: check that `with_max_rounds` is set high enough for the model to complete.
Next
First Tool
Goal
Implement one tool that reads typed state from ToolCallContext during execution.
State is optional. Many tools (API calls, search, shell commands) don't need state – just implement `execute` and return a `ToolResult`.
Prerequisites
- Complete First Agent first.
- Reuse the runtime dependencies from First Agent.
[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
1. Define a StateKey
A StateKey describes one named slot in the state map. It declares the value type, how updates are applied, and the lifetime scope.
use awaken::{StateKey, KeyScope, MergeStrategy};
/// Tracks how many times the greeting tool has been called.
struct GreetCount;
impl StateKey for GreetCount {
const KEY: &'static str = "greet_count";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
const SCOPE: KeyScope = KeyScope::Run;
type Value = u32;
type Update = u32;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value += update;
}
}
fn main() {}
Key choices:
- `KeyScope::Run` – state resets at the start of each run. Use `KeyScope::Thread` to persist across runs.
- `MergeStrategy::Commutative` – safe for concurrent updates. Use `Exclusive` when only one writer is expected.
- `apply` defines how an `Update` modifies the current `Value`. Here it increments.
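The merge-strategy distinction can be sketched in plain Rust, outside the framework (illustrative helper functions only, not awaken's actual state engine):

```rust
// Commutative merge: updates are increments, so apply order doesn't matter.
// This is what makes concurrent updates from multiple hooks safe.
fn apply_commutative(value: &mut u32, update: u32) {
    *value += update;
}

// Exclusive merge: the update replaces the value; the last writer wins,
// so only one writer should produce updates for the key per batch.
fn apply_exclusive(value: &mut u32, update: u32) {
    *value = update;
}

fn main() {
    // Two updates applied in opposite orders converge for commutative...
    let (mut a, mut b) = (0u32, 0u32);
    apply_commutative(&mut a, 2);
    apply_commutative(&mut a, 3);
    apply_commutative(&mut b, 3);
    apply_commutative(&mut b, 2);
    assert_eq!(a, b); // both 5: order-independent

    // ...but diverge for exclusive, which is order-dependent.
    let (mut c, mut d) = (0u32, 0u32);
    apply_exclusive(&mut c, 2);
    apply_exclusive(&mut c, 3);
    apply_exclusive(&mut d, 3);
    apply_exclusive(&mut d, 2);
    assert_ne!(c, d); // c == 3, d == 2
}
```

This is why `GreetCount` above can safely use `Commutative`: increments commute, whereas a replace-style update would need `Exclusive` and a single writer.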
2. Implement the Tool
The tool reads the current count via ctx.state::<GreetCount>() and returns a personalized greeting.
use awaken::{StateKey, KeyScope, MergeStrategy};
struct GreetCount;
impl StateKey for GreetCount {
const KEY: &'static str = "greet_count";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
const SCOPE: KeyScope = KeyScope::Run;
type Value = u32;
type Update = u32;
fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("greet", "Greet", "Greet a user by name")
.with_parameters(json!({
"type": "object",
"properties": {
"name": { "type": "string", "description": "Name to greet" }
},
"required": ["name"]
}))
}
fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
args["name"]
.as_str()
.filter(|s| !s.is_empty())
.ok_or_else(|| ToolError::InvalidArguments("name is required".into()))?;
Ok(())
}
async fn execute(
&self,
args: Value,
ctx: &ToolCallContext,
) -> Result<ToolOutput, ToolError> {
let name = args["name"].as_str().unwrap_or("world");
// Read state -- returns None if the key has not been set yet.
let count = ctx.state::<GreetCount>().copied().unwrap_or(0);
Ok(ToolResult::success("greet", json!({
"greeting": format!("Hello, {}!", name),
"times_greeted": count,
})).into())
}
}
fn main() {}
3. Register the Tool
use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
struct GreetCount;
impl StateKey for GreetCount {
const KEY: &'static str = "greet_count";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
const SCOPE: KeyScope = KeyScope::Run;
type Value = u32;
type Update = u32;
fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("greet", "Greet", "Greet a user by name")
}
async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
Ok(ToolResult::success("greet", json!({})).into())
}
}
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_spec = AgentSpec::new("assistant")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant. Use the greet tool when asked.")
.with_max_rounds(5);
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(agent_spec)
.with_tool("greet", Arc::new(GreetTool))
.build()?;
Ok(())
}
4. Run
use std::sync::Arc;
use async_trait::async_trait;
use serde_json::{json, Value};
use awaken::{StateKey, KeyScope, MergeStrategy};
use awaken::contract::tool::{Tool, ToolDescriptor, ToolResult, ToolOutput, ToolError, ToolCallContext};
use awaken::registry_spec::AgentSpec;
use awaken::AgentRuntimeBuilder;
struct GreetCount;
impl StateKey for GreetCount {
const KEY: &'static str = "greet_count";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
const SCOPE: KeyScope = KeyScope::Run;
type Value = u32;
type Update = u32;
fn apply(value: &mut Self::Value, update: Self::Update) { *value += update; }
}
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("greet", "Greet", "Greet a user by name")
}
async fn execute(&self, _args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
Ok(ToolResult::success("greet", json!({})).into())
}
}
use awaken::contract::message::Message;
use awaken::contract::event_sink::VecEventSink;
use awaken::RunRequest;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(AgentSpec::new("assistant").with_model("gpt-4o-mini"))
.with_tool("greet", Arc::new(GreetTool))
.build()?;
let request = RunRequest::new(
"thread-1",
vec![Message::user("Greet Alice")],
)
.with_agent_id("assistant");
let sink = Arc::new(VecEventSink::new());
runtime.run(request, sink.clone()).await?;
Ok(())
}
5. Verify
Check the collected events for a ToolCallDone event with name == "greet":
use awaken::contract::event::AgentEvent;
let events = sink.take();
let tool_done = events
    .iter()
    .any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
println!("tool_call_done_seen: {}", tool_done);
Expected:
- `tool_call_done_seen: true`
- The `result` inside `ToolCallDone` contains `greeting` and `times_greeted` fields.
What You Created
A tool that:
- Declares a JSON Schema for its arguments via `descriptor()`.
- Validates arguments before execution via `validate_args()`.
- Reads typed state from the snapshot via `ctx.state::<K>()`.
- Returns structured JSON via `ToolResult::success()`.
The StateKey trait gives you type-safe, scoped state without raw JSON manipulation.
Which Doc To Read Next
- understand the full tool lifecycle: Tool Trait
- add plugins that manage state across runs: Add a Plugin
- learn about state scoping rules: State Keys
Common Errors
- `ctx.state::<K>()` returns `None`: the state key has not been written yet in this run. Use `.unwrap_or_default()` or `.copied().unwrap_or(0)` for numeric defaults.
- `StateError::KeyEncode` / `StateError::KeyDecode`: the `Value` type does not round-trip through JSON. Ensure `Serialize` and `Deserialize` are derived correctly.
- `ToolError::InvalidArguments` not surfaced: `validate_args` is called before `execute` by the runtime. If you skip validation, bad input reaches `execute` and may panic on `.unwrap()`.
- Scope mismatch: `KeyScope::Run` state is cleared between runs. If you expect persistence, use `KeyScope::Thread`.
Build Agents
This path is for composing agent behavior after you understand the basics.
Recommended order
- Build an Agent to define the runtime, model registry, and agent spec.
- Add a Tool and Add a Plugin to extend behavior safely.
- Add discovery and delegation layers with Use Skills Subsystem, Use MCP Tools, and Use Agent Handoff.
- Add specialized behavior with Use Reminder Plugin, Use Generative UI, and Use Deferred Tools.
Keep nearby
- Tool Trait for exact tool contracts.
- Tool and Plugin Boundary for extension design decisions.
- Architecture when you need the full runtime model.
Build an Agent
Use this when you need to assemble an agent with tools, persistence, and a provider into a running AgentRuntime.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- An `LlmExecutor` implementation (provider) available
- Familiarity with `AgentSpec` and `AgentRuntimeBuilder`
Steps
- Define the agent spec.
use awaken::engine::GenaiExecutor;
use awaken::{AgentSpec, AgentRuntimeBuilder, ModelSpec};
let spec = AgentSpec::new("assistant")
.with_model("claude-sonnet")
.with_system_prompt("You are a helpful assistant.")
.with_max_rounds(10);
- Register tools.
use std::sync::Arc;
let builder = AgentRuntimeBuilder::new()
.with_agent_spec(spec)
.with_tool("search", Arc::new(SearchTool))
.with_tool("calculator", Arc::new(CalculatorTool));
- Register a provider and a model.
let builder = builder
.with_provider("anthropic", Arc::new(GenaiExecutor::new()))
.with_model("claude-sonnet", ModelSpec {
id: "claude-sonnet".into(),
provider: "anthropic".into(),
model: "claude-sonnet-4-20250514".into(),
});
- Attach persistence.
use awaken::stores::InMemoryStore;
let store = Arc::new(InMemoryStore::new());
let builder = builder.with_thread_run_store(store);
- Build and validate.
let runtime = builder.build()?;
build resolves every registered agent and catches missing models, providers, or plugins at startup rather than at request time.
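Conceptually, `build` runs a resolution pass like the following plain-Rust sketch (hypothetical struct and field names, not the real builder internals): every model the agent spec references must resolve to a registered model, and that model's provider must be registered too.

```rust
use std::collections::HashMap;

// Hypothetical, simplified registries (illustrative names only).
struct Builder {
    models: HashMap<String, String>, // model id -> provider id
    providers: Vec<String>,          // registered provider ids
    agent_model: String,             // model referenced by the agent spec
}

// build() fails fast: a dangling model or provider reference is a
// startup error, not a per-request surprise.
fn build(b: &Builder) -> Result<(), String> {
    let provider = b
        .models
        .get(&b.agent_model)
        .ok_or_else(|| format!("unknown model: {}", b.agent_model))?;
    if !b.providers.contains(provider) {
        return Err(format!("unknown provider: {provider}"));
    }
    Ok(())
}

fn main() {
    let ok = Builder {
        models: HashMap::from([("claude-sonnet".into(), "anthropic".into())]),
        providers: vec!["anthropic".into()],
        agent_model: "claude-sonnet".into(),
    };
    assert!(build(&ok).is_ok());

    // Removing the provider surfaces the error at build time,
    // before any request is served.
    let missing = Builder { providers: vec![], ..ok };
    assert!(build(&missing).is_err());
}
```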
- Execute a run.
use std::sync::Arc;
use awaken::RunRequest;
use awaken::contract::event_sink::VecEventSink;
let request = RunRequest::new("thread-1", vec![user_message])
.with_agent_id("assistant");
let sink = Arc::new(VecEventSink::new());
let handle = runtime.run(request, sink.clone()).await?;
Verify
Call the /health endpoint (if using the server feature) or inspect the returned AgentRunResult to confirm the agent loop completed without errors.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| `BuildError::ValidationFailed` | Agent spec references a model or provider not registered in the builder | Register the missing model/provider before calling `build` |
| `BuildError::State` | Duplicate state key registration across plugins | Ensure each `StateKey` is registered by exactly one plugin |
| `RuntimeError` at run time | Provider returns an inference error | Check provider credentials and model ID |
Related Example
examples/src/research/ – a research agent with search and report-writing tools.
Key Files
- `crates/awaken-runtime/src/builder.rs` – `AgentRuntimeBuilder`
- `crates/awaken-contract/src/registry_spec.rs` – `AgentSpec`
- `crates/awaken-runtime/src/runtime/agent_runtime/mod.rs` – `AgentRuntime`
Related
Add a Tool
Use this when you need to expose a custom capability to the agent by implementing the Tool trait.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- `async-trait` and `serde_json` available
Steps
- Implement the `Tool` trait.
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
async fn fetch_weather(_city: &str) -> Result<String, ToolError> {
Ok("Sunny, 22°C".to_string())
}
pub struct WeatherTool;
#[async_trait]
impl Tool for WeatherTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("get_weather", "Get Weather", "Fetch current weather for a city")
.with_parameters(json!({
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name"
}
},
"required": ["city"]
}))
}
async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
let city = args["city"]
.as_str()
.ok_or_else(|| ToolError::InvalidArguments("Missing 'city'".into()))?;
let weather = fetch_weather(city).await?;
Ok(ToolResult::success("get_weather", json!({ "forecast": weather })).into())
}
}
}
- Optionally override argument validation.
fn validate_args(&self, args: &Value) -> Result<(), ToolError> {
if !args.get("city").and_then(|v| v.as_str()).is_some_and(|s| !s.is_empty()) {
return Err(ToolError::InvalidArguments("'city' must be a non-empty string".into()));
}
Ok(())
}
validate_args runs before execute and lets you reject malformed input early.
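The ordering can be sketched without the framework. The helpers below are hypothetical stand-ins, not awaken types; the point is that the execute step only runs once validation has passed, so it can assume well-formed input:

```rust
// Hypothetical validate step, analogous to validate_args:
// reject missing or empty input before doing any work.
fn validate(city: Option<&str>) -> Result<(), String> {
    match city {
        Some(s) if !s.is_empty() => Ok(()),
        _ => Err("'city' must be a non-empty string".into()),
    }
}

// Hypothetical execute step: may assume validation already ran.
fn execute(city: &str) -> String {
    format!("forecast for {city}")
}

// Sketch of the runtime's call order: validate first, execute on Ok.
fn call_tool(city: Option<&str>) -> Result<String, String> {
    validate(city)?;
    Ok(execute(city.expect("validated above")))
}

fn main() {
    assert_eq!(call_tool(Some("Oslo")), Ok("forecast for Oslo".to_string()));
    assert!(call_tool(None).is_err()); // rejected before execute runs
    assert!(call_tool(Some("")).is_err());
}
```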
- Register the tool with the builder.
use std::sync::Arc;
use awaken::AgentRuntimeBuilder;
let runtime = AgentRuntimeBuilder::new()
.with_tool("get_weather", Arc::new(WeatherTool))
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
The string ID passed to with_tool must match the id in ToolDescriptor::new.
- Register via a plugin (alternative).
Tools can also be registered inside a `Plugin::register` method through the `PluginRegistrar`:
fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
registrar.register_tool("get_weather", Arc::new(WeatherTool))?;
Ok(())
}
Plugin-registered tools are scoped to agents that activate that plugin.
Verify
Send a message that should trigger the tool. Inspect the run result to confirm the tool was called and returned the expected output.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| `ToolError::InvalidArguments` | The LLM passed malformed JSON | Tighten the JSON Schema in `with_parameters` to guide the model |
| Tool never called | Descriptor id does not match the registered ID | Ensure the ID in ToolDescriptor::new and with_tool are identical |
| `ToolError::ExecutionFailed` | Runtime error inside `execute` | Return a descriptive error; the agent will see it and may retry |
Related Example
examples/src/research/tools.rs – SearchTool and WriteReportTool implementations.
Key Files
- `crates/awaken-contract/src/contract/tool.rs` – `Tool` trait, `ToolDescriptor`, `ToolResult`, `ToolError`
- `crates/awaken-runtime/src/builder.rs` – `with_tool` registration
Related
Add a Plugin
Use this when you need to extend the agent lifecycle with state keys, phase hooks, scheduled actions, or effect handlers.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- Familiarity with `Phase` variants and `StateKey`
Steps
- Define a state key.
#![allow(unused)]
fn main() {
use awaken::{StateKey, MergeStrategy, StateError, JsonValue};
use serde::{Serialize, Deserialize};
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AuditLog {
pub entries: Vec<String>,
}
pub struct AuditLogKey;
impl StateKey for AuditLogKey {
type Value = AuditLog;
const KEY: &'static str = "audit_log";
const MERGE: MergeStrategy = MergeStrategy::Exclusive;
type Update = AuditLog;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value = update;
}
fn encode(value: &Self::Value) -> Result<JsonValue, StateError> {
serde_json::to_value(value).map_err(|e| StateError::KeyEncode { key: Self::KEY.into(), message: e.to_string() })
}
fn decode(json: JsonValue) -> Result<Self::Value, StateError> {
serde_json::from_value(json).map_err(|e| StateError::KeyDecode { key: Self::KEY.into(), message: e.to_string() })
}
}
}
- Implement a phase hook.
use async_trait::async_trait;
use awaken::{PhaseHook, PhaseContext, StateCommand, StateError};
pub struct AuditHook;
#[async_trait]
impl PhaseHook for AuditHook {
async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
let mut log = ctx.state::<AuditLogKey>().cloned().unwrap_or(AuditLog {
entries: Vec::new(),
});
log.entries.push(format!("Phase executed at {:?}", ctx.phase));
let mut cmd = StateCommand::new();
cmd.update::<AuditLogKey>(log);
Ok(cmd)
}
}
- Implement the Plugin trait.
use awaken::{Plugin, PluginDescriptor, PluginRegistrar, Phase, StateError, StateKeyOptions, KeyScope};
pub struct AuditPlugin;
impl Plugin for AuditPlugin {
fn descriptor(&self) -> PluginDescriptor {
PluginDescriptor { name: "audit" }
}
fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError> {
registrar.register_key::<AuditLogKey>(StateKeyOptions {
scope: KeyScope::Run,
..Default::default()
})?;
registrar.register_phase_hook(
"audit",
Phase::PostInference,
AuditHook,
)?;
Ok(())
}
}
- Register the plugin and activate it on an agent.
use std::sync::Arc;
use awaken::{AgentSpec, AgentRuntimeBuilder};
let spec = AgentSpec::new("assistant")
.with_model("anthropic/claude-sonnet")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("audit");
let runtime = AgentRuntimeBuilder::new()
.with_plugin("audit", Arc::new(AuditPlugin))
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
with_hook_filter on the agent spec activates the named plugin for that agent.
Verify
Run the agent and inspect the state snapshot. The audit_log key should contain entries added by the hook after each inference phase.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| `StateError::KeyAlreadyRegistered` | Two plugins register the same `StateKey` | Use a unique `KEY` constant per state key |
| `StateError::UnknownKey` | Accessing a key that was never registered | Ensure the plugin calling `register_key` is activated on the agent |
| Hook not firing | Plugin ID not listed in with_hook_filter | Add the plugin ID to the agent spec’s hook filters |
Related Example
crates/awaken-ext-observability/ – the built-in observability plugin registers phase hooks and state keys.
Key Files
- `crates/awaken-runtime/src/plugins/lifecycle.rs` – `Plugin` trait
- `crates/awaken-runtime/src/plugins/registry.rs` – `PluginRegistrar`
- `crates/awaken-runtime/src/hooks/phase_hook.rs` – `PhaseHook` trait
Related
Use Skills Subsystem
Use this when you want agents to discover and activate skill packages at runtime, loading instructions, resources, and scripts on demand.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `skills` enabled on the `awaken` crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["skills"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Create skill directories.
Each skill lives in a directory with a SKILL.md file containing YAML frontmatter:
skills/
my-skill/
SKILL.md
another-skill/
SKILL.md
Example skills/my-skill/SKILL.md:
---
name: my-skill
description: Helps with data analysis tasks
allowed-tools: read_file web_search
---
# My Skill
When this skill is active, focus on data analysis.
Use read_file to load datasets and web_search for reference material.
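The layout above is YAML frontmatter fenced by `---` lines, followed by the markdown instruction body. A minimal parse of that shape looks like this (illustrative sketch only; awaken's real skill parser handles more cases and validates the fields):

```rust
// Split a SKILL.md document into (frontmatter, body) at the `---` fences.
// Returns None if the document does not start with a frontmatter block.
fn split_frontmatter(doc: &str) -> Option<(&str, &str)> {
    let rest = doc.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    Some((&rest[..end], &rest[end + 5..]))
}

fn main() {
    let doc = "---\nname: my-skill\ndescription: Helps with data analysis tasks\n---\n# My Skill\n";
    let (front, body) = split_frontmatter(doc).expect("frontmatter required");
    assert!(front.contains("name: my-skill")); // metadata for the catalog
    assert!(body.starts_with("# My Skill"));   // instructions injected on activation
}
```

A missing or malformed fence is exactly the shape of input that produces the `SkillMaterializeError` described under Common Errors below.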
- Discover filesystem skills.
use std::sync::Arc;
use awaken::ext_skills::{FsSkill, InMemorySkillRegistry, SkillSubsystem};
let result = FsSkill::discover("./skills").expect("skill discovery failed");
for warning in &result.warnings {
eprintln!("skill warning: {warning:?}");
}
let skills = FsSkill::into_arc_skills(result.skills);
let registry = Arc::new(InMemorySkillRegistry::from_skills(skills));
- Use embedded skills (alternative).
For compile-time embedded skills, use `EmbeddedSkill`:
use std::sync::Arc;
use awaken::ext_skills::{EmbeddedSkill, EmbeddedSkillData, InMemorySkillRegistry};
const SKILL_MD: &str = "\
---
name: builtin-skill
description: A compiled-in skill
allowed-tools: read_file
---
Builtin Skill
Follow these instructions when active.
";
let skill = EmbeddedSkill::new(&EmbeddedSkillData {
skill_md: SKILL_MD,
references: &[],
assets: &[],
}).expect("valid skill");
let registry = Arc::new(InMemorySkillRegistry::from_skills(vec![Arc::new(skill)]));
- Wire into the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_skills::SkillSubsystem;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
let subsystem = SkillSubsystem::new(registry);
let agent_spec = AgentSpec::new("skills-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("Discover and activate skills when specialized help is useful.")
.with_hook_filter("skills-discovery")
.with_hook_filter("skills-active-instructions");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin(
"skills-discovery",
Arc::new(subsystem.discovery_plugin()) as Arc<dyn Plugin>,
)
.with_plugin(
"skills-active-instructions",
Arc::new(subsystem.active_instructions_plugin()) as Arc<dyn Plugin>,
)
.build()
.expect("failed to build runtime");
The SkillDiscoveryPlugin injects a skills catalog into the LLM context before inference and registers three tools:
| Tool | Purpose |
|---|---|
| `skill` | Activate a skill by name |
| `load_skill_resource` | Load a skill resource or reference |
| `skill_script` | Run a skill script |
Verify
- Run the agent and ask it to list available skills.
- The LLM should see the skills catalog in its context and use the `skill` tool to activate one.
- After activation, the `ActiveSkillInstructionsPlugin` injects the skill instructions into subsequent inference calls.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| No skills discovered | Wrong directory path | Ensure each skill has a SKILL.md in a subdirectory |
| `SkillMaterializeError` | Invalid YAML frontmatter | Check that `name` and `description` fields are present in SKILL.md |
| Skill tools not available | Plugin not registered | Register both discovery_plugin() and active_instructions_plugin() |
| Feature not found | Missing cargo feature | Enable features = ["skills"] in Cargo.toml |
Related Example
crates/awaken-ext-skills/tests/
Key Files
| Path | Purpose |
|---|---|
crates/awaken-ext-skills/src/lib.rs | Module root and public re-exports |
crates/awaken-ext-skills/src/registry.rs | FsSkill, InMemorySkillRegistry, SkillRegistry trait |
crates/awaken-ext-skills/src/plugin/subsystem.rs | SkillSubsystem facade |
crates/awaken-ext-skills/src/plugin/discovery.rs | SkillDiscoveryPlugin |
crates/awaken-ext-skills/src/embedded.rs | EmbeddedSkill for compile-time skills |
crates/awaken-ext-skills/src/tools.rs | Skill tool implementations |
Related
Use MCP Tools
Use this when you want to connect to external Model Context Protocol (MCP) servers and expose their tools to awaken agents.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `mcp` enabled on the `awaken` crate
- An MCP server to connect to (stdio or HTTP transport)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["mcp"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Configure MCP server connections.
use awaken::ext_mcp::McpServerConnectionConfig;
// Stdio transport: launch a child process
let stdio_config = McpServerConnectionConfig::stdio(
"my-mcp-server", // server name
"node", // command
vec!["server.js".into()], // args
);
// HTTP/SSE transport: connect to a running server
let http_config = McpServerConnectionConfig::http(
"remote-server",
"http://localhost:8080/sse",
);
- Create the registry manager and discover tools.
use awaken::ext_mcp::McpToolRegistryManager;
let manager = McpToolRegistryManager::connect(vec![stdio_config, http_config])
.await
.expect("failed to connect MCP servers");
// Tools are now available as awaken Tool instances.
// Each MCP tool is registered with the ID format: mcp__{server}__{tool}
let registry = manager.registry();
for id in registry.ids() {
println!("discovered: {id}");
}
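The `mcp__{server}__{tool}` ID scheme can be composed and split mechanically. This is a sketch of the naming convention only (the real wrapping lives in `awaken-ext-mcp`'s manager; `list_files` below is a hypothetical tool name):

```rust
// Compose an awaken tool ID from an MCP server name and tool name.
fn mcp_tool_id(server: &str, tool: &str) -> String {
    format!("mcp__{server}__{tool}")
}

// Recover (server, tool) from an ID in the mcp__{server}__{tool} format.
// Splits at the first `__` after the prefix, so server names must not
// contain `__` for this simple sketch to round-trip.
fn parse_mcp_tool_id(id: &str) -> Option<(&str, &str)> {
    let rest = id.strip_prefix("mcp__")?;
    rest.split_once("__")
}

fn main() {
    let id = mcp_tool_id("my-mcp-server", "list_files");
    assert_eq!(id, "mcp__my-mcp-server__list_files");
    assert_eq!(parse_mcp_tool_id(&id), Some(("my-mcp-server", "list_files")));
    assert_eq!(parse_mcp_tool_id("not-an-mcp-id"), None);
}
```

The prefix keeps MCP tools from colliding with locally registered tool IDs, which is why the Common Errors table below tells you to register each discovered ID verbatim.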
- Register tools with the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_mcp::McpPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
let agent_spec = AgentSpec::new("mcp-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("Use MCP tools when they help answer the user.")
.with_hook_filter("mcp");
let mut builder = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin("mcp", Arc::new(McpPlugin) as Arc<dyn Plugin>);
let registry = manager.registry();
for (id, tool) in registry.snapshot() {
builder = builder.with_tool(&id, tool);
}
let runtime = builder.build().expect("failed to build runtime");
- Enable periodic refresh (optional).
MCP servers may add or remove tools at runtime. Enable periodic refresh to keep the tool registry in sync:
use std::time::Duration;
manager.start_periodic_refresh(Duration::from_secs(60));
Verify
- Run the agent and ask it to use a tool provided by the MCP server.
- Check the backend logs for MCP tool call events.
- Confirm the tool result includes `mcp.server` and `mcp.tool` metadata in the response.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| `McpError::TransportError` | MCP server not running or unreachable | Verify the server process is running and the path/URL is correct |
| No tools discovered | Server returned empty tool list | Check the MCP server implements tools/list |
| Tool call timeout | Server too slow to respond | Increase timeout in the transport configuration |
| Feature not found | Missing cargo feature | Enable features = ["mcp"] in Cargo.toml |
| `mcp__server__tool` not found | Tools not registered with builder | Loop over `manager.registry().snapshot()` and call `with_tool` for each |
Related Example
`crates/awaken-ext-mcp/tests/`
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-mcp/src/lib.rs` | Module root and public re-exports |
| `crates/awaken-ext-mcp/src/manager.rs` | `McpToolRegistryManager` lifecycle and tool wrapping |
| `crates/awaken-ext-mcp/src/config.rs` | `McpServerConnectionConfig` transport types |
| `crates/awaken-ext-mcp/src/plugin.rs` | `McpPlugin` integration with the awaken plugin system |
| `crates/awaken-ext-mcp/src/transport.rs` | `McpToolTransport` trait and transport helpers |
| `crates/awaken-ext-mcp/tests/mcp_tests.rs` | Integration tests |
Related
Use the Reminder Plugin
Use this when you want the agent to receive automatic context messages after tool execution based on pattern matching – for example, reminding the agent to run cargo check after editing a .toml file, or warning about destructive commands.
Prerequisites
- A working awaken agent runtime (see Build an Agent)
- Feature `reminder` enabled on the `awaken` crate
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["reminder"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Register the reminder plugin with rules.
use std::sync::Arc;
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken::ext_reminder::{ReminderPlugin, ReminderRulesConfig};
let json = r#"{
"rules": [
{
"tool": "Bash(command ~ 'rm *')",
"output": { "status": "success" },
"message": {
"target": "suffix_system",
"content": "A deletion command just succeeded. Verify the result."
}
}
]
}"#;
let config = ReminderRulesConfig::from_str(json, Some("json"))
.expect("failed to parse reminder config");
let rules = config.into_rules().expect("invalid rules");
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(agent_spec)
.with_plugin("reminder", Arc::new(ReminderPlugin::new(rules)) as Arc<dyn Plugin>)
.build()
.expect("failed to build runtime");
The plugin registers an AfterToolExecute phase hook. After every tool call, it evaluates each rule against the tool name, arguments, and result. When a rule matches, it schedules an AddContextMessage action that injects the configured message into the prompt.
- Define rules with tool patterns.
  The `tool` field uses the same pattern DSL as the permission system:
| Pattern | Matches |
|---|---|
"Bash" | Exact tool name Bash |
"*" | Any tool |
"mcp__*" | Glob on tool name (all MCP tools) |
"Bash(command ~ 'rm *')" | Tool name + primary argument glob |
"Edit(file_path ~ '*.toml')" | Named field glob |
let json = r#"{
"rules": [
{
"name": "toml-edit-reminder",
"tool": "Edit(file_path ~ '*.toml')",
"output": "any",
"message": {
"target": "system",
"content": "You edited a TOML file. Run cargo check to verify.",
"cooldown_turns": 3
}
}
]
}"#;
The optional name field gives the rule a human-readable identifier. When omitted, one is auto-generated from the index and tool pattern (e.g. rule-0-Edit(file_path ~ '*.toml')).
- Configure output matching.
  The `output` field controls whether the rule fires based on the tool result. Set it to `"any"` to match all outputs, or use a structured object with `status` and/or `content`:
// Match only errors containing "permission denied"
let json = r#"{
"rules": [
{
"tool": "*",
"output": {
"status": "error",
"content": "*permission denied*"
},
"message": {
"target": "suffix_system",
"content": "Permission denied. Consider using sudo or checking file ownership."
}
}
]
}"#;
Status values: "success", "error", "pending", "any".
Content matching supports two forms:
- Text glob (string shorthand): `"*permission denied*"` – matches the stringified tool output against a glob pattern.
- JSON fields (structured): matches specific fields in JSON output.
// JSON field matching
let json = r#"{
"rules": [
{
"tool": "*",
"output": {
"status": "error",
"content": {
"fields": [
{ "path": "error.code", "op": "exact", "value": "403" }
]
}
},
"message": {
"target": "suffix_system",
"content": "HTTP 403 Forbidden. Check authentication credentials."
}
}
]
}"#;
Field match operations: "glob" (default), "exact", "regex", "not_glob", "not_exact", "not_regex". The path uses dot-separated JSON field names (e.g. "error.code"). When multiple fields are specified, all must match (AND logic).
When both status and content are present, both must match for the rule to fire.
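The text-glob form of content matching can be pictured with a tiny `*`-only matcher. This is a sketch of the matching semantics only; the crate's actual implementation lives in `awaken-tool-pattern` and may differ:

```rust
/// Minimal `*`-only glob matcher, illustrating how a content pattern like
/// "*permission denied*" could be checked against stringified tool output.
/// Sketch only -- not the crate's implementation.
fn glob_match(pattern: &str, text: &str) -> bool {
    // Split on '*'; each literal segment must appear in order.
    let segments: Vec<&str> = pattern.split('*').collect();
    let mut pos = 0;
    for (i, seg) in segments.iter().copied().enumerate() {
        if seg.is_empty() {
            continue;
        }
        match text[pos..].find(seg) {
            Some(idx) => {
                // The first segment must anchor at the start unless the
                // pattern begins with '*'.
                if i == 0 && idx != 0 {
                    return false;
                }
                pos += idx + seg.len();
            }
            None => return false,
        }
    }
    // The last segment must anchor at the end unless the pattern ends with '*'.
    if !pattern.ends_with('*') {
        if let Some(&last) = segments.last() {
            if !text.ends_with(last) {
                return false;
            }
        }
    }
    true
}

fn main() {
    assert!(glob_match(
        "*permission denied*",
        "open /etc/shadow: permission denied (os error 13)"
    ));
    assert!(!glob_match("*permission denied*", "file not found"));
    println!("ok");
}
```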
- Choose a message target.
  The `target` field determines where the injected message appears in the prompt:
| Target | Placement |
|---|---|
"system" | Prepended to the system prompt section |
"suffix_system" | Appended after the system prompt |
"session" | Inserted as a session-scoped system message |
"conversation" | Inserted as a conversation-scoped system message |
- Use cooldown to avoid repetition.
  Set `cooldown_turns` on the message to suppress re-injection for a number of turns after firing:
let json = r#"{
"rules": [
{
"tool": "*",
"output": "any",
"message": {
"target": "system",
"content": "Remember to be careful with file operations.",
"cooldown_turns": 5
}
}
]
}"#;
When cooldown_turns is 0 (default), the message is injected on every match.
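The cooldown behavior amounts to a per-rule gate on the turn counter. A rough sketch of the semantics described above (type and field names here are illustrative, not the crate's):

```rust
/// Illustrative per-rule cooldown gate: after firing, a rule stays quiet
/// until more than `cooldown_turns` turns have elapsed.
struct CooldownGate {
    cooldown_turns: u32,
    last_fired_turn: Option<u32>,
}

impl CooldownGate {
    fn try_fire(&mut self, current_turn: u32) -> bool {
        let allowed = match self.last_fired_turn {
            None => true, // never fired before
            Some(last) => current_turn - last > self.cooldown_turns,
        };
        if allowed {
            self.last_fired_turn = Some(current_turn);
        }
        allowed
    }
}

fn main() {
    let mut gate = CooldownGate { cooldown_turns: 5, last_fired_turn: None };
    assert!(gate.try_fire(1));  // first match fires
    assert!(!gate.try_fire(3)); // suppressed: only 2 turns elapsed
    assert!(gate.try_fire(7));  // 6 turns elapsed > cooldown of 5

    // With cooldown_turns = 0 (the default), every match fires.
    let mut always = CooldownGate { cooldown_turns: 0, last_fired_turn: None };
    assert!(always.try_fire(1));
    assert!(always.try_fire(2));
}
```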
- Load rules from a file.
use awaken::ext_reminder::ReminderRulesConfig;
let config = ReminderRulesConfig::from_file("reminders.json")
.expect("failed to load reminder config");
let rules = config.into_rules().expect("invalid rules");
- Activate the plugin on an agent spec.
use awaken::registry_spec::AgentSpec;
let agent_spec = AgentSpec::new("my-agent")
.with_model("anthropic/claude-sonnet")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("reminder");
The with_hook_filter("reminder") call activates the reminder plugin’s phase hook for this agent.
Verify
- Configure a rule matching a tool you can easily trigger (e.g. `"*"` with `"any"` output).
- Run the agent and invoke a tool.
- Inspect the prompt or enable `tracing` at debug level. You should see: `reminder rule matched, scheduling context message`.
- Confirm the context message appears in the agent’s next inference prompt at the expected target location.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| `InvalidPattern` on startup | Malformed tool pattern string | Check syntax against the pattern table above; ensure quotes are escaped in JSON |
| `InvalidTarget` on startup | Unknown message target | Use one of: `system`, `suffix_system`, `session`, `conversation` |
| `InvalidOutput` on startup | Unrecognized output matcher string | Use `"any"` or a structured `{ "status": ..., "content": ... }` object |
| `InvalidOp` on startup | Unknown field match operation | Use one of: `glob`, `exact`, `regex`, `not_glob`, `not_exact`, `not_regex` |
| Rule never fires | Plugin not activated on agent | Add with_hook_filter("reminder") to the agent spec |
| Rule fires too often | No cooldown configured | Set cooldown_turns to a positive value |
Related Example
`crates/awaken-ext-reminder/src/config.rs` – contains test cases with various rule configurations
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-reminder/src/lib.rs` | Module root and public re-exports |
| `crates/awaken-ext-reminder/src/config.rs` | `ReminderRulesConfig`, JSON loading, `ReminderConfigKey` |
| `crates/awaken-ext-reminder/src/rule.rs` | `ReminderRule` struct definition |
| `crates/awaken-ext-reminder/src/output_matcher.rs` | `OutputMatcher`, `ContentMatcher`, status/content matching logic |
| `crates/awaken-ext-reminder/src/plugin/plugin.rs` | `ReminderPlugin` registration (AfterToolExecute hook) |
| `crates/awaken-ext-reminder/src/plugin/hook.rs` | `ReminderHook` – pattern + output evaluation per tool call |
| `crates/awaken-tool-pattern/` | Shared glob/regex pattern matching library |
Related
- Enable Tool Permission HITL – uses the same tool pattern DSL
- Add a Plugin
- Build an Agent
Use Generative UI (A2UI)
Use this when you want the agent to send declarative UI components to a frontend – for example, rendering a form, a data table, or an interactive card without the frontend knowing the layout in advance.
Prerequisites
- A working awaken agent runtime (see Build an Agent)
- A frontend that consumes A2UI messages from the event stream (e.g. a CopilotKit or AI SDK integration)
- A component catalog registered on the frontend that defines available UI components
[dependencies]
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Register the A2UI plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_generative_ui::A2uiPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
let plugin = A2uiPlugin::with_catalog_id("my-catalog");
let agent_spec = AgentSpec::new("ui-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("Render structured UI when visual output helps.")
.with_hook_filter("generative-ui");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin("generative-ui", Arc::new(plugin) as Arc<dyn Plugin>)
.build()
.expect("failed to build runtime");
The plugin registers a tool called render_a2ui that the LLM can invoke. When the LLM calls this tool with an array of A2UI messages, the tool validates the message structure and returns the validated payload, which flows through the event stream to the frontend.
- Understand the A2UI protocol.
  A2UI v0.8 defines four message types. Each message is a JSON object with `"version": "v0.8"` and exactly one of the following keys:
| Message Type | Purpose |
|---|---|
createSurface | Initialize a new rendering surface |
updateComponents | Define or update the component tree |
updateDataModel | Populate or change data values |
deleteSurface | Remove a surface |
Messages must be sent in order: create the surface first, then define components, then populate data.
- Create a surface.
Every UI starts by creating a surface with a unique ID and a reference to the frontend’s component catalog:
// The LLM sends this via the render_a2ui tool:
let messages = serde_json::json!({
"messages": [
{
"version": "v0.8",
"createSurface": {
"surfaceId": "order-form-1",
"catalogId": "my-catalog"
}
}
]
});
The catalogId tells the frontend which set of component definitions to use for rendering.
- Define the component tree.
  Components are specified as a flat list with adjacency references. Each component has a unique `id` and a `component` type name from the catalog. Parent-child relationships are expressed via `child` (single child) or `children` (multiple children):
let messages = serde_json::json!({
"messages": [
{
"version": "v0.8",
"updateComponents": {
"surfaceId": "order-form-1",
"components": [
{
"id": "root",
"component": "Card",
"child": "layout"
},
{
"id": "layout",
"component": "Column",
"children": ["title", "name-field", "submit-btn"]
},
{
"id": "title",
"component": "Text",
"text": "New Order"
},
{
"id": "name-field",
"component": "TextField",
"label": "Customer Name",
"value": { "path": "/customer/name" }
},
{
"id": "submit-btn",
"component": "Button",
"label": "Submit",
"action": {
"event": {
"name": "submit-order",
"context": { "formId": "order-form-1" }
}
}
}
]
}
}
]
});
Rules for the component list:
- One component must have `"id": "root"` – this is the tree’s entry point.
- Every component requires `"id"` and `"component"` fields.
- Use `"child"` for a single child reference or `"children"` for multiple.
- Additional properties (`text`, `label`, `action`, and any extra fields) are passed through to the frontend renderer.
- Bind data with JSON paths.
  Use `{"path": "/json/pointer"}` in component properties to bind values to the surface’s data model. The frontend resolves these paths against the data model at render time:
let messages = serde_json::json!({
"messages": [
{
"version": "v0.8",
"updateDataModel": {
"surfaceId": "order-form-1",
"path": "/",
"value": {
"customer": {
"name": "",
"email": ""
},
"items": []
}
}
}
]
});
The path field specifies which part of the data model to update. Use "/" to set the entire model, or a nested path like "/customer/name" to update a specific value.
- Delete a surface.
let messages = serde_json::json!({
"messages": [
{
"version": "v0.8",
"deleteSurface": {
"surfaceId": "order-form-1"
}
}
]
});
- Send multiple messages in one tool call.
  The `render_a2ui` tool accepts an array of messages, so you can create a surface, define components, and populate data in a single call:
let messages = serde_json::json!({
"messages": [
{
"version": "v0.8",
"createSurface": { "surfaceId": "s1", "catalogId": "my-catalog" }
},
{
"version": "v0.8",
"updateComponents": {
"surfaceId": "s1",
"components": [
{ "id": "root", "component": "Text", "text": "Hello" }
]
}
},
{
"version": "v0.8",
"updateDataModel": { "surfaceId": "s1", "path": "/", "value": {} }
}
]
});
- Customize plugin instructions.
  The plugin injects prompt instructions that teach the LLM how to use the `render_a2ui` tool. You can customize these in several ways:
// With catalog ID and custom examples appended to the default instructions
let plugin = A2uiPlugin::with_catalog_and_examples(
"my-catalog",
"Example: create a card with a title and a button..."
);
// With fully custom instructions (replaces the default instructions entirely)
let plugin = A2uiPlugin::with_custom_instructions(
"You can render UI by calling render_a2ui...".to_string()
);
Verify
- Register the A2UI plugin and run the agent with a prompt that asks it to display information visually.
- The agent should call the `render_a2ui` tool with valid A2UI messages.
- Check the tool result in the event stream – a successful call returns `{"a2ui": [...], "rendered": true}`.
- On the frontend, confirm the surface appears with the expected components.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| `missing required field "messages"` | Tool called without a messages array | Ensure the LLM sends `{"messages": [...]}` |
| `messages array must not be empty` | Empty messages array | Include at least one A2UI message |
| `unsupported version` | Version field is not "v0.8" | Set `"version": "v0.8"` on every message |
| `multiple message types in one object` | A single message contains more than one type key | Each message object must have exactly one of `createSurface`, `updateComponents`, `updateDataModel`, or `deleteSurface` |
| `components[N].id is required` | A component in `updateComponents` is missing `id` | Add `"id"` to every component object |
| `components[N].component is required` | A component is missing the type name | Add `"component"` with a valid catalog type |
| LLM does not call the tool | Plugin registered but instructions not reaching the LLM | Verify the plugin is activated on the agent spec |
Related Example
`crates/awaken-ext-generative-ui/src/a2ui/tests.rs` – validation and tool execution test cases
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-generative-ui/src/a2ui/mod.rs` | A2UI module root, constants, re-exports |
| `crates/awaken-ext-generative-ui/src/a2ui/plugin.rs` | `A2uiPlugin` registration and prompt instructions |
| `crates/awaken-ext-generative-ui/src/a2ui/tool.rs` | `A2uiRenderTool` – validation and execution |
| `crates/awaken-ext-generative-ui/src/a2ui/types.rs` | `A2uiMessage`, `A2uiComponent`, and related structs |
| `crates/awaken-ext-generative-ui/src/a2ui/validation.rs` | `validate_a2ui_messages` structural checks |
Related
Use Agent Handoff
Use this when you need to dynamically switch an agent’s behavior (system prompt, model, or tool set) within a running agent loop without terminating the run or spawning a new thread.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- Familiarity with `Plugin`, `StateKey`, and `AgentRuntimeBuilder`
Overview
Handoff performs dynamic same-thread agent variant switching. Instead of ending the current run and starting a new agent, handoff applies an AgentOverlay that overrides parts of the base agent configuration. The switch is instant – no run termination or re-resolution occurs.
Key types:
- `HandoffPlugin` – the plugin that manages overlay registration and lifecycle hooks.
- `AgentOverlay` – per-variant overrides for system prompt, model, and tool filters.
- `HandoffState` – tracks the active variant and any pending handoff request.
- `HandoffAction` – reducer actions: `Request`, `Activate`, `Clear`.
Steps
- Define agent variant overlays.
Each overlay specifies which parts of the base agent configuration to override. Fields left as None inherit the base agent’s values.
The model field uses the same model registry ID that AgentSpec.model uses.
use awaken::extensions::handoff::AgentOverlay;
let researcher = AgentOverlay {
system_prompt: Some("You are a research specialist. Find and cite sources.".into()),
model: Some("claude-sonnet".into()),
allowed_tools: Some(vec!["web_search".into(), "read_document".into()]),
excluded_tools: None,
};
let writer = AgentOverlay {
system_prompt: Some("You are a technical writer. Produce clear documentation.".into()),
model: None, // inherits base model
allowed_tools: None, // all tools available
excluded_tools: Some(vec!["web_search".into()]),
};
- Build a `HandoffPlugin` with variant overlays.
use std::collections::HashMap;
use std::sync::Arc;
use awaken::extensions::handoff::HandoffPlugin;
let mut overlays = HashMap::new();
overlays.insert("researcher".to_string(), researcher);
overlays.insert("writer".to_string(), writer);
let handoff = HandoffPlugin::new(overlays);
- Register the plugin on the runtime builder.
use awaken::AgentRuntimeBuilder;
let runtime = AgentRuntimeBuilder::new()
.with_plugin("agent_handoff", Arc::new(handoff))
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
The plugin ID must be "agent_handoff" (exported as HANDOFF_PLUGIN_ID). The plugin registers hooks on Phase::RunStart and Phase::StepEnd to synchronize handoff state.
- Request a handoff from within a tool or hook.
Use the action helpers to create HandoffAction mutations and dispatch them through a StateCommand:
use awaken::extensions::handoff::{request_handoff, activate_handoff, clear_handoff, ActiveAgentKey};
use awaken::state::StateCommand;
// Request a switch to the "researcher" variant (pending until next phase boundary)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(request_handoff("researcher"));
// Directly activate a variant (skips the request step)
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(activate_handoff("writer"));
// Clear handoff state and return to the base agent
let mut cmd = StateCommand::new();
cmd.update::<ActiveAgentKey>(clear_handoff());
- Look up overlays from plugin state.
The plugin exposes a method to retrieve the overlay for any registered variant:
let overlay = handoff.overlay("researcher");
// Returns Option<&AgentOverlay>
The effective agent ID is determined by HandoffPlugin::effective_agent, which returns the requested variant if one is pending, otherwise the currently active variant:
let state: &HandoffState = /* from context */;
let agent_id = HandoffPlugin::effective_agent(state);
// Returns Option<&String> -- None means the base agent is active
How It Works
HandoffState has two fields:
- `active_agent: Option<String>` – the currently active variant (`None` = base agent).
- `requested_agent: Option<String>` – a pending handoff request, consumed at the next phase boundary.
The internal HandoffSyncHook runs at RunStart and StepEnd. When it detects a requested_agent, it promotes the request to active_agent and clears the request. This two-phase approach ensures the switch happens at a safe boundary in the agent loop.
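The promote-and-clear step can be sketched in a few lines. This is a simplified model of the documented behavior, not the crate's `HandoffSyncHook` code:

```rust
/// Simplified model of the documented two-field handoff state.
#[derive(Default, Debug, PartialEq)]
struct HandoffState {
    active_agent: Option<String>,
    requested_agent: Option<String>,
}

/// What the sync hook conceptually does at RunStart / StepEnd:
/// promote a pending request to active and clear the request.
fn sync(state: &mut HandoffState) {
    if let Some(requested) = state.requested_agent.take() {
        state.active_agent = Some(requested);
    }
}

fn main() {
    let mut state = HandoffState::default();
    state.requested_agent = Some("researcher".into()); // request_handoff("researcher")
    sync(&mut state); // next safe phase boundary
    assert_eq!(state.active_agent.as_deref(), Some("researcher"));
    assert!(state.requested_agent.is_none()); // request consumed
}
```

Because the request is only consumed at a phase boundary, a tool can request a handoff mid-step without the agent's configuration changing underneath it.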
Handoff vs Delegation
| | Handoff | Delegation |
|---|---|---|
| Thread | Same thread, same run | Spawns a sub-agent on a separate thread |
| State | Shared – overlays modify the current agent in-place | Isolated – delegate has its own state |
| Use case | Switching personas or tool sets mid-conversation | Offloading a self-contained subtask |
| Overhead | Zero – no run restart | Higher – new run lifecycle |
Use handoff when you want the agent to change behavior while retaining conversational context. Use delegation when the subtask is independent and the delegate should not see or modify the parent’s state.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| Overlay not applied | Variant name in request_handoff does not match a key in the overlays map | Ensure the string matches exactly |
| `StateError::KeyAlreadyRegistered` | Another plugin registers the `ActiveAgentKey` | Only one `HandoffPlugin` should be registered per runtime |
| Hook not firing | Plugin not in the agent’s active hook filter | Add "agent_handoff" to active_hook_filter or leave the filter empty |
Key Files
- `crates/awaken-runtime/src/extensions/handoff/mod.rs` – module root and public exports
- `crates/awaken-runtime/src/extensions/handoff/plugin.rs` – `HandoffPlugin` implementation
- `crates/awaken-runtime/src/extensions/handoff/types.rs` – `AgentOverlay` struct
- `crates/awaken-runtime/src/extensions/handoff/state.rs` – `HandoffState` and `ActiveAgentKey`
- `crates/awaken-runtime/src/extensions/handoff/action.rs` – `HandoffAction` and helper functions
Related
Use Deferred Tools
Use this when your agent has many tools and you want to reduce context window usage by hiding tool schemas from the LLM until they are needed. The deferred-tools plugin classifies tools as Eager (always sent) or Deferred (hidden until requested). A ToolSearch tool lets the LLM discover deferred tools on demand.
Prerequisites
- A working awaken agent runtime (see First Agent)
- The `awaken-ext-deferred-tools` crate
[dependencies]
awaken-ext-deferred-tools = { version = "0.1" }
awaken = { package = "awaken-agent", version = "0.1" }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Create the plugin and register it.
Collect all tool descriptors your agent exposes, then pass them to DeferredToolsPlugin::new. The plugin uses these to classify tools and populate the deferred registry at activation time.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use awaken_ext_deferred_tools::DeferredToolsPlugin;
// Collect descriptors from all tools registered on the agent.
let seed_tools = vec![
weather_tool.descriptor(),
search_tool.descriptor(),
debug_tool.descriptor(),
// ... all tool descriptors
];
let agent_spec = AgentSpec::new("deferred-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("Search for tools only when needed.")
.with_hook_filter("ext-deferred-tools");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin(
"ext-deferred-tools",
Arc::new(DeferredToolsPlugin::new(seed_tools)) as Arc<dyn Plugin>,
)
.build()
.expect("failed to build runtime");
- Configure tool loading rules.
Set rules on the agent spec via the deferred_tools config key. Rules are evaluated in order and first match wins. Tools that match no rule fall back to default_mode.
use awaken_ext_deferred_tools::{
DeferredToolsConfig, DeferralRule, ToolLoadMode,
};
let config = DeferredToolsConfig {
rules: vec![
DeferralRule { tool: "get_weather".into(), mode: ToolLoadMode::Eager },
DeferralRule { tool: "debug_*".into(), mode: ToolLoadMode::Deferred },
],
default_mode: ToolLoadMode::Deferred,
..Default::default()
};
The tool field supports exact names and glob patterns (via wildcard_match). Common patterns:
| Pattern | Matches |
|---|---|
| `get_weather` | Exact tool ID |
| `debug_*` | Any tool starting with `debug_` |
| `mcp__github__*` | All GitHub MCP tools |
- Understand auto-enable.
The enabled field on DeferredToolsConfig controls activation:
| Value | Behavior |
|---|---|
| `Some(true)` | Always enable deferred tools |
| `Some(false)` | Always disable |
| `None` (default) | Auto-enable when the total token savings across deferred tools exceed `beta_overhead` (default 1136 tokens) |
With auto-enable, the plugin estimates per-tool schema cost as len(parameters_json) / 4 tokens and sums the savings for all deferrable tools. If the total exceeds the overhead of maintaining the ToolSearch tool and deferred tool list in context, deferral activates automatically.
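The auto-enable decision reduces to a sum-and-compare over the documented `len(parameters_json) / 4` heuristic. A sketch under that assumption (the crate's exact accounting may differ):

```rust
/// Rough per-tool schema cost from the docs: len(parameters_json) / 4 tokens.
fn schema_cost_tokens(parameters_json: &str) -> f64 {
    parameters_json.len() as f64 / 4.0
}

/// Auto-enable fires when total savings across deferrable tools
/// exceed the fixed overhead of the ToolSearch machinery.
fn should_auto_enable(deferrable_schemas: &[&str], beta_overhead: f64) -> bool {
    let savings: f64 = deferrable_schemas.iter().map(|s| schema_cost_tokens(s)).sum();
    savings > beta_overhead
}

fn main() {
    // Two small schemas (~100 chars each) save only ~50 tokens: stay disabled.
    let small: Vec<String> = vec!["x".repeat(100), "x".repeat(100)];
    let small_refs: Vec<&str> = small.iter().map(|s| s.as_str()).collect();
    assert!(!should_auto_enable(&small_refs, 1136.0));

    // Twenty 400-char schemas save ~2000 tokens > 1136: deferral activates.
    let big: Vec<String> = vec!["x".repeat(400); 20];
    let big_refs: Vec<&str> = big.iter().map(|s| s.as_str()).collect();
    assert!(should_auto_enable(&big_refs, 1136.0));
}
```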
- Understand how ToolSearch works.
The plugin automatically registers a ToolSearch tool. The LLM calls it with a query string to find deferred tools:
| Query format | Behavior |
|---|---|
| `"select:Tool1,Tool2"` | Fetch specific tools by exact ID |
| `"+required rest terms"` | Require a keyword, rank by remaining terms |
| `"plain keywords"` | General keyword search across id, name, description |
When ToolSearch returns results, matched tools are promoted to Eager for the remainder of the session. The tool returns up to max_results (default 5) matching tool schemas in a <functions> block.
LLM: I need to check the weather. Let me search for relevant tools.
-> calls ToolSearch { query: "weather forecast" }
ToolSearch returns matching schemas, promotes get_weather to Eager.
Next inference: get_weather schema is included in the tool list.
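The three query shapes from the table above can be distinguished with a small dispatcher. This mirrors the documented formats only; the crate's actual parser and ranking logic live in `tool_search.rs`:

```rust
/// Illustrative parse of the three documented ToolSearch query shapes.
#[derive(Debug, PartialEq)]
enum Query {
    Select(Vec<String>),                          // "select:Tool1,Tool2"
    Required { must: String, rest: Vec<String> }, // "+required rest terms"
    Keywords(Vec<String>),                        // "plain keywords"
}

fn parse_query(q: &str) -> Query {
    if let Some(list) = q.strip_prefix("select:") {
        // Exact-ID fetch: comma-separated tool IDs.
        return Query::Select(list.split(',').map(|s| s.trim().to_string()).collect());
    }
    if let Some(rest) = q.strip_prefix('+') {
        // First term is required; remaining terms only affect ranking.
        let mut terms = rest.split_whitespace().map(str::to_string);
        let must = terms.next().unwrap_or_default();
        return Query::Required { must, rest: terms.collect() };
    }
    // Plain keyword search across id, name, description.
    Query::Keywords(q.split_whitespace().map(str::to_string).collect())
}

fn main() {
    assert_eq!(
        parse_query("select:get_weather,search_docs"),
        Query::Select(vec!["get_weather".into(), "search_docs".into()])
    );
    assert_eq!(
        parse_query("+weather forecast daily"),
        Query::Required { must: "weather".into(), rest: vec!["forecast".into(), "daily".into()] }
    );
    assert_eq!(
        parse_query("weather forecast"),
        Query::Keywords(vec!["weather".into(), "forecast".into()])
    );
}
```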
- Configure automatic re-deferral (DiscBeta).
The plugin tracks per-tool usage statistics using a discounted Beta distribution. Tools that were promoted via ToolSearch but stop being used get automatically re-deferred. This is adaptive and requires no manual tuning.
Re-deferral triggers when all of the following are true:
- The tool is currently Eager (was promoted from Deferred)
- The tool is not configured as always-Eager in rules
- The tool has been idle for at least
defer_afterturns - The posterior upper credible interval (90%) falls below
breakeven_p * thresh_mult
The breakeven frequency is (c - c_bar) / gamma, where c is the full schema cost and c_bar is the name-only cost. A tool is worth keeping eager only if it is used often enough that the savings from avoiding ToolSearch calls exceed the per-turn overhead.
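Concretely, the breakeven comparison is one line of arithmetic. A sketch with made-up example costs (the crate derives `c` and `c_bar` from actual schema sizes):

```rust
/// Breakeven usage frequency from the docs: (c - c_bar) / gamma, where
/// c = full schema cost, c_bar = name-only cost, gamma = ToolSearch call cost.
/// A tool used more often than this is worth keeping Eager.
fn breakeven_p(c: f64, c_bar: f64, gamma: f64) -> f64 {
    (c - c_bar) / gamma
}

fn main() {
    // Example: a 500-token schema vs a 20-token name, gamma = 2000 tokens.
    let p = breakeven_p(500.0, 20.0, 2000.0);
    assert!((p - 0.24).abs() < 1e-9); // worth keeping Eager above ~24% of turns

    // Re-deferral compares the posterior's 90% upper credible bound
    // against breakeven_p * thresh_mult (0.5 by default):
    let threshold = p * 0.5;
    assert!((threshold - 0.12).abs() < 1e-9);
}
```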
Key parameters in DiscBetaParams:
| Parameter | Default | Purpose |
|---|---|---|
| `omega` | 0.95 | Discount factor per turn. Effective memory is approximately 1/(1-omega) = 20 turns |
| `n0` | 5.0 | Prior strength in equivalent observations |
| `defer_after` | 5 | Minimum idle turns before considering re-deferral |
| `thresh_mult` | 0.5 | Multiplier on breakeven frequency for the deferral threshold |
| `gamma` | 2000.0 | Estimated token cost of a ToolSearch call |
These live under DeferredToolsConfig.disc_beta:
use awaken_ext_deferred_tools::{DeferredToolsConfig, DiscBetaParams};
let config = DeferredToolsConfig {
disc_beta: DiscBetaParams {
omega: 0.95,
n0: 5.0,
defer_after: 5,
thresh_mult: 0.5,
gamma: 2000.0,
},
beta_overhead: 1136.0,
..Default::default()
};
- Enable cross-session learning.
Via AgentToolPriors (a ProfileKey), usage frequencies persist across sessions using EWMA (exponentially weighted moving average). At session end, the PersistPriorsHook writes per-tool presence frequencies to the profile store. At next session start, the LoadPriorsHook reads them back and initializes the Beta distribution with learned priors instead of the default 0.01.
This requires a ProfileStore to be configured on the runtime. No additional code is needed beyond the plugin registration — the hooks are wired automatically.
The EWMA smoothing factor is lambda = max(0.1, 1/(n+1)), where n is the session count. Early sessions contribute equally; after 10 sessions the factor stabilizes at 0.1, giving 90% weight to historical data.
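The smoothing schedule is simple to verify numerically. A sketch of the stated formula (the update step shown is the standard EWMA blend, assumed rather than copied from the crate):

```rust
/// EWMA smoothing factor from the docs: lambda = max(0.1, 1/(n+1)),
/// where n is the number of sessions seen so far.
fn ewma_lambda(n: u32) -> f64 {
    (1.0 / (n as f64 + 1.0)).max(0.1)
}

/// One update step: blend the new session's observed frequency into the prior.
fn ewma_update(prior: f64, observed: f64, n: u32) -> f64 {
    let lambda = ewma_lambda(n);
    lambda * observed + (1.0 - lambda) * prior
}

fn main() {
    assert_eq!(ewma_lambda(0), 1.0);  // first session: take the observation as-is
    assert_eq!(ewma_lambda(4), 0.2);  // fifth session: 1/5
    assert_eq!(ewma_lambda(20), 0.1); // floor reached: 90% weight to history

    // After many sessions, one busy session moves the prior only slightly.
    let updated = ewma_update(0.05, 0.8, 50);
    assert!((updated - 0.125).abs() < 1e-9); // 0.1 * 0.8 + 0.9 * 0.05
}
```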
Verify
- Run the agent and trigger an inference. Check logs for the `deferred_tools.list` context message, which lists all deferred tool names.
- Read `DeferralState` from the runtime snapshot to see the current mode of each tool:
use awaken_ext_deferred_tools::state::{DeferralState, DeferralStateValue};
let state: &DeferralStateValue = snapshot.state::<DeferralState>()
.expect("DeferralState not found");
for (tool_id, mode) in &state.modes {
println!("{tool_id}: {mode:?}");
}
- Ask the LLM a question that requires a deferred tool. Confirm `ToolSearch` is called and the tool is promoted to Eager in subsequent turns.
- After several turns of inactivity, verify re-deferral by checking that the tool reverts to `Deferred` mode in the snapshot.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| All tools sent to LLM (no deferral) | enabled: Some(false) or total savings below beta_overhead | Set enabled: Some(true) or add more tools so savings exceed overhead |
| Plugin registers but no tools deferred | All rules resolve to Eager | Set default_mode: ToolLoadMode::Deferred or add Deferred rules |
| ToolSearch not available to LLM | Plugin not registered | Register DeferredToolsPlugin with seed tool descriptors |
| Tools never re-deferred | defer_after too high or tool usage is frequent | Lower defer_after or increase thresh_mult |
| Cross-session priors not loading | No ProfileStore configured | Wire a profile store into the runtime |
| ToolSearch returns no results | Tool not in deferred registry | Check that the tool was in the seed_tools list passed to the plugin |
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-deferred-tools/src/lib.rs` | Module root and public re-exports |
| `crates/awaken-ext-deferred-tools/src/config.rs` | `DeferredToolsConfig`, `DeferralRule`, `ToolLoadMode`, `DiscBetaParams` |
| `crates/awaken-ext-deferred-tools/src/plugin/plugin.rs` | `DeferredToolsPlugin` registration |
| `crates/awaken-ext-deferred-tools/src/plugin/hooks.rs` | Phase hooks (BeforeInference, AfterToolExecute, AfterInference, RunStart, RunEnd) |
| `crates/awaken-ext-deferred-tools/src/tool_search.rs` | `ToolSearchTool` implementation and query parsing |
| `crates/awaken-ext-deferred-tools/src/policy.rs` | `ConfigOnlyPolicy` and `DiscBetaEvaluator` |
| `crates/awaken-ext-deferred-tools/src/state.rs` | State keys: `DeferralState`, `DeferralRegistry`, `DiscBetaState`, `ToolUsageStats`, `AgentToolPriors` |
Related
Serve & Integrate
This path is for turning a local runtime into something other systems can call.
Start here
- Expose HTTP SSE to put the runtime behind HTTP and streaming endpoints.
- Integrate AI SDK Frontend for React clients that speak AI SDK v6.
- Integrate CopilotKit (AG-UI) for CopilotKit frontends.
Reference pages to pair with this section
Expose HTTP with SSE
Use this when you need to serve agents over HTTP with Server-Sent Events streaming, supporting multiple protocol adapters (AI SDK, AG-UI, A2A).
Prerequisites
- `awaken` crate with the `server` feature enabled
- `tokio` with `rt-multi-thread` and `signal` features
- A built `AgentRuntime`
Steps
- Add the dependency.
[dependencies]
awaken = { package = "awaken-agent", version = "...", features = ["server"] }
tokio = { version = "1", features = ["rt-multi-thread", "macros", "signal"] }
- Build the runtime.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::{AgentRuntimeBuilder, AgentSpec, ModelSpec};
use awaken::stores::InMemoryStore;
let store = Arc::new(InMemoryStore::new());
let runtime = AgentRuntimeBuilder::new()
.with_agent_spec(
AgentSpec::new("assistant")
.with_model("claude-sonnet")
.with_system_prompt("You are a helpful assistant."),
)
.with_tool("search", Arc::new(SearchTool))
.with_provider("anthropic", Arc::new(GenaiExecutor::new()))
.with_model("claude-sonnet", ModelSpec {
id: "claude-sonnet".into(),
provider: "anthropic".into(),
model: "claude-sonnet-4-20250514".into(),
})
.with_thread_run_store(store.clone())
.build()?;
let runtime = Arc::new(runtime);
- Create the application state.
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::stores::InMemoryMailboxStore;
let mailbox_store = Arc::new(InMemoryMailboxStore::new());
let mailbox = Arc::new(Mailbox::new(
runtime.clone(),
mailbox_store,
"default-consumer".to_string(),
MailboxConfig::default(),
));
let state = AppState::new(
runtime.clone(),
mailbox,
store,
runtime.resolver_arc(),
ServerConfig::default(),
);
ServerConfig::default() binds to 0.0.0.0:3000 with an SSE buffer size of 64.
- Build the router.
use awaken::server::routes::build_router;
let app = build_router().with_state(state);
build_router registers all route groups:
- `/health` – health check
- `/v1/threads` – thread CRUD and messaging
- `/v1/runs` and `/v1/threads/:id/runs` – run APIs
- `/v1/config/*` and `/v1/capabilities` – config and capabilities APIs
- Protocol adapters: AI SDK v6, AG-UI, A2A, MCP
- Start the server.
let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
axum::serve(listener, app).await?;
- Configure SSE buffer size.
let config = ServerConfig {
address: "0.0.0.0:8080".into(),
sse_buffer_size: 128,
..ServerConfig::default()
};
Increase sse_buffer_size if clients consume events slowly and you observe dropped messages.
Verify
curl http://localhost:3000/health
Should return 200 OK. Then create a thread and run:
curl -X POST http://localhost:3000/v1/threads \
-H "Content-Type: application/json" \
-d '{}'
Common Errors
| Error | Cause | Fix |
|---|---|---|
| Address already in use | Port 3000 is occupied | Change the bind address in ServerConfig or TcpListener::bind |
| SSE stream closes immediately | Client does not support text/event-stream | Use curl with --no-buffer or an SSE-compatible client |
| Missing protocol routes | Feature flags not enabled | Ensure server feature is enabled on the awaken crate |
Related Example
crates/awaken-server/tests/run_api.rs – integration tests demonstrating thread creation, run execution, and SSE streaming.
Key Files
- `crates/awaken-server/src/app.rs` – `AppState`, `ServerConfig`
- `crates/awaken-server/src/routes.rs` – `build_router` and route definitions
- `crates/awaken-server/src/http_sse.rs` – SSE response helpers
- `crates/awaken-server/src/mailbox.rs` – `Mailbox` run queue
Related
Integrate AI SDK Frontend
Use this when you have a Vercel AI SDK (v6) React frontend and need to connect it to an awaken agent server.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `server` enabled on the `awaken` crate
- Node.js project with `@ai-sdk/react` installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"
Steps
- Build the backend server.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;
#[tokio::main]
async fn main() {
tracing_subscriber::fmt().with_target(true).init();
let agent_spec = AgentSpec::new("my-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant.")
.with_max_rounds(10);
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.build()
.expect("failed to build runtime");
let runtime = Arc::new(runtime);
let store = Arc::new(InMemoryStore::new());
let resolver = runtime.resolver_arc();
let mailbox_store = Arc::new(InMemoryMailboxStore::new());
let mailbox = Arc::new(Mailbox::new(
runtime.clone(),
mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
format!("ai-sdk:{}", std::process::id()),
MailboxConfig::default(),
));
let state = AppState::new(
runtime,
mailbox,
store as Arc<dyn ThreadRunStore>,
resolver,
ServerConfig {
address: "127.0.0.1:3000".into(),
..Default::default()
},
);
let app = build_router().with_state(state);
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("failed to bind");
axum::serve(listener, app).await.expect("server crashed");
}
The server automatically registers AI SDK v6 routes at:
- `POST /v1/ai-sdk/chat` – create a new run and stream events
- `GET /v1/ai-sdk/chat/:thread_id/stream` – resume an existing stream by thread ID
- `GET /v1/ai-sdk/threads/:thread_id/stream` – alias for thread-based resume
- `GET /v1/ai-sdk/threads/:id/messages` – retrieve thread messages
- Connect the React frontend.
Install the AI SDK React package:
npm install ai @ai-sdk/react
Use the useChat hook pointed at your awaken server:
import { useChat } from "@ai-sdk/react";
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit } = useChat({
api: "http://localhost:3000/v1/ai-sdk/chat",
id: "thread-1",
});
return (
<div>
{messages.map((m) => (
<div key={m.id}>
<strong>{m.role}:</strong> {m.content}
</div>
))}
<form onSubmit={handleSubmit}>
<input value={input} onChange={handleInputChange} />
<button type="submit">Send</button>
</form>
</div>
);
}
- Run both sides.
# Terminal 1: backend
cargo run
# Terminal 2: frontend
npm run dev
Verify
- Open the frontend in a browser.
- Send a message.
- Confirm that streaming text appears incrementally.
- Check the backend logs for `RunStart` and `RunFinish` events.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| `useChat` receives no events | Wrong endpoint URL | Confirm the `api` prop points to `/v1/ai-sdk/chat` |
| Stream closed unexpectedly | SSE buffer overflow | Increase `sse_buffer_size` in `ServerConfig` |
| 404 on `/v1/ai-sdk/chat` | Missing `server` feature | Enable `features = ["server"]` in `Cargo.toml` |
Related Example
examples/ai-sdk-starter/agent/src/main.rs
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-server/src/protocols/ai_sdk_v6/http.rs` | AI SDK v6 route handlers |
| `crates/awaken-server/src/protocols/ai_sdk_v6/encoder.rs` | AI SDK v6 SSE event encoder |
| `crates/awaken-server/src/routes.rs` | Unified router builder |
| `crates/awaken-server/src/app.rs` | `AppState` and `ServerConfig` |
| `examples/ai-sdk-starter/agent/src/main.rs` | Backend entry for the AI SDK starter |
Related
Integrate CopilotKit (AG-UI)
Use this when you have a CopilotKit React frontend and need to connect it to an awaken agent server via the AG-UI protocol.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `server` enabled on the `awaken` crate
- Node.js project with `@copilotkit/react-core` and `@copilotkit/react-ui` installed
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["server"] }
tokio = { version = "1", features = ["full"] }
async-trait = "0.1"
serde_json = "1"
tracing-subscriber = "0.3"
Steps
- Build the backend server.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::contract::storage::ThreadRunStore;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::stores::{InMemoryMailboxStore, InMemoryStore};
use awaken::AgentRuntimeBuilder;
use awaken::server::app::{AppState, ServerConfig};
use awaken::server::mailbox::{Mailbox, MailboxConfig};
use awaken::server::routes::build_router;
#[tokio::main]
async fn main() {
tracing_subscriber::fmt().with_target(true).init();
let agent_spec = AgentSpec::new("copilotkit-agent")
.with_model("gpt-4o-mini")
.with_system_prompt(
"You are a CopilotKit-powered assistant. \
Update shared state and suggest actions when appropriate.",
)
.with_max_rounds(10);
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.build()
.expect("failed to build runtime");
let runtime = Arc::new(runtime);
let store = Arc::new(InMemoryStore::new());
let resolver = runtime.resolver_arc();
let mailbox_store = Arc::new(InMemoryMailboxStore::new());
let mailbox = Arc::new(Mailbox::new(
runtime.clone(),
mailbox_store as Arc<dyn awaken::contract::MailboxStore>,
format!("copilotkit:{}", std::process::id()),
MailboxConfig::default(),
));
let state = AppState::new(
runtime,
mailbox,
store as Arc<dyn ThreadRunStore>,
resolver,
ServerConfig {
address: "127.0.0.1:3000".into(),
..Default::default()
},
);
let app = build_router().with_state(state);
let listener = tokio::net::TcpListener::bind("127.0.0.1:3000")
.await
.expect("failed to bind");
axum::serve(listener, app).await.expect("server crashed");
}
The server automatically registers AG-UI routes at:
- `POST /v1/ag-ui/run` – create a new run and stream AG-UI events
- `POST /v1/ag-ui/threads/:thread_id/runs` – start a thread-scoped run
- `POST /v1/ag-ui/agents/:agent_id/runs` – start an agent-scoped run
- `POST /v1/ag-ui/threads/:thread_id/interrupt` – interrupt a running thread
- `GET /v1/ag-ui/threads/:id/messages` – retrieve thread messages
- Connect the CopilotKit frontend.
Install CopilotKit packages:
npm install @copilotkit/react-core @copilotkit/react-ui
Wrap your app with the CopilotKit provider:
import { CopilotKit } from "@copilotkit/react-core";
import { CopilotChat } from "@copilotkit/react-ui";
import "@copilotkit/react-ui/styles.css";
export default function App() {
return (
<CopilotKit runtimeUrl="http://localhost:3000/v1/ag-ui">
<CopilotChat
labels={{ title: "Agent", initial: "How can I help?" }}
/>
</CopilotKit>
);
}
- Run both sides.
# Terminal 1: backend
cargo run
# Terminal 2: frontend
npm run dev
Verify
- Open the frontend in a browser.
- Send a message through the CopilotChat widget.
- Confirm streaming text appears in the chat UI.
- Check the backend logs for `RunStart` and `RunFinish` events.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| CORS error in browser | No CORS middleware | Add tower-http CORS layer to the axum router |
| CopilotKit shows “connection failed” | Wrong runtimeUrl | Confirm it points to http://localhost:3000/v1/ag-ui |
| Events arrive but UI does not update | AG-UI event format mismatch | Ensure CopilotKit version is compatible with AG-UI protocol |
| 404 on `/v1/ag-ui/run` | Missing `server` feature | Enable `features = ["server"]` in `Cargo.toml` |
Related Example
examples/copilotkit-starter/agent/src/main.rs
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-server/src/protocols/ag_ui/http.rs` | AG-UI route handlers |
| `crates/awaken-server/src/protocols/ag_ui/encoder.rs` | AG-UI SSE event encoder |
| `crates/awaken-server/src/routes.rs` | Unified router builder |
| `crates/awaken-server/src/app.rs` | `AppState` and `ServerConfig` |
| `examples/copilotkit-starter/agent/src/main.rs` | Backend entry for the CopilotKit starter |
Related
State & Storage
This path is for teams moving beyond stateless demos.
Use this section to decide
- where thread and run data should live
- how state is keyed and merged
- how much context should reach the model each turn
Recommended order
- Use File Store or Use Postgres Store to choose a persistence backend.
- State Keys and Thread Model to understand state layout and lifecycle.
- Optimize Context Window when context size starts to matter.
Related internals
Use File Store
Use this when you need file-based persistence for threads, runs, and messages without an external database.
Prerequisites
- `awaken-stores` crate with the `file` feature enabled
Steps
- Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["file"] }
Or, if using the awaken facade crate (which re-exports awaken-stores), add awaken-stores directly for the feature flag:
[dependencies]
awaken = { package = "awaken-agent", version = "..." }
awaken-stores = { version = "...", features = ["file"] }
- Create a FileStore.
use std::sync::Arc;
use awaken::stores::FileStore;
let store = Arc::new(FileStore::new("./data"));
The directory is created automatically on first write. The layout is:
./data/
threads/<thread_id>.json
messages/<thread_id>.json
runs/<run_id>.json
- Wire it into the runtime.
use awaken::AgentRuntimeBuilder;
let runtime = AgentRuntimeBuilder::new()
.with_thread_run_store(store)
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
- Use an absolute path for production.
use std::path::PathBuf;
let data_dir = PathBuf::from("/var/lib/myapp/awaken");
let store = Arc::new(FileStore::new(data_dir));
Verify
Run the agent, then inspect the data directory. You should see JSON files under threads/, messages/, and runs/ corresponding to the thread and run IDs used.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| `StorageError::Io` | Permission denied on the data directory | Ensure the process has read/write access to the path |
| `StorageError::Io` with empty ID | Thread or run ID contains invalid characters (`/`, `\`, `..`) | Use simple alphanumeric or UUID-style IDs |
| Missing data after restart | Using a relative path that resolved differently | Use an absolute path |
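The invalid-ID row above can be enforced defensively before an ID ever reaches the store. A minimal sketch, with our own hypothetical helper name (this is not FileStore's actual validation):

```rust
/// Reject IDs that would escape the FileStore layout when used as a
/// file name. Illustrative helper, not part of the awaken API.
fn is_safe_id(id: &str) -> bool {
    !id.is_empty()
        && !id.contains('/')
        && !id.contains('\\')
        && !id.contains("..")
}

fn main() {
    assert!(is_safe_id("0192d7e2-5b9e-7cc3-a1f2-3e4d5c6b7a80")); // UUID-style: fine
    assert!(!is_safe_id("../etc/passwd")); // path traversal: rejected
    assert!(!is_safe_id("a/b")); // separator: rejected
}
```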
Related Example
crates/awaken-stores/src/file.rs – FileStore implementation with filesystem layout details.
Key Files
- `crates/awaken-stores/Cargo.toml` – feature flag definition
- `crates/awaken-stores/src/file.rs` – `FileStore`
- `crates/awaken-stores/src/lib.rs` – conditional re-export
Related
Use Postgres Store
Use this when you need durable, multi-instance persistence backed by PostgreSQL.
Prerequisites
- `awaken-stores` crate with the `postgres` feature enabled
- A running PostgreSQL instance
- `sqlx` runtime dependencies (tokio)
Steps
- Add the dependency.
[dependencies]
awaken-stores = { version = "...", features = ["postgres"] }
- Create a connection pool.
use sqlx::PgPool;
let pool = PgPool::connect("postgres://user:pass@localhost:5432/mydb").await?;
- Create a PostgresStore.
use std::sync::Arc;
use awaken::stores::PostgresStore;
let store = Arc::new(PostgresStore::new(pool));
This uses default table names: awaken_threads, awaken_runs, awaken_messages.
- Use a custom table prefix.
let store = Arc::new(PostgresStore::with_prefix(pool, "myapp"));
This creates tables named myapp_threads, myapp_runs, myapp_messages.
- Wire it into the runtime.
use awaken::AgentRuntimeBuilder;
let runtime = AgentRuntimeBuilder::new()
.with_thread_run_store(store)
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
- Schema creation.
Tables are auto-created on first access via `ensure_schema()`. Each table uses:
  - `id TEXT PRIMARY KEY`
  - `data JSONB NOT NULL`
  - `updated_at TIMESTAMPTZ NOT NULL DEFAULT now()`
No manual migration is required for initial setup.
Verify
After running the agent, query the database:
SELECT id, updated_at FROM awaken_threads;
SELECT id, updated_at FROM awaken_runs;
You should see rows corresponding to the threads and runs created during execution.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| `sqlx::Error` connection refused | PostgreSQL is not running or the connection string is wrong | Verify the DATABASE_URL and that the database is accepting connections |
| `StorageError` on first write | Insufficient database privileges | Grant CREATE TABLE and INSERT permissions to the database user |
| Table name collision | Another application uses the same default table names | Use PostgresStore::with_prefix to namespace tables |
Related Example
crates/awaken-stores/src/postgres.rs – PostgresStore implementation with schema auto-creation.
Key Files
- `crates/awaken-stores/Cargo.toml` – `postgres` feature flag and `sqlx` dependency
- `crates/awaken-stores/src/postgres.rs` – `PostgresStore`
- `crates/awaken-stores/src/lib.rs` – conditional re-export
Related
Optimize the Context Window
Use this when you need to control how the runtime manages conversation history to stay within model token limits.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- An agent configured with `AgentSpec`
ContextWindowPolicy
Every agent has a ContextWindowPolicy that controls how conversation history is managed. Set it on your AgentSpec:
use awaken::{ContextCompactionMode, ContextWindowPolicy};
let policy = ContextWindowPolicy {
max_context_tokens: 200_000,
max_output_tokens: 16_384,
min_recent_messages: 10,
enable_prompt_cache: true,
autocompact_threshold: Some(100_000),
compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
compaction_raw_suffix_messages: 2,
};
Fields
| Field | Type | Default | Description |
|---|---|---|---|
| `max_context_tokens` | `usize` | `200_000` | Model’s total context window size in tokens |
| `max_output_tokens` | `usize` | `16_384` | Tokens reserved for model output |
| `min_recent_messages` | `usize` | `10` | Minimum number of recent messages to always preserve, even if over budget |
| `enable_prompt_cache` | `bool` | `true` | Whether to enable prompt caching |
| `autocompact_threshold` | `Option<usize>` | `None` | Token count that triggers auto-compaction; `None` disables auto-compaction |
| `compaction_mode` | `ContextCompactionMode` | `KeepRecentRawSuffix` | Strategy used when auto-compaction fires |
| `compaction_raw_suffix_messages` | `usize` | `2` | Number of recent raw messages to preserve in suffix compaction mode |
Truncation
When the conversation exceeds the available token budget, the runtime automatically drops the oldest messages to fit. The budget is calculated as:
available = max_context_tokens - max_output_tokens - tool_schema_tokens
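The budget arithmetic, sketched in Rust (the saturating subtraction is our own defensive choice; the runtime's exact code may differ):

```rust
/// Token budget left for conversation history, per the formula above.
/// Saturating arithmetic keeps a misconfigured policy from underflowing.
fn available_budget(
    max_context_tokens: usize,
    max_output_tokens: usize,
    tool_schema_tokens: usize,
) -> usize {
    max_context_tokens
        .saturating_sub(max_output_tokens)
        .saturating_sub(tool_schema_tokens)
}

fn main() {
    // Defaults (200k window, 16_384 reserved output) plus a hypothetical
    // 2_000 tokens of registered tool schemas:
    println!("{}", available_budget(200_000, 16_384, 2_000)); // 181616
}
```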
What truncation preserves
- System messages are never truncated. All system messages at the start of the history survive regardless of budget.
- Recent messages – at least `min_recent_messages` history messages are kept, even if they exceed the budget.
- Tool call/result pairs – the split point is adjusted so that an assistant message with tool calls is never separated from its corresponding tool result messages.
- Dangling tool calls – after truncation, any orphaned tool calls (whose results were dropped) are patched to prevent invalid message sequences.
Artifact compaction
Before truncation runs, oversized tool results are compacted automatically. A tool result whose text exceeds ARTIFACT_COMPACT_THRESHOLD_TOKENS (2048 tokens, estimated at ~8192 characters) is truncated to a preview of at most 1600 characters or 24 lines, whichever is shorter. The preview includes a compaction indicator showing the original size.
Non-tool messages (system, user, assistant) are never subject to artifact compaction.
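A sketch of that preview rule using the limits quoted above; the function name and the indicator wording are ours, not the runtime's:

```rust
/// Preview caps from the text: at most 1600 chars or 24 lines,
/// whichever is shorter. Illustrative only.
const PREVIEW_MAX_CHARS: usize = 1600;
const PREVIEW_MAX_LINES: usize = 24;

fn compact_preview(text: &str) -> String {
    // Apply the line cap first, then the character cap; whichever
    // bites first determines the preview length.
    let kept_lines: Vec<&str> = text.lines().take(PREVIEW_MAX_LINES).collect();
    let joined = kept_lines.join("\n");
    let preview: String = joined.chars().take(PREVIEW_MAX_CHARS).collect();
    // Append a compaction indicator showing the original size.
    format!("{preview}\n[compacted: original was {} chars]", text.len())
}

fn main() {
    let long = "x".repeat(5_000);
    let out = compact_preview(&long);
    assert_eq!(out.lines().next().unwrap().len(), PREVIEW_MAX_CHARS);
    assert!(out.ends_with("original was 5000 chars]"));
}
```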
Compaction
Compaction summarizes older conversation history into a condensed summary message, reducing token usage while preserving context. Unlike truncation (which drops messages), compaction replaces them with a summary.
Enabling auto-compaction
Set autocompact_threshold to trigger compaction when total message tokens exceed that value:
let policy = ContextWindowPolicy {
autocompact_threshold: Some(100_000),
compaction_mode: ContextCompactionMode::KeepRecentRawSuffix,
compaction_raw_suffix_messages: 4,
..Default::default()
};
ContextCompactionMode
Two strategies are available:
- `KeepRecentRawSuffix` (default) – keeps the most recent `compaction_raw_suffix_messages` messages as raw history. Everything before the compaction boundary is summarized.
- `CompactToSafeFrontier` – compacts all messages up to the safe frontier (the latest point where all tool call/result pairs are complete).
The compaction boundary is chosen so that no tool call is separated from its result. The boundary finder walks the message history and only places boundaries where all open tool calls have been resolved.
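The boundary rule can be sketched with a toy message type: walk the history, track open tool calls, and mark only the positions where none remain unresolved. `Msg` and `safe_boundaries` are illustrative stand-ins, not the runtime's types:

```rust
use std::collections::HashSet;

/// Toy stand-in for the runtime's message type; only tool bookkeeping matters.
enum Msg {
    AssistantToolCalls { call_ids: Vec<String> },
    ToolResult { call_id: String },
    Other,
}

/// Indices after which a compaction boundary is safe: every tool call
/// opened so far has already seen its result.
fn safe_boundaries(history: &[Msg]) -> Vec<usize> {
    let mut open: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for (i, msg) in history.iter().enumerate() {
        match msg {
            Msg::AssistantToolCalls { call_ids } => {
                open.extend(call_ids.iter().map(String::as_str));
            }
            Msg::ToolResult { call_id } => {
                open.remove(call_id.as_str());
            }
            Msg::Other => {}
        }
        if open.is_empty() {
            out.push(i + 1); // safe to cut after message i
        }
    }
    out
}

fn main() {
    let history = vec![
        Msg::Other,
        Msg::AssistantToolCalls { call_ids: vec!["call_1".into()] },
        Msg::ToolResult { call_id: "call_1".into() },
        Msg::Other,
    ];
    // No boundary may fall between the tool call and its result.
    assert_eq!(safe_boundaries(&history), vec![1, 3, 4]);
}
```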
CompactionConfig
The compaction subsystem is configured through CompactionConfig, stored in the agent spec’s sections["compaction"] and read via CompactionConfigKey:
use awaken::CompactionConfig;
let config = CompactionConfig {
summarizer_system_prompt: "You are a conversation summarizer. \
Preserve all key facts, decisions, tool results, and action items. \
Be concise but complete.".into(),
summarizer_user_prompt: "Summarize the following conversation:\n\n{messages}".into(),
summary_max_tokens: Some(1024),
summary_model: Some("claude-3-haiku".into()),
min_savings_ratio: 0.3,
};
| Field | Type | Default | Description |
|---|---|---|---|
| `summarizer_system_prompt` | `String` | Conversation summarizer prompt | System prompt for the summarizer LLM call |
| `summarizer_user_prompt` | `String` | `"Summarize...\n\n{messages}"` | User prompt template; `{messages}` is replaced with the conversation transcript |
| `summary_max_tokens` | `Option<u32>` | `None` | Maximum tokens for the summary response |
| `summary_model` | `Option<String>` | `None` | Model for summarization (defaults to the agent’s model) |
| `min_savings_ratio` | `f64` | `0.3` | Minimum token savings ratio (0.0–1.0) required to accept a compaction |
The compaction pass only runs when the expected savings ratio exceeds min_savings_ratio. A minimum gain of 1024 tokens (MIN_COMPACTION_GAIN_TOKENS) is also required to justify the summarization LLM call.
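Both gates together, as a sketch (`should_compact` is our own name; the constants are the values quoted above, and the runtime's actual comparison may differ in strictness):

```rust
/// Absolute floor from the text: the summarization LLM call must pay
/// for itself in saved tokens.
const MIN_COMPACTION_GAIN_TOKENS: u64 = 1024;

fn should_compact(tokens_before: u64, expected_after: u64, min_savings_ratio: f64) -> bool {
    let gain = tokens_before.saturating_sub(expected_after);
    // Relative savings must clear min_savings_ratio AND the absolute
    // gain must clear MIN_COMPACTION_GAIN_TOKENS.
    let ratio = gain as f64 / tokens_before.max(1) as f64;
    ratio >= min_savings_ratio && gain >= MIN_COMPACTION_GAIN_TOKENS
}

fn main() {
    assert!(should_compact(10_000, 5_000, 0.3)); // 50% savings: compact
    assert!(!should_compact(10_000, 9_500, 0.3)); // 5% savings: skip
    assert!(!should_compact(2_000, 1_100, 0.3)); // 45%, but only 900 tokens: skip
}
```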
DefaultSummarizer
The built-in DefaultSummarizer reads prompts from CompactionConfig and supports cumulative summarization. When a previous summary exists, it asks the LLM to update the existing summary with new conversation rather than re-summarizing everything from scratch.
The transcript renderer filters out Visibility::Internal messages before sending to the summarizer, since system-injected context is re-injected each turn and should not be included in summaries.
Summary storage
Compaction summaries are stored as <conversation-summary> tagged internal system messages. On load, trim_to_compaction_boundary drops all messages before the latest summary message, so already-summarized history is never re-loaded into the context window.
Compaction boundaries are tracked durably via CompactionState, recording the summary text, pre/post token counts, and timestamp for each compaction event.
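The load-time trim can be sketched over plain strings standing in for messages; the real check matches the tagged internal system message rather than a string prefix:

```rust
/// Drop everything before the latest <conversation-summary> message,
/// keeping the summary itself. Illustrative stand-in for
/// trim_to_compaction_boundary.
fn trim_to_latest_summary(messages: Vec<String>) -> Vec<String> {
    match messages
        .iter()
        .rposition(|m| m.starts_with("<conversation-summary>"))
    {
        Some(i) => messages[i..].to_vec(), // summary plus everything after it
        None => messages,                  // nothing to trim
    }
}

fn main() {
    let msgs = vec![
        "user: hi".to_string(),
        "<conversation-summary> old summary".to_string(),
        "user: more".to_string(),
        "<conversation-summary> newer summary".to_string(),
        "user: latest".to_string(),
    ];
    // Only the latest summary and subsequent messages survive the load.
    assert_eq!(
        trim_to_latest_summary(msgs),
        vec![
            "<conversation-summary> newer summary".to_string(),
            "user: latest".to_string(),
        ]
    );
}
```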
Truncation recovery
When the LLM stops due to MaxTokens with incomplete tool calls (argument JSON was truncated mid-generation), the runtime can automatically retry by injecting a continuation prompt asking the model to break its work into smaller pieces and continue. The retry count is tracked by TruncationState and bounded by a configurable maximum.
Key Files
- `crates/awaken-contract/src/contract/inference.rs` – `ContextWindowPolicy`, `ContextCompactionMode`
- `crates/awaken-runtime/src/context/transform/mod.rs` – `ContextTransform` (truncation)
- `crates/awaken-runtime/src/context/transform/compaction.rs` – artifact compaction
- `crates/awaken-runtime/src/context/compaction.rs` – boundary finding, load-time trimming
- `crates/awaken-runtime/src/context/summarizer.rs` – `ContextSummarizer`, `DefaultSummarizer`
- `crates/awaken-runtime/src/context/plugin.rs` – `CompactionPlugin`, `CompactionConfig`, `CompactionState`
- `crates/awaken-runtime/src/context/truncation.rs` – `TruncationState`, continuation prompts
Related
State Keys
The state system provides typed, scoped, persistent key-value storage for agent runs. Plugins and tools define state keys at compile time; the runtime manages snapshots, persistence, and parallel merge semantics.
StateKey trait
Every state slot is identified by a type implementing StateKey.
pub trait StateKey: 'static + Send + Sync {
/// Unique string identifier for serialization.
const KEY: &'static str;
/// Parallel merge strategy. Default: `Exclusive`.
const MERGE: MergeStrategy = MergeStrategy::Exclusive;
/// Lifetime scope. Default: `Run`.
const SCOPE: KeyScope = KeyScope::Run;
/// The stored value type.
type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
/// The update command type.
type Update: Send + 'static;
/// Apply an update to the current value.
fn apply(value: &mut Self::Value, update: Self::Update);
/// Serialize value to JSON. Default uses serde_json.
fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;
/// Deserialize value from JSON. Default uses serde_json.
fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}
Crate path: awaken::state::StateKey (re-exported at awaken::StateKey)
Example
struct Counter;
impl StateKey for Counter {
const KEY: &'static str = "counter";
type Value = usize;
type Update = usize;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value += update;
}
}
KeyScope
Controls when a key’s value is cleared relative to run boundaries.
pub enum KeyScope {
/// Cleared at run start (default).
Run,
/// Persists across runs on the same thread.
Thread,
}
MergeStrategy
Determines how concurrent updates to the same key are handled when merging
MutationBatches from parallel tool execution.
pub enum MergeStrategy {
/// Concurrent writes to this key conflict. Parallel batches that both
/// touch this key cannot be merged.
Exclusive,
/// Updates are commutative -- they can be applied in any order. Parallel
/// batches that both touch this key will have their ops concatenated.
Commutative,
}
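A self-contained sketch of these semantics, modeling a batch as a map from key name to a list of pending ops (the real MutationBatch is typed; every name here is ours):

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq)]
enum MergeStrategy {
    Exclusive,
    Commutative,
}

/// Merge batch `b` into batch `a` from a parallel branch. Exclusive keys
/// touched by both branches conflict; commutative keys concatenate ops.
fn merge_batches(
    mut a: HashMap<&'static str, Vec<i64>>,
    b: HashMap<&'static str, Vec<i64>>,
    strategy_of: impl Fn(&str) -> MergeStrategy,
) -> Result<HashMap<&'static str, Vec<i64>>, String> {
    for (key, ops) in b {
        match a.entry(key) {
            Entry::Occupied(mut e) => {
                if strategy_of(key) == MergeStrategy::Exclusive {
                    // Both parallel branches wrote an exclusive key.
                    return Err(format!("conflict on key `{key}`"));
                }
                // Commutative: order does not matter, so concatenate.
                e.get_mut().extend(ops);
            }
            Entry::Vacant(e) => {
                e.insert(ops);
            }
        }
    }
    Ok(a)
}

fn main() {
    let strategy = |k: &str| match k {
        "counter" => MergeStrategy::Commutative,
        _ => MergeStrategy::Exclusive,
    };
    let a = HashMap::from([("counter", vec![1])]);
    let b = HashMap::from([("counter", vec![2])]);
    assert_eq!(merge_batches(a, b, &strategy).unwrap()["counter"], vec![1, 2]);

    let c = HashMap::from([("doc", vec![1])]);
    let d = HashMap::from([("doc", vec![2])]);
    assert!(merge_batches(c, d, &strategy).is_err());
}
```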
StateMap
A type-erased container for state values. Backed by TypedMap from the
typedmap crate.
pub struct StateMap { /* ... */ }
Methods
/// Check if a key is present.
fn contains<K: StateKey>(&self) -> bool
/// Get a reference to the value.
fn get<K: StateKey>(&self) -> Option<&K::Value>
/// Get a mutable reference to the value.
fn get_mut<K: StateKey>(&mut self) -> Option<&mut K::Value>
/// Insert a value, replacing any existing one.
fn insert<K: StateKey>(&mut self, value: K::Value)
/// Remove and return the value.
fn remove<K: StateKey>(&mut self) -> Option<K::Value>
/// Get a mutable reference, inserting `Default::default()` if absent.
fn get_or_insert_default<K: StateKey>(&mut self) -> &mut K::Value
StateMap implements Clone and Default.
Snapshot
An immutable, versioned view of the state map. Passed to tools via
ToolCallContext.
pub struct Snapshot {
pub revision: u64,
pub ext: Arc<StateMap>,
}
Methods
fn new(revision: u64, ext: Arc<StateMap>) -> Self
fn revision(&self) -> u64
fn get<K: StateKey>(&self) -> Option<&K::Value>
fn ext(&self) -> &StateMap
Note: Snapshot is not a type alias; it is a concrete struct that wraps Arc<StateMap>.
StateKeyOptions
Declarative options used when registering a state key with the runtime.
pub struct StateKeyOptions {
/// Whether the key is persisted to the store. Default: `true`.
pub persistent: bool,
/// Whether the key survives plugin uninstall. Default: `false`.
pub retain_on_uninstall: bool,
/// Lifetime scope. Default: `KeyScope::Run`.
pub scope: KeyScope,
}
PersistedState
Serialized form of the state map used by storage backends.
pub struct PersistedState {
pub revision: u64,
pub extensions: HashMap<String, JsonValue>,
}
Shared State (ProfileKey + StateScope)
For cross-boundary persistent state with dynamic scoping. Shared state uses ProfileKey –
the same trait used for profile data – combined with a key: &str parameter for the runtime
key dimension. Unlike StateKey, shared state is accessed asynchronously through ProfileAccess
and does not participate in the snapshot/mutation workflow.
pub trait ProfileKey: 'static + Send + Sync {
/// Namespace identifier (used as the storage namespace).
const KEY: &'static str;
/// The value type stored under this key.
type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
/// Serialize value to JSON.
fn encode(value: &Self::Value) -> Result<JsonValue, StateError>;
/// Deserialize value from JSON.
fn decode(value: JsonValue) -> Result<Self::Value, StateError>;
}
The two dimensions for both shared and profile state are:
- Namespace (`ProfileKey::KEY`) – compile-time, binds to a `Value` type
- Key (`&str` parameter) – runtime, identifies which instance
For shared state, the key is typically a StateScope string; for profile state, it is an agent name or "system".
Crate path: awaken_contract::ProfileKey (re-exported from awaken_contract::contract::profile_store)
StateScope
pub struct StateScope(String);
Optional convenience builder for common key string patterns. Constructors:
| Method | Produced String | Use Case |
|---|---|---|
| `StateScope::global()` | `"global"` | System-wide shared state |
| `StateScope::parent_thread(id)` | `"parent_thread::{id}"` | Parent-child agent sharing |
| `StateScope::agent_type(name)` | `"agent_type::{name}"` | Per-agent-type sharing |
| `StateScope::thread(id)` | `"thread::{id}"` | Thread-local persistent state |
| `StateScope::new(s)` | `"{s}"` | Arbitrary grouping |
Call .as_str() to get the key string. Users can also pass any raw &str directly.
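The constructor patterns reduce to plain string formatting. A sketch as free functions (the real API is methods on StateScope returning the wrapper type):

```rust
/// Key-string patterns from the table above, shown as plain functions.
fn global() -> String {
    "global".to_string()
}
fn parent_thread(id: &str) -> String {
    format!("parent_thread::{id}")
}
fn agent_type(name: &str) -> String {
    format!("agent_type::{name}")
}
fn thread(id: &str) -> String {
    format!("thread::{id}")
}

fn main() {
    assert_eq!(global(), "global");
    assert_eq!(parent_thread("t-42"), "parent_thread::t-42");
    assert_eq!(agent_type("planner"), "agent_type::planner");
    assert_eq!(thread("t-42"), "thread::t-42");
}
```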
ProfileAccess Methods
ProfileAccess methods take key: &str for the runtime key dimension. The same methods
serve both shared and profile state:
impl ProfileAccess {
async fn read<K: ProfileKey>(&self, key: &str) -> Result<K::Value, StorageError>;
async fn write<K: ProfileKey>(&self, key: &str, value: &K::Value) -> Result<(), StorageError>;
async fn delete<K: ProfileKey>(&self, key: &str) -> Result<(), StorageError>;
}
// Shared state usage:
let scope = StateScope::global();
let value = access.read::<MyKey>(scope.as_str()).await?;
access.write::<MyKey>(scope.as_str(), &value).await?;
// Profile state usage:
let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".into()).await?;
Registration
impl PluginRegistrar {
/// Register a profile key (used for both profile state and shared state).
pub fn register_profile_key<K: ProfileKey>(&mut self) -> Result<(), StateError>;
}
Related
Thread Model
Threads represent persistent conversations. A thread holds metadata only;
messages are stored and loaded separately through the ThreadStore trait.
Thread
pub struct Thread {
/// Unique thread identifier (UUID v7).
pub id: String,
/// Thread metadata.
pub metadata: ThreadMetadata,
}
Crate path: awaken::contract::thread::Thread (re-exported from awaken-contract)
Constructors
/// Create with a generated UUID v7 identifier.
fn new() -> Self
/// Create with a specific identifier.
fn with_id(id: impl Into<String>) -> Self
Builder methods
fn with_title(self, title: impl Into<String>) -> Self
Thread implements Default (delegates to Thread::new()), Clone,
Serialize, and Deserialize.
ThreadMetadata
pub struct ThreadMetadata {
/// Creation timestamp (unix millis).
pub created_at: Option<u64>,
/// Last update timestamp (unix millis).
pub updated_at: Option<u64>,
/// Optional thread title.
pub title: Option<String>,
/// Custom metadata key-value pairs.
pub custom: HashMap<String, Value>,
}
All Option fields are omitted from JSON when None. The custom map is
omitted when empty.
ThreadMetadata implements Default, Clone, Serialize, and Deserialize.
Storage
Messages are not stored inside the Thread struct. They are managed through
the ThreadStore trait:
#[async_trait]
pub trait ThreadStore: Send + Sync {
async fn load_thread(&self, thread_id: &str) -> Result<Option<Thread>, StorageError>;
async fn save_thread(&self, thread: &Thread) -> Result<(), StorageError>;
async fn delete_thread(&self, thread_id: &str) -> Result<(), StorageError>;
async fn list_threads(&self, offset: usize, limit: usize) -> Result<Vec<String>, StorageError>;
async fn load_messages(&self, thread_id: &str) -> Result<Option<Vec<Message>>, StorageError>;
async fn save_messages(&self, thread_id: &str, messages: &[Message]) -> Result<(), StorageError>;
async fn delete_messages(&self, thread_id: &str) -> Result<(), StorageError>;
async fn update_thread_metadata(&self, id: &str, metadata: ThreadMetadata) -> Result<(), StorageError>;
}
ThreadRunStore
Extends ThreadStore + RunStore with an atomic checkpoint operation.
#[async_trait]
pub trait ThreadRunStore: ThreadStore + RunStore + Send + Sync {
/// Persist thread messages and run record atomically.
async fn checkpoint(
&self,
thread_id: &str,
messages: &[Message],
run: &RunRecord,
) -> Result<(), StorageError>;
}
RunStore
Run record persistence for tracking run history and enabling resume.
#[async_trait]
pub trait RunStore: Send + Sync {
async fn create_run(&self, record: &RunRecord) -> Result<(), StorageError>;
async fn load_run(&self, run_id: &str) -> Result<Option<RunRecord>, StorageError>;
async fn latest_run(&self, thread_id: &str) -> Result<Option<RunRecord>, StorageError>;
async fn list_runs(&self, query: &RunQuery) -> Result<RunPage, StorageError>;
}
RunRecord
pub struct RunRecord {
pub run_id: String,
pub thread_id: String,
pub agent_id: String,
pub parent_run_id: Option<String>,
pub status: RunStatus,
pub termination_code: Option<String>,
pub created_at: u64, // unix seconds
pub updated_at: u64, // unix seconds
pub steps: usize,
pub input_tokens: u64,
pub output_tokens: u64,
pub state: Option<PersistedState>,
}
Related
Operate
This path is for hardening an agent service once the happy path already works.
Recommended order
- Enable Observability to make runs, tools, and providers visible.
- Enable Tool Permission HITL to add approval control over tool execution.
- Configure Stop Policies to keep agent loops bounded and predictable.
- Report Tool Progress and Testing Strategy to improve operator visibility and confidence.
Keep nearby
Enable Tool Permission HITL
Use this when you need to control which tools an agent can invoke, with human-in-the-loop approval for sensitive operations.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `permission` enabled on the `awaken` crate (enabled by default)
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["permission"] }
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Steps
- Register the permission plugin.
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_permission::PermissionPlugin;
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
let agent_spec = AgentSpec::new("my-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("permission");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin("permission", Arc::new(PermissionPlugin) as Arc<dyn Plugin>)
.build()
.expect("failed to build runtime");
The plugin registers a BeforeToolExecute phase hook that evaluates permission rules before every tool call.
- Define permission rules inline.
use awaken::ext_permission::{PermissionRulesConfig, PermissionRuleEntry, ToolPermissionBehavior};
let config = PermissionRulesConfig {
default_behavior: ToolPermissionBehavior::Ask,
rules: vec![
PermissionRuleEntry {
tool: "read_file".into(),
behavior: ToolPermissionBehavior::Allow,
scope: Default::default(),
},
PermissionRuleEntry {
tool: "file_*".into(),
behavior: ToolPermissionBehavior::Ask,
scope: Default::default(),
},
PermissionRuleEntry {
tool: "delete_*".into(),
behavior: ToolPermissionBehavior::Deny,
scope: Default::default(),
},
],
};
let ruleset = config.into_ruleset().expect("invalid rules");
- Load rules from a YAML file (alternative).

Create `permissions.yaml`:
default_behavior: ask
rules:
- tool: "read_file"
behavior: allow
- tool: "Bash(npm *)"
behavior: allow
- tool: "file_*"
behavior: ask
- tool: "delete_*"
behavior: deny
- tool: "mcp__*"
behavior: ask
Load it in code:
use awaken::ext_permission::PermissionRulesConfig;
let config = PermissionRulesConfig::from_file("permissions.yaml")
.expect("failed to load permissions");
let ruleset = config.into_ruleset().expect("invalid rules");
- Configure via agent spec.

Permission rules can also be embedded in the agent spec using the `permission` plugin config key:
use awaken::registry_spec::AgentSpec;
let agent_spec = AgentSpec::new("my-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("permission");
- Understand rule evaluation.

Rules are evaluated with firewall-like priority:
- Deny – highest priority, blocks the tool call immediately
- Allow – permits the tool call without user interaction
- Ask – suspends the tool call and waits for human approval
The pattern syntax supports:
| Pattern | Matches |
|---|---|
| `read_file` | Exact tool name |
| `file_*` | Glob on tool name |
| `mcp__github__*` | Glob for MCP-prefixed tools |
| `Bash(npm *)` | Tool name with primary argument glob |
| `Edit(file_path ~ "src/**")` | Named field glob |
| `Bash(command =~ "(?i)rm")` | Named field regex |
| `/mcp__(gh\|gl)__.*/` | Regex tool name |
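To make the matching semantics concrete, here is a deliberately simplified matcher covering only the exact-name and trailing-glob rows above. This is an illustration, not the `awaken-tool-pattern` implementation, which also handles argument globs, named-field patterns, and regexes:

```rust
// Illustration only: exact tool names match by equality, and a
// pattern ending in '*' matches any tool name with that prefix.
fn matches_tool(pattern: &str, tool_name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => tool_name.starts_with(prefix), // e.g. "file_*"
        None => pattern == tool_name,                  // e.g. "read_file"
    }
}

fn main() {
    assert!(matches_tool("read_file", "read_file"));
    assert!(!matches_tool("read_file", "read_files"));
    assert!(matches_tool("file_*", "file_write"));
    assert!(matches_tool("mcp__github__*", "mcp__github__create_issue"));
    assert!(!matches_tool("delete_*", "read_file"));
}
```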
Verify
- Register a tool that matches a `deny` rule and attempt to invoke it.
- The tool call should be blocked and a `ToolInterceptAction` event emitted.
- Register a tool matching an `ask` rule. The run should suspend, waiting for human approval via the mailbox.
- Send approval through the mailbox endpoint and confirm the run resumes.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| All tools blocked | default_behavior: deny with no allow rules | Add explicit allow rules for safe tools |
| Rules not evaluated | Plugin not registered | Register PermissionPlugin and add "permission" to plugin_ids in agent spec |
| Invalid pattern error | Malformed glob or regex | Check syntax against the pattern table above |
| Ask rule never resolves | No mailbox consumer | Wire up a frontend or API client to respond to mailbox items |
Related Example
crates/awaken-ext-permission/tests/
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-permission/src/lib.rs` | Module root and public re-exports |
| `crates/awaken-ext-permission/src/config.rs` | `PermissionRulesConfig` and YAML/JSON loading |
| `crates/awaken-ext-permission/src/rules.rs` | Pattern syntax, `ToolPermissionBehavior`, rule evaluation |
| `crates/awaken-ext-permission/src/plugin/plugin.rs` | `PermissionPlugin` registration |
| `crates/awaken-ext-permission/src/plugin/checker.rs` | `PermissionInterceptHook` (`BeforeToolExecute`) |
| `crates/awaken-tool-pattern/` | Shared glob/regex pattern matching library |
Related
Configure Stop Policies
Use this when you need to control when an agent run terminates based on round count, token usage, elapsed time, or error frequency.
Prerequisites
- `awaken` crate added to `Cargo.toml`
- Familiarity with `Plugin` and `AgentRuntimeBuilder`
Overview
Stop policies evaluate after each inference step and decide whether the run should continue or terminate. The system provides four built-in policies and a trait for custom implementations.
Built-in policies:
| Policy | Triggers when |
|---|---|
| `MaxRoundsPolicy` | Step count exceeds max rounds |
| `TokenBudgetPolicy` | Total tokens (input + output) exceed `max_total` |
| `TimeoutPolicy` | Elapsed wall time exceeds `max_ms` milliseconds |
| `ConsecutiveErrorsPolicy` | Consecutive inference errors reach `max` |
Steps
- Create policies programmatically.
use std::sync::Arc;
use awaken::policies::{
MaxRoundsPolicy, TokenBudgetPolicy, TimeoutPolicy,
ConsecutiveErrorsPolicy, StopPolicy,
};
let policies: Vec<Arc<dyn StopPolicy>> = vec![
Arc::new(MaxRoundsPolicy::new(25)),
Arc::new(TokenBudgetPolicy::new(100_000)),
Arc::new(TimeoutPolicy::new(300_000)), // 5 minutes in ms
Arc::new(ConsecutiveErrorsPolicy::new(3)),
];
- Register a `StopConditionPlugin` with the runtime builder.
use awaken::policies::StopConditionPlugin;
use awaken::AgentRuntimeBuilder;
let runtime = AgentRuntimeBuilder::new()
.with_plugin("stop-condition", Arc::new(StopConditionPlugin::new(policies)))
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
For the common case of limiting rounds only, use the convenience wrapper:
use awaken::policies::MaxRoundsPlugin;
let runtime = AgentRuntimeBuilder::new()
.with_plugin("stop-condition:max-rounds", Arc::new(MaxRoundsPlugin::new(10)))
.with_agent_spec(spec)
.with_provider("anthropic", Arc::new(provider))
.build()?;
- Use declarative `StopConditionSpec` values.
The policies_from_specs function converts declarative specs into policy instances. This is useful when loading configuration from JSON or YAML.
use awaken_contract::contract::lifecycle::StopConditionSpec;
use awaken::policies::{policies_from_specs, StopConditionPlugin};
let specs = vec![
StopConditionSpec::MaxRounds { rounds: 10 },
StopConditionSpec::Timeout { seconds: 300 },
StopConditionSpec::TokenBudget { max_total: 100_000 },
StopConditionSpec::ConsecutiveErrors { max: 3 },
];
let policies = policies_from_specs(&specs);
let plugin = StopConditionPlugin::new(policies);
The full set of StopConditionSpec variants:
pub enum StopConditionSpec {
MaxRounds { rounds: usize },
Timeout { seconds: u64 },
TokenBudget { max_total: usize },
ConsecutiveErrors { max: usize },
StopOnTool { tool_name: String }, // not yet implemented
ContentMatch { pattern: String }, // not yet implemented
LoopDetection { window: usize }, // not yet implemented
}
StopOnTool, ContentMatch, and LoopDetection are defined in the contract but not yet backed by policy implementations. policies_from_specs silently skips unimplemented variants.
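That skip behavior amounts to a filter-map over the spec list. A simplified sketch with stand-in types (this is not the real `policies_from_specs` signature, just an illustration of the skipping semantics):

```rust
// Stand-in spec and policy types for illustration.
enum StopConditionSpecSketch {
    MaxRounds { rounds: usize },
    Timeout { seconds: u64 },
    StopOnTool { tool_name: String }, // not yet implemented
}

enum PolicySketch {
    MaxRounds(usize),
    Timeout(u64),
}

// Unimplemented variants are silently dropped rather than erroring.
fn policies_from_specs_sketch(specs: &[StopConditionSpecSketch]) -> Vec<PolicySketch> {
    specs
        .iter()
        .filter_map(|spec| match spec {
            StopConditionSpecSketch::MaxRounds { rounds } => Some(PolicySketch::MaxRounds(*rounds)),
            StopConditionSpecSketch::Timeout { seconds } => Some(PolicySketch::Timeout(*seconds)),
            StopConditionSpecSketch::StopOnTool { .. } => None, // skipped silently
        })
        .collect()
}

fn main() {
    let specs = vec![
        StopConditionSpecSketch::MaxRounds { rounds: 10 },
        StopConditionSpecSketch::StopOnTool { tool_name: "finish".into() },
        StopConditionSpecSketch::Timeout { seconds: 300 },
    ];
    // Three specs in, two policies out: the StopOnTool spec is dropped.
    assert_eq!(policies_from_specs_sketch(&specs).len(), 2);
}
```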
The StopPolicy Trait
Implement StopPolicy to create custom stop conditions:
use awaken::policies::{StopPolicy, StopDecision, StopPolicyStats};
pub struct MyCustomPolicy {
pub threshold: u64,
}
impl StopPolicy for MyCustomPolicy {
fn id(&self) -> &str {
"my_custom"
}
fn evaluate(&self, stats: &StopPolicyStats) -> StopDecision {
if stats.total_output_tokens > self.threshold {
StopDecision::Stop {
code: "my_custom".into(),
detail: format!("output tokens {} exceeded {}", stats.total_output_tokens, self.threshold),
}
} else {
StopDecision::Continue
}
}
}
The trait requires Send + Sync + 'static. Evaluation must be synchronous – it is pure computation on the provided stats.
StopPolicyStats
Every policy receives a StopPolicyStats snapshot with fields populated by the internal StopConditionHook:
| Field | Type | Description |
|---|---|---|
| `step_count` | `u32` | Number of inference steps completed so far |
| `total_input_tokens` | `u64` | Cumulative prompt tokens across all steps |
| `total_output_tokens` | `u64` | Cumulative completion tokens across all steps |
| `elapsed_ms` | `u64` | Wall time since the first step, in milliseconds |
| `consecutive_errors` | `u32` | Current streak of consecutive inference errors (resets on success) |
| `last_tool_names` | `Vec<String>` | Tool names called in the most recent inference response |
| `last_response_text` | `String` | Text content of the most recent inference response |
StopDecision
pub enum StopDecision {
Continue,
Stop { code: String, detail: String },
}
When any policy returns StopDecision::Stop, the hook converts it to TerminationReason::Stopped with the given code and detail, then updates the run lifecycle to Done. The agent loop exits after the current step. Policies are evaluated in order; the first Stop wins.
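The "first Stop wins" ordering can be sketched with simplified stand-ins (closures over the step count instead of the real `StopPolicy` trait objects and `StopPolicyStats` snapshot):

```rust
// Simplified stand-in for the StopDecision enum shown above.
#[derive(Debug, PartialEq)]
enum StopDecision {
    Continue,
    Stop { code: String, detail: String },
}

// Evaluate policies in registration order; the first Stop short-circuits.
fn first_stop(policies: &[Box<dyn Fn(u32) -> StopDecision>], step_count: u32) -> StopDecision {
    for policy in policies {
        if let StopDecision::Stop { code, detail } = policy(step_count) {
            return StopDecision::Stop { code, detail };
        }
    }
    StopDecision::Continue
}

fn main() {
    // A max-rounds-style policy as a closure factory.
    let max_rounds = |max: u32| {
        move |steps: u32| {
            if steps >= max {
                StopDecision::Stop { code: "max_rounds".into(), detail: format!("{steps} >= {max}") }
            } else {
                StopDecision::Continue
            }
        }
    };
    let policies: Vec<Box<dyn Fn(u32) -> StopDecision>> =
        vec![Box::new(max_rounds(10)), Box::new(max_rounds(5))];

    // Below both limits: the run continues.
    assert_eq!(first_stop(&policies, 3), StopDecision::Continue);
    // Only the second policy fires, but it still stops the run.
    assert!(matches!(first_stop(&policies, 7), StopDecision::Stop { .. }));
    // Both would fire; the first registered policy wins.
    match first_stop(&policies, 12) {
        StopDecision::Stop { detail, .. } => assert_eq!(detail, "12 >= 10"),
        _ => panic!("expected Stop"),
    }
}
```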
How Stop Policies Interact with the Agent Loop
- The `StopConditionPlugin` registers a `PhaseHook` on `Phase::AfterInference`.
- After each LLM inference, the hook increments `step_count`, accumulates token usage, and tracks consecutive errors in `StopConditionStatsState`.
- The hook builds a `StopPolicyStats` snapshot and calls `evaluate` on each registered policy.
- If any policy returns `Stop`, the hook emits a `RunLifecycleUpdate::Done` state command with the stop code, which terminates the run.
- If all policies return `Continue`, the agent loop proceeds to the next step.
A policy with max or max_total set to 0 is treated as disabled and always returns Continue.
Common Errors
| Error | Cause | Fix |
|---|---|---|
| Run never stops | No stop policy registered and LLM keeps calling tools | Register at least MaxRoundsPolicy or MaxRoundsPlugin |
| `StateError::KeyAlreadyRegistered` | Both `StopConditionPlugin` and `MaxRoundsPlugin` registered | Use only one; they share the same state key |
| Timeout fires too early | TimeoutPolicy takes milliseconds, StopConditionSpec::Timeout takes seconds | When using TimeoutPolicy::new() directly, pass milliseconds |
Key Files
- `crates/awaken-runtime/src/policies/mod.rs` – module root and public exports
- `crates/awaken-runtime/src/policies/policy.rs` – `StopPolicy` trait, built-in policies, `policies_from_specs`
- `crates/awaken-runtime/src/policies/plugin.rs` – `StopConditionPlugin` and `MaxRoundsPlugin`
- `crates/awaken-runtime/src/policies/state.rs` – `StopConditionStatsState` and its state key
- `crates/awaken-runtime/src/policies/hook.rs` – internal `StopConditionHook` that drives evaluation
- `crates/awaken-contract/src/contract/lifecycle.rs` – `StopConditionSpec` enum
Related
Enable Observability
Use this when you need to trace LLM inference calls and tool executions with OpenTelemetry-compatible telemetry.
Prerequisites
- A working awaken agent runtime (see First Agent)
- Feature `observability` enabled on the `awaken` crate (enabled by default)
- For OTel export: feature `otel` enabled on `awaken-ext-observability`, plus a configured OTel collector
[dependencies]
awaken = { package = "awaken-agent", version = "0.1", features = ["observability"] }
tokio = { version = "1", features = ["full"] }
Steps
- Register with the in-memory sink (development).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, InMemorySink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
let sink = InMemorySink::new();
let obs_plugin = ObservabilityPlugin::new(sink.clone());
let agent_spec = AgentSpec::new("observed-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("observability");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
.build()
.expect("failed to build runtime");
After a run completes, inspect collected metrics:
let metrics = sink.metrics();
println!("inferences: {}", metrics.inferences.len());
println!("tool calls: {}", metrics.tools.len());
println!("total input tokens: {}", metrics.total_input_tokens());
println!("total output tokens: {}", metrics.total_output_tokens());
println!("tool failures: {}", metrics.tool_failures());
println!("session duration: {}ms", metrics.session_duration_ms);
for stat in metrics.stats_by_model() {
println!("{}: {} calls, {} in / {} out tokens",
stat.model, stat.inference_count, stat.input_tokens, stat.output_tokens);
}
for stat in metrics.stats_by_tool() {
println!("{}: {} calls, {} failures",
stat.name, stat.call_count, stat.failure_count);
}
- Register with the OTel sink (production).
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, OtelMetricsSink};
use awaken::registry_spec::{AgentSpec, ModelSpec};
use awaken::{AgentRuntimeBuilder, Plugin};
use opentelemetry_sdk::trace::SdkTracerProvider;
let provider = SdkTracerProvider::builder()
// configure your exporter (OTLP, Jaeger, etc.)
.build();
let tracer = provider.tracer("awaken");
let obs_plugin = ObservabilityPlugin::new(OtelMetricsSink::new(tracer));
let agent_spec = AgentSpec::new("observed-agent")
.with_model("gpt-4o-mini")
.with_system_prompt("You are a helpful assistant.")
.with_hook_filter("observability");
let runtime = AgentRuntimeBuilder::new()
.with_provider("openai", Arc::new(GenaiExecutor::new()))
.with_model(
"gpt-4o-mini",
ModelSpec {
id: "gpt-4o-mini".into(),
provider: "openai".into(),
model: "gpt-4o-mini".into(),
},
)
.with_agent_spec(agent_spec)
.with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
.build()
.expect("failed to build runtime");
- Implement a custom sink (optional).
use awaken::ext_observability::{MetricsSink, GenAISpan, ToolSpan, AgentMetrics};
struct MySink;
impl MetricsSink for MySink {
fn on_inference(&self, span: &GenAISpan) {
// forward to your metrics system
}
fn on_tool(&self, span: &ToolSpan) {
// forward to your metrics system
}
fn on_run_end(&self, metrics: &AgentMetrics) {
// emit summary metrics
}
}
- Captured telemetry.

The plugin hooks into the following phases:
| Phase | Data Captured |
|---|---|
| `RunStart` | Session start timestamp |
| `BeforeInference` | Inference start timestamp, model, provider |
| `AfterInference` | Token usage, finish reasons, duration, cache tokens |
| `BeforeToolExecute` | Tool call start timestamp |
| `AfterToolExecute` | Tool duration, error status |
| `RunEnd` | Session duration |
OTel spans follow GenAI semantic conventions with attributes such as gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens.
Verify
- Run an agent with the `InMemorySink`.
- After `run()` completes, call `sink.metrics()`.
- Confirm `inferences` is non-empty and token counts are populated.
- For OTel, check your collector or Jaeger UI for spans named with the `awaken` tracer.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| Metrics are all zero | Plugin not registered | Register ObservabilityPlugin with the runtime builder |
| `OtelMetricsSink` not found | Missing `otel` feature | Enable the `otel` feature on `awaken-ext-observability` |
| No spans in collector | Exporter not configured | Verify SdkTracerProvider has an exporter and is not dropped |
| Token counts missing | LLM provider does not report usage | Check that your LlmExecutor returns TokenUsage in LLMResponse |
Related Example
crates/awaken-ext-observability/tests/
Key Files
| Path | Purpose |
|---|---|
| `crates/awaken-ext-observability/src/lib.rs` | Module root and public re-exports |
| `crates/awaken-ext-observability/src/plugin/plugin.rs` | `ObservabilityPlugin` registration |
| `crates/awaken-ext-observability/src/plugin/hooks.rs` | Phase hooks for each telemetry point |
| `crates/awaken-ext-observability/src/metrics.rs` | `AgentMetrics`, `GenAISpan`, `ToolSpan` types |
| `crates/awaken-ext-observability/src/sink.rs` | `MetricsSink` trait and `InMemorySink` |
| `crates/awaken-ext-observability/src/otel.rs` | `OtelMetricsSink` with GenAI semantic conventions |
Related
Report Tool Progress
Use this when you need to stream progress updates or activity snapshots from a tool back to the frontend during execution.
Prerequisites
- A `Tool` implementation with access to `ToolCallContext`
- `awaken` crate added to `Cargo.toml`
Steps
- Report structured progress with `report_progress`.

Call `report_progress` inside your `Tool::execute` method to emit a `ToolCallProgressState` snapshot. The runtime wraps it in an `AgentEvent::ActivitySnapshot` with `activity_type = "tool-call-progress"`.
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::contract::progress::ProgressStatus;
use async_trait::async_trait;
use serde_json::{Value, json};
struct IndexTool;
#[async_trait]
impl Tool for IndexTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("index_docs", "Index Docs", "Index a document set")
}
async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
ctx.report_progress(ProgressStatus::Running, Some("Starting indexing"), None).await;
for i in 0..10 {
// ... do work ...
let fraction = (i + 1) as f64 / 10.0;
ctx.report_progress(
ProgressStatus::Running,
Some(&format!("Indexed batch {}/10", i + 1)),
Some(fraction),
).await;
}
ctx.report_progress(ProgressStatus::Done, Some("Indexing complete"), Some(1.0)).await;
Ok(ToolResult::success("index_docs", json!({"indexed": 10})).into())
}
}
The progress parameter is a normalized f64 between 0.0 and 1.0.
Pass None when progress is indeterminate.
- Report custom activity snapshots with `report_activity`.

Use `report_activity` for full-state activity updates that are not structured progress. The runtime emits an `AgentEvent::ActivitySnapshot` with `replace: Some(true)`.
ctx.report_activity("code-generation", "fn hello() {\n println!(\"hi\");\n}").await;
The activity_type string identifies the kind of activity. Frontends use
it to choose how to render the content.
- Report incremental activity deltas with `report_activity_delta`.

Use `report_activity_delta` to send a JSON patch instead of replacing the full content. The runtime emits an `AgentEvent::ActivityDelta`. If you pass a single JSON value, it is wrapped in an array automatically.
use serde_json::json;
ctx.report_activity_delta(
"code-generation",
json!([
{ "op": "add", "path": "/line", "value": " println!(\"world\");" }
]),
).await;
- Use `ProgressStatus` variants to reflect the tool call lifecycle.
| Variant | Meaning |
|---|---|
| `Pending` | Tool call is queued but has not started |
| `Running` | Tool call is actively executing |
| `Done` | Tool call completed successfully |
| `Failed` | Tool call encountered an error |
| `Cancelled` | Tool call was cancelled before completion |
ProgressStatus serializes to snake_case strings ("pending", "running",
"done", "failed", "cancelled").
How it works
report_progress builds a ToolCallProgressState and emits it as an
AgentEvent::ActivitySnapshot:
pub struct ToolCallProgressState {
pub schema: String, // "tool-call-progress.v1"
pub node_id: String, // set to call_id
pub call_id: String,
pub tool_name: String,
pub status: ProgressStatus,
pub progress: Option<f64>, // 0.0 - 1.0, None if indeterminate
pub loaded: Option<u64>, // absolute loaded count
pub total: Option<u64>, // absolute total count
pub message: Option<String>,
pub parent_node_id: Option<String>,
pub parent_call_id: Option<String>,
}
The activity type is the constant TOOL_CALL_PROGRESS_ACTIVITY_TYPE, which
equals "tool-call-progress". Optional fields (progress, loaded, total,
message, parent_node_id, parent_call_id) are omitted from JSON when None.
All three reporting methods are no-ops when the context has no activity_sink
configured.
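The no-op behavior follows the usual optional-sink pattern: each reporting method checks for a sink and returns early when none is present. A stand-in sketch (`ActivityReporter` and its field are illustrative, not awaken types):

```rust
// Illustration of "no-op without a sink": reporting is safe to call
// even when nothing is wired up to receive events.
struct ActivityReporter {
    activity_sink: Option<Box<dyn Fn(&str)>>,
}

impl ActivityReporter {
    /// Returns true if an event was actually emitted.
    fn report(&self, message: &str) -> bool {
        match &self.activity_sink {
            Some(sink) => {
                sink(message);
                true
            }
            None => false, // silently dropped, e.g. in unit tests
        }
    }
}

fn main() {
    let silent = ActivityReporter { activity_sink: None };
    assert!(!silent.report("Indexed batch 1/10"));

    let wired = ActivityReporter {
        activity_sink: Some(Box::new(|m| println!("snapshot: {m}"))),
    };
    assert!(wired.report("Indexed batch 1/10"));
}
```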
Verify
Subscribe to the SSE event stream and confirm that ActivitySnapshot events
with activity_type = "tool-call-progress" appear while the tool executes,
and that the status field transitions from "running" to "done".
Common Errors
| Error | Cause | Fix |
|---|---|---|
| No events appear | activity_sink is None on the context | Ensure the runtime is configured with an event sink |
| Progress not updating in frontend | Frontend does not handle ActivitySnapshot with this activity type | Filter on activity_type == "tool-call-progress" |
Key Files
- `crates/awaken-contract/src/contract/progress.rs` – `ToolCallProgressState`, `ProgressStatus`, `TOOL_CALL_PROGRESS_ACTIVITY_TYPE`
- `crates/awaken-contract/src/contract/tool.rs` – `ToolCallContext::report_progress`, `report_activity`, `report_activity_delta`
Related
Testing Strategy
Use this when you need to test tools, plugins, state keys, or full agent runs without depending on a live LLM.
Prerequisites
- `awaken` crate added to `Cargo.toml` (with the runtime re-exports)
- `tokio` with `rt` and `macros` features for async tests
- `serde_json` for constructing tool arguments and assertions
1. Unit testing a Tool
Create a ToolCallContext with a test snapshot, call tool.execute(), and assert on the returned ToolOutput.
use async_trait::async_trait;
use serde_json::{Value, json};
use awaken::contract::tool::{
Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};
struct GreetTool;
#[async_trait]
impl Tool for GreetTool {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("greet", "greet", "Greet a user by name")
}
async fn execute(&self, args: Value, _ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
let name = args["name"]
.as_str()
.ok_or_else(|| ToolError::InvalidArguments("missing 'name'".into()))?;
Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
}
}
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn greet_tool_returns_greeting() {
let tool = GreetTool;
let ctx = ToolCallContext::test_default();
let output = tool.execute(json!({"name": "Alice"}), &ctx).await.unwrap();
assert!(output.result.is_success());
assert_eq!(output.result.data["greeting"], "Hello, Alice!");
}
#[tokio::test]
async fn greet_tool_rejects_missing_name() {
let tool = GreetTool;
let ctx = ToolCallContext::test_default();
let err = tool.execute(json!({}), &ctx).await.unwrap_err();
assert!(matches!(err, ToolError::InvalidArguments(_)));
}
}
When your tool returns side-effects via ToolOutput::with_command(), assert on the command field:
#[tokio::test]
async fn tool_produces_state_command() {
let tool = CounterMutationTool { increment: 5 };
let ctx = ToolCallContext::test_default();
let output = tool.execute(json!({}), &ctx).await.unwrap();
assert!(output.result.is_success());
// The command is opaque at this level; integration tests (section 4)
// verify that commands are applied correctly to the StateStore.
assert!(!output.command.is_empty());
}
2. Unit testing a Plugin
Verify that a plugin registers the expected state keys and hooks by creating a PluginRegistrar and calling plugin.register().
use awaken::contract::StateError;
use awaken::state::{StateKey, MergeStrategy, StateKeyOptions, StateCommand};
use awaken::plugins::{Plugin, PluginDescriptor, PluginRegistrar};
use serde::{Serialize, Deserialize};
// -- State key --
struct Counter;
impl StateKey for Counter {
const KEY: &'static str = "test.counter";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
type Value = usize;
type Update = usize;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value += update;
}
}
// -- Plugin --
struct CounterPlugin;
impl Plugin for CounterPlugin {
fn descriptor(&self) -> PluginDescriptor {
PluginDescriptor { name: "counter" }
}
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_key::<Counter>(StateKeyOptions::default())?;
Ok(())
}
}
#[cfg(test)]
mod tests {
use super::*;
use awaken::state::StateStore;
#[test]
fn counter_plugin_registers_key() {
let store = StateStore::new();
// install_plugin calls register() internally
store.install_plugin(CounterPlugin).unwrap();
// The key should now be readable (returns the default value)
let val = store.read::<Counter>().unwrap_or_default();
assert_eq!(val, 0);
}
#[test]
fn counter_plugin_rejects_double_registration() {
let store = StateStore::new();
store.install_plugin(CounterPlugin).unwrap();
// Registering the same key again should fail
let result = store.install_plugin(CounterPlugin);
assert!(result.is_err());
}
}
To test a phase hook directly, build a minimal PhaseContext and inspect the returned StateCommand:
#[tokio::test]
async fn audit_hook_appends_entry() {
let hook = AuditHook;
let ctx = PhaseContext::test_default();
let cmd = hook.run(&ctx).await.unwrap();
// Commit the command to a test store and verify
let store = StateStore::new();
store.install_plugin(AuditPlugin).unwrap();
store.commit(cmd).unwrap();
let log = store.read::<AuditLogKey>().unwrap();
assert!(!log.entries.is_empty());
}
3. Unit testing a StateKey
Test apply() mutations directly without any runtime overhead:
use awaken::state::{StateKey, MergeStrategy};
struct HitCounter;
impl StateKey for HitCounter {
const KEY: &'static str = "test.hit_counter";
const MERGE: MergeStrategy = MergeStrategy::Commutative;
type Value = u64;
type Update = u64;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value += update;
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn apply_increments_counter() {
let mut value = u64::default();
HitCounter::apply(&mut value, 5);
assert_eq!(value, 5);
HitCounter::apply(&mut value, 3);
assert_eq!(value, 8);
}
#[test]
fn apply_from_default_is_identity_free() {
let mut value = u64::default();
HitCounter::apply(&mut value, 0);
assert_eq!(value, 0);
}
}
For keys with complex value types, test edge cases like empty collections or merge conflicts:
#[test]
fn apply_merge_replaces_entries() {
let mut log = AuditLog { entries: vec!["old".into()] };
AuditLogKey::apply(&mut log, AuditLog { entries: vec!["new".into()] });
// Exclusive merge replaces the entire value
assert_eq!(log.entries, vec!["new"]);
}
4. Integration testing with a mock LLM
Build a full agent runtime with a scripted LlmExecutor that returns canned responses. This is the primary pattern used in the awaken-runtime integration tests.
use std::sync::{Arc, Mutex};
use async_trait::async_trait;
use serde_json::json;
use awaken::contract::content::ContentBlock;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::executor::{InferenceExecutionError, InferenceRequest, LlmExecutor};
use awaken::contract::identity::{RunIdentity, RunOrigin};
use awaken::contract::inference::{StopReason, StreamResult};
use awaken::contract::message::{Message, ToolCall};
use awaken::contract::tool::{
Tool, ToolCallContext, ToolDescriptor, ToolError, ToolOutput, ToolResult,
};
use awaken::loop_runner::{AgentLoopParams, LoopStatePlugin, build_agent_env, run_agent_loop};
use awaken::registry::{AgentResolver, ResolvedAgent};
use awaken::state::StateStore;
use awaken::phase::PhaseRuntime;
use awaken::RuntimeError;
// -- Scripted LLM executor --
struct ScriptedLlm {
responses: Mutex<Vec<StreamResult>>,
}
impl ScriptedLlm {
fn new(responses: Vec<StreamResult>) -> Self {
Self {
responses: Mutex::new(responses),
}
}
}
#[async_trait]
impl LlmExecutor for ScriptedLlm {
async fn execute(
&self,
_request: InferenceRequest,
) -> Result<StreamResult, InferenceExecutionError> {
let mut responses = self.responses.lock().unwrap();
if responses.is_empty() {
// Fallback: end the conversation
Ok(StreamResult {
content: vec![ContentBlock::text("Done.")],
tool_calls: vec![],
usage: None,
stop_reason: Some(StopReason::EndTurn),
has_incomplete_tool_calls: false,
})
} else {
Ok(responses.remove(0))
}
}
fn name(&self) -> &str {
"scripted"
}
}
// -- Resolver --
struct FixedResolver {
agent: ResolvedAgent,
}
impl AgentResolver for FixedResolver {
fn resolve(&self, _agent_id: &str) -> Result<ResolvedAgent, RuntimeError> {
let mut agent = self.agent.clone();
agent.env = build_agent_env(&[], &agent)?;
Ok(agent)
}
}
// -- Helpers --
fn tool_step(calls: Vec<ToolCall>) -> StreamResult {
StreamResult {
content: vec![],
tool_calls: calls,
usage: None,
stop_reason: Some(StopReason::ToolUse),
has_incomplete_tool_calls: false,
}
}
fn text_step(text: &str) -> StreamResult {
StreamResult {
content: vec![ContentBlock::text(text)],
tool_calls: vec![],
usage: None,
stop_reason: Some(StopReason::EndTurn),
has_incomplete_tool_calls: false,
}
}
fn test_identity() -> RunIdentity {
RunIdentity::new(
"thread-test".into(),
None,
"run-test".into(),
None,
"agent".into(),
RunOrigin::User,
)
}
// -- Test --
#[tokio::test]
async fn tool_call_flow_end_to_end() {
// Script: LLM calls get_weather, then responds with text
let llm = Arc::new(ScriptedLlm::new(vec![
tool_step(vec![ToolCall::new("c1", "get_weather", json!({"city": "Tokyo"}))]),
text_step("The weather in Tokyo is sunny."),
]));
let agent = ResolvedAgent::new("test", "model", "You are helpful.", llm)
.with_tool(Arc::new(GetWeatherTool));
let store = StateStore::new();
let runtime = PhaseRuntime::new(store.clone()).unwrap();
store.install_plugin(LoopStatePlugin).unwrap();
let resolver = FixedResolver { agent };
let sink = Arc::new(VecEventSink::new());
let result = run_agent_loop(AgentLoopParams {
resolver: &resolver,
agent_id: "test",
runtime: &runtime,
sink: sink.clone(),
checkpoint_store: None,
messages: vec![Message::user("What's the weather?")],
run_identity: test_identity(),
cancellation_token: None,
decision_rx: None,
overrides: None,
frontend_tools: Vec::new(),
})
.await
.unwrap();
assert_eq!(result.response, "The weather in Tokyo is sunny.");
assert_eq!(result.steps, 2); // tool step + text step
}
For simpler cases, use the built-in MockLlmExecutor which returns text-only responses:
use awaken::engine::MockLlmExecutor;
let llm = Arc::new(MockLlmExecutor::new().with_responses(vec!["Hello!".into()]));
let agent = ResolvedAgent::new("test", "model", "system prompt", llm);
5. Testing event streams
Use VecEventSink to capture all events emitted during a run and assert on their sequence and content.
use awaken::contract::event::AgentEvent;
use awaken::contract::event_sink::VecEventSink;
use awaken::contract::lifecycle::TerminationReason;
#[tokio::test]
async fn events_follow_expected_lifecycle() {
// ... set up runtime and run agent (see section 4) ...
let events = sink.take();
// Verify ordering: RunStart -> StepStart -> ... -> StepEnd -> RunFinish
assert!(matches!(events.first(), Some(AgentEvent::RunStart { .. })));
assert!(matches!(events.last(), Some(AgentEvent::RunFinish { .. })));
// Count specific event types
let step_starts = events.iter().filter(|e| matches!(e, AgentEvent::StepStart { .. })).count();
let step_ends = events.iter().filter(|e| matches!(e, AgentEvent::StepEnd)).count();
assert_eq!(step_starts, step_ends, "every StepStart needs a StepEnd");
// Verify termination reason
if let Some(AgentEvent::RunFinish { termination, .. }) = events.last() {
assert_eq!(*termination, TerminationReason::NaturalEnd);
}
// Check that tool call events appear in the correct order
let has_tool_start = events.iter().any(|e| matches!(e, AgentEvent::ToolCallStart { .. }));
let has_tool_done = events.iter().any(|e| matches!(e, AgentEvent::ToolCallDone { .. }));
if has_tool_start {
assert!(has_tool_done, "ToolCallStart without ToolCallDone");
let start_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallStart { .. })).unwrap();
let done_idx = events.iter().position(|e| matches!(e, AgentEvent::ToolCallDone { .. })).unwrap();
assert!(start_idx < done_idx);
}
}
A reusable helper for event type extraction (used in the runtime test suite):
fn event_type(e: &AgentEvent) -> &'static str {
match e {
AgentEvent::RunStart { .. } => "run_start",
AgentEvent::RunFinish { .. } => "run_finish",
AgentEvent::StepStart { .. } => "step_start",
AgentEvent::StepEnd => "step_end",
AgentEvent::TextDelta { .. } => "text_delta",
AgentEvent::ToolCallStart { .. } => "tool_call_start",
AgentEvent::ToolCallDone { .. } => "tool_call_done",
AgentEvent::InferenceComplete { .. } => "inference_complete",
AgentEvent::StateSnapshot { .. } => "state_snapshot",
_ => "other",
}
}
let types: Vec<&str> = events.iter().map(event_type).collect();
assert_eq!(types[0], "run_start");
assert_eq!(*types.last().unwrap(), "run_finish");
6. Testing with a real LLM (live tests)
For end-to-end validation against a real provider, use the GenaiExecutor with environment variables for credentials. Mark these tests with #[ignore] so they only run when explicitly requested.
use awaken::engine::GenaiExecutor;
#[tokio::test]
#[ignore] // Run with: cargo test -- --ignored
async fn live_llm_responds() {
// Requires: OPENAI_API_KEY or (LLM_BASE_URL + LLM_API_KEY)
let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o-mini".into());
let llm = Arc::new(GenaiExecutor::new());
let agent = ResolvedAgent::new(
"live-test",
&model,
"You are a test assistant. Answer in one word.",
llm,
);
// ... set up resolver, store, runtime, sink as in section 4 ...
let result = run_agent_loop(AgentLoopParams {
resolver: &resolver,
agent_id: "live-test",
runtime: &runtime,
sink: sink.clone(),
checkpoint_store: None,
messages: vec![Message::user("What is 2+2? Answer in one word.")],
run_identity: test_identity(),
cancellation_token: None,
decision_rx: None,
overrides: None,
frontend_tools: Vec::new(),
})
.await
.unwrap();
assert!(!result.response.is_empty());
}
Run live tests with:
# OpenAI-compatible provider
OPENAI_API_KEY=<your-key> LLM_MODEL=gpt-4o-mini cargo test -- --ignored
# Custom endpoint (e.g. BigModel)
LLM_BASE_URL=https://open.bigmodel.cn/api/paas/v4/ \
LLM_API_KEY=<key> \
LLM_MODEL=GLM-4.7-Flash \
cargo test -- --ignored
See examples/live_test.rs and examples/tool_call_live.rs for complete working examples with console output.
Key Files
| Path | Purpose |
|---|---|
| crates/awaken-contract/src/contract/tool.rs | Tool trait, ToolCallContext::test_default(), ToolResult, ToolOutput |
| crates/awaken-contract/src/contract/event_sink.rs | VecEventSink |
| crates/awaken-runtime/src/engine/mock.rs | MockLlmExecutor |
| crates/awaken-runtime/src/state/mod.rs | StateStore, StateCommand |
| crates/awaken-runtime/src/loop_runner/mod.rs | run_agent_loop, AgentLoopParams, AgentRunResult |
| crates/awaken-runtime/tests/ | Integration test suite (event lifecycle, tool side effects) |
Related
Use Shared State
Use this when agents need to share persistent state across thread boundaries, agent types, or delegation trees. Shared state lives in the ProfileStore and is addressed by a typed namespace (ProfileKey) and a key (&str), giving you fine-grained control over who sees what.
Prerequisites
- A working awaken agent runtime (see First Agent)
- A ProfileStore backend configured on the runtime (e.g. file store or Postgres)
Concepts
Shared state has two dimensions:
| Dimension | Type | Purpose |
|---|---|---|
| Namespace | ProfileKey | Defines what is stored — a compile-time binding between a static string key (KEY) and a typed Value. Each key is registered once per plugin via register_profile_key. |
| Key | &str (or StateScope helper) | Defines which instance — a runtime string that partitions storage. Different keys isolate or share data between agents and threads. |
Together, (ProfileKey::KEY, key: &str) uniquely identifies a shared state entry in the profile store.
Steps
1. Define a shared state key
Create a struct that implements ProfileKey. The KEY constant is the namespace; the Value type is what gets serialized.
use serde::{Deserialize, Serialize};
use awaken_contract::ProfileKey;
#[derive(Clone, Default, Serialize, Deserialize)]
pub struct TeamContext {
pub goal: String,
pub constraints: Vec<String>,
}
pub struct TeamContextKey;
impl ProfileKey for TeamContextKey {
const KEY: &'static str = "team_context";
type Value = TeamContext;
}
2. Register in a plugin
Inside your plugin’s register method, call register_profile_key on the registrar.
use awaken_contract::StateError;
use awaken_runtime::plugins::registry::PluginRegistrar;
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_profile_key::<TeamContextKey>()?;
Ok(())
}
3. Read and write in a hook
In any phase hook, obtain ProfileAccess from the context and use read / write with a key string. StateScope is a convenience builder for common key patterns — call .as_str() to get the key.
use awaken_contract::StateScope;
async fn execute(&self, ctx: &mut PhaseContext) -> Result<(), anyhow::Error> {
let profile = ctx.profile().expect("ProfileStore not configured");
let identity = ctx.snapshot().run_identity();
// Build a scope key from the current agent's parent thread
let scope = match &identity.parent_thread_id {
Some(pid) => StateScope::parent_thread(pid),
None => StateScope::global(),
};
// Read (returns TeamContext::default() if missing)
let mut team: TeamContext = profile.read::<TeamContextKey>(scope.as_str()).await?;
// Mutate and write back
team.goal = "Ship the feature".into();
profile.write::<TeamContextKey>(scope.as_str(), &team).await?;
Ok(())
}
4. Choose the right scope
StateScope has several constructors. Pick the one that matches your sharing pattern:
| Scenario | Scope | Example |
|---|---|---|
| All agents across all threads | StateScope::global() | Org-wide configuration |
| All agents spawned from the same parent thread | StateScope::parent_thread(id) | A delegation tree sharing context |
| All instances of the same agent type | StateScope::agent_type(name) | Planner agents sharing learned heuristics |
| Single thread only | StateScope::thread(id) | Thread-local scratchpad |
| Custom partition | StateScope::new("custom-key") | Any application-defined grouping |
You can also pass any raw &str directly — StateScope is optional convenience.
When to use shared state
| Mechanism | Lifetime | Scope | Best for |
|---|---|---|---|
| StateKey | Single run (in-memory snapshot) | One agent thread | Transient per-run state (counters, flags, accumulated context) |
| ProfileKey with agent/system key | Persistent (profile store) | Per-agent or system | Per-agent or per-user settings that don’t cross boundaries |
| ProfileKey with StateScope key | Persistent (profile store) | Any StateScope string | Cross-agent, cross-thread persistent state |
Use ProfileKey with a StateScope key when state must survive across runs and be visible to agents in different threads or of different types.
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| profile key not registered: <ns> | Key not registered in any plugin | Call r.register_profile_key::<YourKey>() in the plugin’s register method |
| Always reads Value::default() | Writing and reading use different key strings | Verify both sides construct the same StateScope or use the same &str key |
| Data leaks between scopes | Using StateScope::global() when a narrower scope is needed | Switch to parent_thread, agent_type, or thread scope |
Key Files
| Path | Purpose |
|---|---|
| crates/awaken-contract/src/contract/shared_state.rs | StateScope type |
| crates/awaken-contract/src/contract/profile_store.rs | ProfileKey trait, ProfileOwner enum |
| crates/awaken-runtime/src/profile/mod.rs | ProfileAccess with read, write, delete methods |
| crates/awaken-runtime/src/plugins/registry.rs | PluginRegistrar::register_profile_key registration |
Related
Overview
The awaken crate is the public facade for the Awaken agent framework. It
re-exports types from the internal awaken-contract and awaken-runtime crates
so that downstream code only needs a single dependency.
Module re-exports
| Facade path | Source crate | Contents |
|---|---|---|
| awaken::contract | awaken-contract | Tool trait, events, messages, suspension, lifecycle |
| awaken::model | awaken-contract | Phase, EffectSpec, ScheduledActionSpec, JsonValue |
| awaken::registry_spec | awaken-contract | AgentSpec, ModelSpec, ProviderSpec, McpServerSpec, PluginConfigKey |
| awaken::state | awaken-contract + awaken-runtime | StateKey, StateMap, Snapshot, StateStore, MutationBatch |
| awaken::agent | awaken-runtime | Agent configuration and state |
| awaken::builder | awaken-runtime | AgentRuntimeBuilder, BuildError |
| awaken::context | awaken-runtime | PhaseContext |
| awaken::engine | awaken-runtime | LLM engine abstraction |
| awaken::execution | awaken-runtime | ExecutionEnv |
| awaken::extensions | awaken-runtime | Built-in extension infrastructure |
| awaken::loop_runner | awaken-runtime | Agent loop runner |
| awaken::phase | awaken-runtime | PhaseRuntime, PhaseHook |
| awaken::plugins | awaken-runtime | Plugin, PluginDescriptor, PluginRegistrar |
| awaken::policies | awaken-runtime | Context window and retry policies |
| awaken::registry | awaken-runtime | AgentResolver, ResolvedAgent |
| awaken::runtime | awaken-runtime | AgentRuntime |
| awaken::stores | awaken-stores | File and Postgres store implementations |
Feature-gated modules
| Facade path | Feature flag | Source crate |
|---|---|---|
| awaken::ext_permission | permission | awaken-ext-permission |
| awaken::ext_observability | observability | awaken-ext-observability |
| awaken::ext_mcp | mcp | awaken-ext-mcp |
| awaken::ext_skills | skills | awaken-ext-skills |
| awaken::ext_generative_ui | generative-ui | awaken-ext-generative-ui |
| awaken::ext_reminder | reminder | awaken-ext-reminder |
| awaken::server | server | awaken-server |
Root-level re-exports
The following types are re-exported at the crate root for convenience:
From awaken-contract:
AgentSpec, EffectSpec, FailedScheduledActions, JsonValue, KeyScope,
MergeStrategy, PendingScheduledActions, PersistedState, Phase,
PluginConfigKey, ScheduledActionSpec, Snapshot, StateError, StateKey,
StateKeyOptions, StateMap, TypedEffect, UnknownKeyPolicy
From awaken-runtime:
AgentResolver, AgentRuntime, AgentRuntimeBuilder, BuildError,
CancellationToken, CommitEvent, CommitHook, DEFAULT_MAX_PHASE_ROUNDS,
ExecutionEnv, MutationBatch, PhaseContext, PhaseHook, PhaseRuntime,
Plugin, PluginDescriptor, PluginRegistrar, ResolvedAgent, RunRequest,
RuntimeError, StateCommand, StateStore, TypedEffectHandler,
TypedScheduledActionHandler
Feature flags
| Flag | Default | Description |
|---|---|---|
| permission | yes | Tool-level permission gating (HITL) |
| observability | yes | Tracing and metrics integration |
| mcp | yes | MCP (Model Context Protocol) tool bridge |
| skills | yes | Skills subsystem for reusable agent capabilities |
| reminder | yes | Reminder extension for injecting context messages |
| server | yes | HTTP server with SSE streaming and protocol adapters |
| generative-ui | yes | Generative UI component streaming |
| full | yes | Enables all of the above |
Related
Tool Trait
The Tool trait is the primary extension point for giving agents capabilities.
Tools are stateless functions that receive JSON arguments and a read-only context,
and return a ToolOutput.
Trait definition
#[async_trait]
pub trait Tool: Send + Sync {
/// Return the descriptor for this tool.
fn descriptor(&self) -> ToolDescriptor;
/// Validate arguments before execution. Default: accept all.
fn validate_args(&self, _args: &Value) -> Result<(), ToolError> {
Ok(())
}
/// Execute the tool with the given arguments and context.
async fn execute(
&self,
args: Value,
ctx: &ToolCallContext,
) -> Result<ToolOutput, ToolError>;
}
Crate path: awaken::contract::tool::Tool
ToolDescriptor
Describes a tool’s identity and parameter schema. Registered with the runtime and sent to the LLM as available functions.
pub struct ToolDescriptor {
pub id: String,
pub name: String,
pub description: String,
/// JSON Schema for parameters.
pub parameters: Value,
pub category: Option<String>,
}
Builder methods:
ToolDescriptor::new(id, name, description) -> Self
.with_parameters(schema: Value) -> Self
.with_category(category: impl Into<String>) -> Self
ToolResult
Returned inside a ToolOutput from Tool::execute (convert with .into()). Carries the execution outcome back to the agent loop.
pub struct ToolResult {
pub tool_name: String,
pub status: ToolStatus,
pub data: Value,
pub message: Option<String>,
pub suspension: Option<Box<SuspendTicket>>,
}
ToolStatus
pub enum ToolStatus {
Success, // Execution succeeded
Pending, // Suspended, waiting for external resume
Error, // Execution failed; content sent back to LLM
}
Constructors
| Method | Status | Use case |
|---|---|---|
| ToolResult::success(name, data) | Success | Normal completion |
| ToolResult::success_with_message(name, data, msg) | Success | Completion with description |
| ToolResult::error(name, message) | Error | Recoverable failure |
| ToolResult::error_with_code(name, code, message) | Error | Structured error with code |
| ToolResult::suspended(name, message) | Pending | HITL suspension |
| ToolResult::suspended_with(name, message, ticket) | Pending | Suspension with ticket |
Predicates
- is_success() -> bool
- is_pending() -> bool
- is_error() -> bool
- to_json() -> Value
ToolError
Errors returned from validate_args or execute. Unlike a ToolResult with
ToolStatus::Error (which is sent back to the LLM), a ToolError aborts the tool call.
pub enum ToolError {
InvalidArguments(String),
ExecutionFailed(String),
Denied(String),
NotFound(String),
Internal(String),
}
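Since InvalidArguments aborts the call before execution, validate_args is the natural place to reject malformed input early. A self-contained sketch of that pattern — ToolError is mirrored locally here for illustration, and the (key, value) pairs are a stand-in for JSON arguments:

```rust
// Sketch only: a local mirror of ToolError so the snippet is
// self-contained; real code imports awaken::contract::tool::ToolError.
#[derive(Debug, PartialEq)]
pub enum ToolError {
    InvalidArguments(String),
}

/// Reject the call before execute() runs when a required field is missing.
/// `args` is a (key, value) stand-in for the tool's JSON arguments.
pub fn validate_required(args: &[(&str, &str)], required: &[&str]) -> Result<(), ToolError> {
    for field in required {
        if !args.iter().any(|(k, _)| k == field) {
            return Err(ToolError::InvalidArguments(format!("{field} required")));
        }
    }
    Ok(())
}
```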
ToolCallContext
Read-only context provided to a tool during execution.
pub struct ToolCallContext {
pub call_id: String,
pub tool_name: String,
pub run_identity: RunIdentity,
pub agent_spec: Arc<AgentSpec>,
pub snapshot: Snapshot,
pub activity_sink: Option<Arc<dyn EventSink>>,
}
Methods
/// Read a typed state key from the snapshot.
fn state<K: StateKey>(&self) -> Option<&K::Value>
/// Report an activity snapshot for this tool call.
async fn report_activity(&self, activity_type: &str, content: &str)
/// Report an incremental activity delta.
async fn report_activity_delta(&self, activity_type: &str, patch: Value)
/// Report structured tool call progress.
async fn report_progress(
&self,
status: ProgressStatus,
message: Option<&str>,
progress: Option<f64>,
)
Examples
Minimal tool
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use serde_json::{Value, json};
struct Greet;
#[async_trait]
impl Tool for Greet {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("greet", "greet", "Greet a user by name")
.with_parameters(json!({
"type": "object",
"properties": {
"name": { "type": "string" }
},
"required": ["name"]
}))
}
async fn execute(
&self,
args: Value,
_ctx: &ToolCallContext,
) -> Result<ToolOutput, ToolError> {
let name = args["name"]
.as_str()
.ok_or_else(|| ToolError::InvalidArguments("name required".into()))?;
Ok(ToolResult::success("greet", json!({ "greeting": format!("Hello, {name}!") })).into())
}
}
Reading state from context
use async_trait::async_trait;
use awaken::contract::tool::{Tool, ToolCallContext, ToolDescriptor, ToolError, ToolResult, ToolOutput};
use awaken::state::StateKey;
use serde::{Serialize, Deserialize};
use serde_json::{Value, json};
// Assume a state key is defined elsewhere:
// struct UserPreferences;
// impl StateKey for UserPreferences { ... }
struct GetPreferences;
#[async_trait]
impl Tool for GetPreferences {
fn descriptor(&self) -> ToolDescriptor {
ToolDescriptor::new("get_prefs", "get_preferences", "Get user preferences")
}
async fn execute(
&self,
_args: Value,
ctx: &ToolCallContext,
) -> Result<ToolOutput, ToolError> {
// Read typed state through the snapshot
// let prefs = ctx.state::<UserPreferences>().cloned().unwrap_or_default();
// Ok(ToolResult::success("get_prefs", serde_json::to_value(&prefs).unwrap()).into())
Ok(ToolResult::success("get_prefs", json!({})).into())
}
}
Tool execution hooks
Every tool call passes through plugin hooks before and after execution. This allows plugins to intercept, observe, or modify tool call behavior without changing the tool itself.
Full lifecycle
LLM selects tool
-> validate_args()
-> BeforeToolExecute phase (plugins run hooks)
Plugins may schedule ToolInterceptAction:
Block -> run terminates with reason
Suspend -> run pauses (HITL), tool not executed
SetResult -> tool skipped, pre-built result used
-> execute() (only if not intercepted)
-> AfterToolExecute phase (plugins run hooks)
Plugins observe the ToolResult and may modify state
BeforeToolExecute
Runs once per tool call, after argument validation. Plugins receive a
PhaseContext containing the tool name, call ID, and validated arguments.
Hooks return a StateCommand that may schedule ToolInterceptAction to
block, suspend, or short-circuit the call.
When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.
AfterToolExecute
Runs after execute() completes (or after an intercept produces a result).
Plugins receive the PhaseContext with the ToolResult attached. Hooks can
update state, emit events, or schedule actions for subsequent phases.
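The Block > Suspend > SetResult priority amounts to a first-match search over the ordered variants. A self-contained sketch (the local enum is illustrative; the runtime's actual payload type is ToolInterceptPayload):

```rust
// Illustrative mirror of the intercept outcomes; the real type is
// awaken_contract::contract::tool_intercept::ToolInterceptPayload.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Intercept {
    Block,
    Suspend,
    SetResult,
}

/// Resolve multiple scheduled intercepts: Block > Suspend > SetResult.
pub fn resolve(intercepts: &[Intercept]) -> Option<Intercept> {
    [Intercept::Block, Intercept::Suspend, Intercept::SetResult]
        .into_iter()
        .find(|p| intercepts.contains(p))
}
```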
ToolCallStatus transitions
Each tool call tracks a ToolCallStatus through its lifecycle:
New -> Running -> Succeeded
Failed
Suspended -> Resuming -> Running -> ...
Cancelled
Terminal states (Succeeded, Failed, Cancelled) cannot transition further.
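One plausible reading of the transition diagram as code — the enum is mirrored locally for illustration (the runtime defines the real ToolCallStatus), and the Running -> Cancelled edge is an assumption from Cancelled appearing as an outcome:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToolCallStatus {
    New,
    Running,
    Succeeded,
    Failed,
    Suspended,
    Resuming,
    Cancelled,
}

/// Legal transitions per the diagram above; terminal states never move on.
pub fn can_transition(from: ToolCallStatus, to: ToolCallStatus) -> bool {
    use ToolCallStatus::*;
    matches!(
        (from, to),
        (New, Running)
            | (Running, Succeeded)
            | (Running, Failed)
            | (Running, Suspended)
            | (Running, Cancelled) // assumed: cancellation happens mid-run
            | (Suspended, Resuming)
            | (Resuming, Running)
    )
}
```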
See Plugin Internals for intercept priority details and the full phase convergence loop.
Related
Scheduled Actions
Scheduled actions are the primary mechanism for plugins and tools to request
side-effects during a phase execution cycle. Any hook, tool, or external module
can schedule an action via StateCommand::schedule_action::<A>(payload). The
runtime dispatches the action to its registered handler during the EXECUTE stage
of the target phase.
How it works
Hook / Tool Runtime
| |
|-- StateCommand ----------->| (contains scheduled_actions)
| schedule_action::<A>(p) |
| |-- commit state updates
| |-- dispatch to handler(A, p)
| | |
| | |-- handler returns StateCommand
| | | (may schedule more actions)
| |<-----'
| |-- commit handler results
Scheduling from a hook
use awaken_runtime::agent::state::ExcludeTool;
async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
let mut cmd = StateCommand::new();
cmd.schedule_action::<ExcludeTool>("dangerous_tool".into())?;
Ok(cmd)
}
Scheduling from a tool
use awaken_runtime::agent::state::AddContextMessage;
use awaken_contract::contract::context_message::ContextMessage;
async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
let mut cmd = StateCommand::new();
cmd.schedule_action::<AddContextMessage>(
ContextMessage::system("my_tool.hint", "Remember to check the docs."),
)?;
Ok(ToolOutput::with_command(
ToolResult::success("my_tool", json!({"ok": true})),
cmd,
))
}
Core Actions (awaken-runtime)
These are always available. Registered by the internal LoopActionHandlersPlugin.
AddContextMessage
| Key | runtime.add_context_message |
| Phase | BeforeInference |
| Payload | ContextMessage |
| Import | awaken_runtime::agent::state::AddContextMessage |
Injects a context message into the LLM conversation for the current step. Messages can be persistent (survive across steps), ephemeral (one-shot), or throttled (cooldown-based).
Used by: skills plugin (skill catalog), reminder plugin (rule-based hints), deferred-tools plugin (deferred tool list), custom hooks.
cmd.schedule_action::<AddContextMessage>(
ContextMessage::system_persistent("my_plugin.info", "Always verify inputs."),
)?;
SetInferenceOverride
| Key | runtime.set_inference_override |
| Phase | BeforeInference |
| Payload | InferenceOverride |
| Import | awaken_runtime::agent::state::SetInferenceOverride |
Overrides inference parameters (model, temperature, max_tokens, top_p) for the current step only. Multiple overrides are merged; last-writer-wins per field.
cmd.schedule_action::<SetInferenceOverride>(InferenceOverride {
temperature: Some(0.0), // force deterministic
..Default::default()
})?;
ExcludeTool
| Key | runtime.exclude_tool |
| Phase | BeforeInference |
| Payload | String (tool ID) |
| Import | awaken_runtime::agent::state::ExcludeTool |
Removes a tool from the set offered to the LLM for the current step. Multiple exclusions are additive.
Used by: permission plugin (unconditionally denied tools), deferred-tools plugin (deferred tools replaced by ToolSearch).
cmd.schedule_action::<ExcludeTool>("rm".into())?;
IncludeOnlyTools
| Key | runtime.include_only_tools |
| Phase | BeforeInference |
| Payload | Vec<String> (tool IDs) |
| Import | awaken_runtime::agent::state::IncludeOnlyTools |
Restricts the tool set to only the listed IDs. Multiple IncludeOnlyTools
actions are unioned. Combined with ExcludeTool, exclusions are applied after
the include-only filter.
cmd.schedule_action::<IncludeOnlyTools>(vec!["search".into(), "calculator".into()])?;
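The combined semantics — union the include-only lists, apply that filter first, then subtract exclusions — can be sketched as a pure function (illustrative only, not the runtime's actual code):

```rust
use std::collections::HashSet;

/// Union all IncludeOnlyTools lists, apply the filter only if any
/// include-only action was scheduled, then subtract ExcludeTool entries.
pub fn effective_tools(
    all: &[&str],
    include_only: &[Vec<&str>],
    exclude: &[&str],
) -> Vec<String> {
    let included: HashSet<&str> = include_only.iter().flatten().copied().collect();
    let excluded: HashSet<&str> = exclude.iter().copied().collect();
    all.iter()
        .copied()
        .filter(|t| include_only.is_empty() || included.contains(t))
        .filter(|t| !excluded.contains(t))
        .map(str::to_string)
        .collect()
}
```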
ToolInterceptAction
| Key | tool_intercept |
| Phase | BeforeToolExecute |
| Payload | ToolInterceptPayload |
| Import | awaken_contract::contract::tool_intercept::ToolInterceptAction |
Intercepts a tool call before execution. Three outcomes:
| Variant | Effect |
|---|---|
| Block { reason } | Tool execution blocked, run terminates with the reason. |
| Suspend(SuspendTicket) | Tool execution deferred; run pauses awaiting external decision (HITL). |
| SetResult(ToolResult) | Tool execution short-circuited with a pre-built result. |
When multiple intercepts are scheduled, priority is: Block > Suspend > SetResult.
Used by: permission plugin (ask/deny policies).
use awaken_contract::contract::tool_intercept::{ToolInterceptAction, ToolInterceptPayload};
cmd.schedule_action::<ToolInterceptAction>(
ToolInterceptPayload::Block { reason: "Tool is disabled".into() },
)?;
Deferred-Tools Actions (awaken-ext-deferred-tools)
Available when the deferred-tools plugin is active.
DeferToolAction
| Key | deferred_tools.defer |
| Phase | BeforeInference |
| Payload | Vec<String> (tool IDs) |
| Import | awaken_ext_deferred_tools::state::DeferToolAction |
Moves tools from eager to deferred mode. Deferred tools are excluded from the LLM tool set and made available via ToolSearch instead, reducing prompt token usage.
The handler updates DeferralState, setting each tool’s mode to Deferred.
cmd.schedule_action::<DeferToolAction>(vec!["rarely_used_tool".into()])?;
PromoteToolAction
| Key | deferred_tools.promote |
| Phase | BeforeInference |
| Payload | Vec<String> (tool IDs) |
| Import | awaken_ext_deferred_tools::state::PromoteToolAction |
Moves tools from deferred to eager mode. Promoted tools are included in the LLM tool set for subsequent steps.
The handler updates DeferralState, setting each tool’s mode to Eager.
Typically triggered automatically when ToolSearch returns results, but can be scheduled manually by any plugin or tool.
cmd.schedule_action::<PromoteToolAction>(vec!["needed_tool".into()])?;
Plugin Action Usage Matrix
Which plugins schedule which actions:
| Plugin | AddContext | SetOverride | Exclude | IncludeOnly | Intercept | Defer | Promote |
|---|---|---|---|---|---|---|---|
| permission | | | X | | X | | |
| skills | X | | | | | | |
| reminder | X | | | | | | |
| deferred-tools | X | | X | | | X | X |
| observability | | | | | | | |
| mcp | | | | | | | |
| generative-ui | | | | | | | |
Defining Custom Actions
Plugins can define their own actions by implementing ScheduledActionSpec and
registering a handler via PluginRegistrar::register_scheduled_action.
use awaken_contract::model::{Phase, ScheduledActionSpec};
pub struct MyCustomAction;
impl ScheduledActionSpec for MyCustomAction {
const KEY: &'static str = "my_plugin.custom_action";
const PHASE: Phase = Phase::BeforeInference;
type Payload = MyPayload;
}
Then in your plugin’s register():
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_scheduled_action::<MyCustomAction, _>(MyHandler)?;
Ok(())
}
Other plugins and tools can then schedule your action:
cmd.schedule_action::<MyCustomAction>(my_payload)?;
Convergence and cascading
Scheduled actions execute within the phase convergence loop. After each round of action dispatch, the runtime checks whether new actions were produced. If so, the loop repeats to process them.
How the loop works
Phase EXECUTE stage:
round 1: dispatch queued actions -> handlers return StateCommands
commit state, collect newly scheduled actions
round 2: dispatch new actions -> handlers may schedule more
...
round N: no new actions -> phase converges, loop exits
An action handler can schedule new actions for the same phase, which causes another round. This enables cascading behaviors (e.g., a handler adds a context message, which triggers a filter action from another plugin).
Limits
The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If actions are still
being produced after 16 rounds, the runtime returns a
StateError::PhaseRunLoopExceeded error with the phase name and round count.
This prevents infinite loops from misconfigured or recursive handlers.
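The round structure and the bound can be modeled in a few lines. This is a self-contained sketch with actions reduced to plain values; the real loop dispatches typed actions and commits state between rounds:

```rust
const MAX_PHASE_ROUNDS: usize = 16; // mirrors DEFAULT_MAX_PHASE_ROUNDS

/// Dispatch queued actions round by round; handlers may enqueue more.
/// Converges when a round produces nothing new, errs past the bound.
fn run_to_convergence(
    mut queue: Vec<u32>,
    mut handler: impl FnMut(u32) -> Vec<u32>,
) -> Result<usize, String> {
    let mut rounds = 0;
    while !queue.is_empty() {
        rounds += 1;
        if rounds > MAX_PHASE_ROUNDS {
            return Err(format!("phase loop exceeded {MAX_PHASE_ROUNDS} rounds"));
        }
        let current = std::mem::take(&mut queue);
        for action in current {
            queue.extend(handler(action)); // cascaded actions join the next round
        }
    }
    Ok(rounds)
}
```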
Failed actions
When an action handler returns an error, the action is not retried. Instead,
it is recorded in the FailedScheduledActions state key, which holds a list
of FailedScheduledAction entries (action key, payload, and error message).
Plugins or tests can inspect this key to detect handler failures.
let failed = store.read::<FailedScheduledActions>().unwrap_or_default();
assert!(failed.is_empty(), "expected no failed actions");
See Plugin Internals for the full convergence loop description and phase execution model.
Effects
Effects are typed, fire-and-forget side-effect events. Unlike
scheduled actions (which execute within a phase
convergence loop and can cascade), effects are dispatched after commit and
are terminal – handlers cannot produce new StateCommands, actions, or
effects.
Typical use cases: audit logging, external webhook calls, metric emission, notification delivery.
EffectSpec trait
Every effect type implements EffectSpec:
pub trait EffectSpec: 'static + Send + Sync {
/// Unique string identifier for this effect kind.
const KEY: &'static str;
/// The payload carried by the effect.
/// Must be serializable so the runtime can store it as JSON internally.
type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}
Crate path: awaken::model::EffectSpec (re-exported from awaken-contract)
KEY must be globally unique across all registered effects. Convention:
"<plugin>.<effect_name>", e.g. "audit.record".
Emitting effects
Call StateCommand::emit::<E>(payload) from any hook or action handler:
use awaken::{StateCommand, StateError};
async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
let mut cmd = StateCommand::new();
cmd.emit::<AuditEffect>(AuditPayload {
action: "user_login".into(),
actor: "agent-1".into(),
})?;
Ok(cmd)
}
Effects are collected in StateCommand::effects and dispatched only after the
command’s state mutations are committed. Tools can also emit effects by including
them in the StateCommand returned alongside a ToolResult.
TypedEffect wrapper
TypedEffect is the runtime’s type-erased envelope for effects:
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct TypedEffect {
pub key: String,
pub payload: JsonValue,
}
Two methods bridge between the typed and erased worlds:
- TypedEffect::from_spec::<E>(payload) – serializes a typed payload into a TypedEffect. Called internally by StateCommand::emit.
- TypedEffect::decode::<E>() – deserializes the JSON payload back into the concrete E::Payload type.
You rarely need to use TypedEffect directly; StateCommand::emit and the
handler trait handle serialization transparently.
Registering effect handlers
Effect handlers implement TypedEffectHandler<E>:
#[async_trait]
pub trait TypedEffectHandler<E>: Send + Sync + 'static
where
E: EffectSpec,
{
async fn handle_typed(
&self,
payload: E::Payload,
snapshot: &Snapshot,
) -> Result<(), String>;
}
Key points:
- The handler receives the post-commit Snapshot, so it sees the state that includes the mutations from the command that emitted the effect.
- The return type is Result<(), String> – not Result<(), StateError>. Handlers report errors as plain strings; the runtime logs them but does not propagate them.
Register a handler in your plugin’s register() method:
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
Ok(())
}
Duplicate registrations (same E::KEY) produce
StateError::EffectHandlerAlreadyRegistered.
Dispatch lifecycle
1. Collect – A hook, action handler, or tool calls cmd.emit::<E>(payload). The TypedEffect is appended to StateCommand::effects.
2. Validate – When submit_command processes the command, every effect key is checked against the registered handlers. If any key has no registered handler, the command is rejected with StateError::UnknownEffectHandler before any state is committed. This is a fail-fast guarantee.
3. Commit – State mutations (MutationBatch) are committed to the store.
4. Dispatch – After a successful commit, each effect is dispatched to its handler via handle_typed(payload, snapshot). The snapshot reflects post-commit state.
5. Error handling – Handler failures are logged and counted in EffectDispatchReport but do not roll back the commit or block subsequent effects. The runtime continues dispatching remaining effects.
Hook / Tool Runtime
| |
|-- StateCommand (with effects) ->|
| |-- validate all effect keys
| | (fail-fast if unknown)
| |-- commit state mutations
| |-- dispatch effects sequentially
| | handler(payload, snapshot)
| |-- return SubmitCommandReport
|<--------------------------------|
Worked example
Define an effect:
use awaken::EffectSpec;
use serde::{Deserialize, Serialize};
/// Payload for audit log entries.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditPayload {
pub action: String,
pub actor: String,
}
/// Effect spec for audit logging.
pub struct AuditEffect;
impl EffectSpec for AuditEffect {
const KEY: &'static str = "audit.record";
type Payload = AuditPayload;
}
Emit it from a phase hook:
use async_trait::async_trait;
use awaken::{PhaseContext, PhaseHook, StateCommand, StateError};
pub struct AuditHook;
#[async_trait]
impl PhaseHook for AuditHook {
async fn run(&self, ctx: &PhaseContext) -> Result<StateCommand, StateError> {
let mut cmd = StateCommand::new();
cmd.emit::<AuditEffect>(AuditPayload {
action: "phase_entered".into(),
actor: "system".into(),
})?;
Ok(cmd)
}
}
Handle the effect:
use async_trait::async_trait;
use awaken::{Snapshot, TypedEffectHandler};
pub struct AuditEffectHandler;
#[async_trait]
impl TypedEffectHandler<AuditEffect> for AuditEffectHandler {
async fn handle_typed(
&self,
payload: AuditPayload,
_snapshot: &Snapshot,
) -> Result<(), String> {
tracing::info!(
action = %payload.action,
actor = %payload.actor,
"audit effect dispatched"
);
// In production: write to external audit store, send webhook, etc.
Ok(())
}
}
Wire it all together in a plugin:
use awaken::{Plugin, PluginDescriptor, PluginRegistrar, StateError};
use awaken::model::Phase;
pub struct AuditPlugin;
impl Plugin for AuditPlugin {
fn descriptor(&self) -> PluginDescriptor {
PluginDescriptor::new("audit", "Audit logging via effects")
}
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_effect::<AuditEffect, _>(AuditEffectHandler)?;
r.register_phase_hook("audit", Phase::RunStart, AuditHook)?;
Ok(())
}
}
Effects vs Scheduled Actions
| | Effects | Scheduled Actions |
|---|---|---|
| Timing | Post-commit | Within phase convergence loop |
| Can cascade | No | Yes (handlers return StateCommand) |
| Can produce StateCommand | No | Yes |
| Failure handling | Logged, non-blocking | Error propagated to caller |
| State visibility | Post-commit snapshot | Pre-commit context |
| Use case | External I/O, logging, metrics | Internal control flow, state manipulation |
Choose effects when you need to trigger external side-effects that should not influence the agent’s state convergence. Choose scheduled actions when the handler needs to mutate state or schedule further work.
Events
The agent loop emits AgentEvent values as it executes. Events are streamed to
clients via SSE and consumed by protocol encoders.
AgentEvent
All variants are tagged with event_type in their JSON serialization
(#[serde(tag = "event_type", rename_all = "snake_case")]).
pub enum AgentEvent {
RunStart {
thread_id: String,
run_id: String,
parent_run_id: Option<String>, // omitted when None
},
RunFinish {
thread_id: String,
run_id: String,
result: Option<Value>, // omitted when None
termination: TerminationReason,
},
TextDelta { delta: String },
ReasoningDelta { delta: String },
ReasoningEncryptedValue { encrypted_value: String },
ToolCallStart { id: String, name: String },
ToolCallDelta { id: String, args_delta: String },
ToolCallReady {
id: String,
name: String,
arguments: Value,
},
ToolCallDone {
id: String,
message_id: String,
result: ToolResult,
outcome: ToolCallOutcome,
},
ToolCallStreamDelta {
id: String,
name: String,
delta: String,
},
ToolCallResumed { target_id: String, result: Value },
MessagesSnapshot { messages: Vec<Value> },
ActivitySnapshot {
message_id: String,
activity_type: String,
content: Value,
replace: Option<bool>, // omitted when None
},
ActivityDelta {
message_id: String,
activity_type: String,
patch: Vec<Value>,
},
StepStart { message_id: String },
StepEnd,
InferenceComplete {
model: String,
usage: Option<TokenUsage>, // omitted when None
duration_ms: u64,
},
StateSnapshot { snapshot: Value },
StateDelta { delta: Vec<Value> },
Error {
message: String,
code: Option<String>, // omitted when None
},
}
Crate path: awaken::contract::event::AgentEvent
Helper
impl AgentEvent {
/// Extract the response text from a RunFinish result value.
pub fn extract_response(result: &Option<Value>) -> String
}
StreamEvent
Wire-format envelope that wraps an AgentEvent with sequencing metadata.
Sent over SSE as JSON.
pub struct StreamEvent {
/// Monotonically increasing sequence number within a run.
pub seq: u64,
/// ISO 8601 timestamp.
pub timestamp: String,
/// The wrapped agent event (flattened via #[serde(flatten)]).
pub event: AgentEvent,
}
Constructor
fn new(seq: u64, timestamp: impl Into<String>, event: AgentEvent) -> Self
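For illustration, a TextDelta event wrapped in a StreamEvent would produce an SSE frame along these lines (values are placeholders; the flattened fields come from the serde attributes above):

```json
{
  "seq": 3,
  "timestamp": "2025-01-01T00:00:00Z",
  "event_type": "text_delta",
  "delta": "Hello"
}
```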
RunInput
Input to start or resume a run.
#[serde(tag = "type", rename_all = "snake_case")]
pub enum RunInput {
/// A new user message to process.
UserMessage { text: String },
/// Resume a suspended run with a decision.
ResumeDecision {
tool_call_id: String,
action: ResumeDecisionAction,
payload: Value, // omitted when null
},
}
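Given the internal `type` tag with snake_case renaming, a new-message input would serialize roughly as follows (illustrative text); a resume carries `tool_call_id`, `action`, and `payload` under `"type": "resume_decision"`:

```json
{ "type": "user_message", "text": "Summarize the report" }
```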
RunOutput
Type alias for the event stream returned by a run:
pub type RunOutput = futures::stream::BoxStream<'static, AgentEvent>;
TerminationReason
Why a run terminated. Serialized as { "type": "...", "value": ... }.
#![allow(unused)]
fn main() {
pub struct StoppedReason { pub code: String, pub detail: Option<String> }
pub enum TerminationReason {
NaturalEnd,
BehaviorRequested,
Stopped(StoppedReason),
Cancelled,
Blocked(String),
Suspended,
Error(String),
}
}
ToolCallOutcome
#![allow(unused)]
fn main() {
pub enum ToolCallOutcome {
Succeeded,
Failed,
Suspended,
}
}
TokenUsage
#![allow(unused)]
fn main() {
pub struct TokenUsage {
pub prompt_tokens: Option<i32>,
pub completion_tokens: Option<i32>,
pub total_tokens: Option<i32>,
pub cache_read_tokens: Option<i32>,
pub cache_creation_tokens: Option<i32>,
pub thinking_tokens: Option<i32>,
}
}
All fields are omitted from JSON when None. TokenUsage::default() produces
all None values.
HTTP API
The awaken-server crate (feature flag server) exposes an HTTP API via Axum.
Most responses are JSON. Streaming endpoints use Server-Sent Events (SSE).
This page mirrors the current route tree in crates/awaken-server/src/routes.rs
and crates/awaken-server/src/config_routes.rs.
Health and metrics
| Method | Path | Description |
|---|---|---|
GET | /health | Readiness probe. Checks store connectivity and returns 200 or 503 |
GET | /health/live | Liveness probe. Always returns 200 OK |
GET | /metrics | Prometheus scrape endpoint |
Threads
| Method | Path | Description |
|---|---|---|
GET | /v1/threads | List thread IDs |
POST | /v1/threads | Create a thread. Body: { "title": "..." } |
GET | /v1/threads/summaries | List thread summaries |
GET | /v1/threads/:id | Get a thread by ID |
PATCH | /v1/threads/:id | Update thread metadata |
DELETE | /v1/threads/:id | Delete a thread |
POST | /v1/threads/:id/cancel | Cancel a specific queued or running job addressed by this thread ID. Returns cancel_requested. |
POST | /v1/threads/:id/decision | Submit a HITL decision for a waiting run on this thread |
POST | /v1/threads/:id/interrupt | Interrupt the thread: bumps the thread generation, supersedes all pending queued jobs, and cancels the active run. Returns interrupt_requested with superseded_jobs count. Unlike /cancel, this performs a clean-slate interrupt via mailbox.interrupt(). |
PATCH | /v1/threads/:id/metadata | Alias for thread metadata updates |
GET | /v1/threads/:id/messages | List thread messages |
POST | /v1/threads/:id/messages | Submit messages as a background run on this thread |
POST | /v1/threads/:id/mailbox | Push a message payload to the thread mailbox |
GET | /v1/threads/:id/mailbox | List mailbox jobs for the thread |
GET | /v1/threads/:id/runs | List runs for the thread |
GET | /v1/threads/:id/runs/latest | Get the latest run for the thread |
Runs
| Method | Path | Description |
|---|---|---|
GET | /v1/runs | List runs |
POST | /v1/runs | Start a run and stream events over SSE |
GET | /v1/runs/:id | Get a run record |
POST | /v1/runs/:id/inputs | Submit follow-up input messages as a background run on the same thread |
POST | /v1/runs/:id/cancel | Cancel a run by run ID |
POST | /v1/runs/:id/decision | Submit a HITL decision by run ID |
Config and capabilities
These endpoints are exposed by config_routes(). Read and schema routes require
AppState to be constructed with a config store. Mutation routes additionally
require a config runtime manager so writes can validate and publish a new
registry snapshot. Without the required config wiring, the routes return 400
with config management API not enabled.
| Method | Path | Description |
|---|---|---|
GET | /v1/capabilities | List registered agents, tools, plugins, models, providers, and config namespaces |
GET | /v1/config/:namespace | List entries in a config namespace |
POST | /v1/config/:namespace | Create an entry; the body must contain "id" |
GET | /v1/config/:namespace/:id | Get one config entry |
PUT | /v1/config/:namespace/:id | Replace a config entry |
DELETE | /v1/config/:namespace/:id | Delete a config entry |
GET | /v1/config/:namespace/$schema | Return the JSON Schema for a namespace |
GET | /v1/agents | Convenience alias for /v1/config/agents |
GET | /v1/agents/:id | Convenience alias for /v1/config/agents/:id |
Current built-in namespaces:
agents, models, providers, mcp-servers
AI SDK v6 routes
| Method | Path | Description |
|---|---|---|
POST | /v1/ai-sdk/chat | Start a chat run and stream protocol-encoded events |
POST | /v1/ai-sdk/threads/:thread_id/runs | Start a thread-scoped AI SDK run |
POST | /v1/ai-sdk/agents/:agent_id/runs | Start an agent-scoped AI SDK run |
GET | /v1/ai-sdk/chat/:thread_id/stream | Resume an SSE stream by thread ID |
GET | /v1/ai-sdk/threads/:thread_id/stream | Alias for stream resume by thread ID |
GET | /v1/ai-sdk/threads/:thread_id/messages | List thread messages |
POST | /v1/ai-sdk/threads/:thread_id/cancel | Cancel the active or queued run on a thread |
POST | /v1/ai-sdk/threads/:thread_id/interrupt | Interrupt a thread (bump generation, supersede pending jobs, cancel active run) |
AG-UI routes
| Method | Path | Description |
|---|---|---|
POST | /v1/ag-ui/run | Start an AG-UI run and stream AG-UI events |
POST | /v1/ag-ui/threads/:thread_id/runs | Start a thread-scoped AG-UI run |
POST | /v1/ag-ui/agents/:agent_id/runs | Start an agent-scoped AG-UI run |
POST | /v1/ag-ui/threads/:thread_id/interrupt | Interrupt a thread |
GET | /v1/ag-ui/threads/:id/messages | List thread messages |
A2A routes
| Method | Path | Description |
|---|---|---|
GET | /.well-known/agent-card.json | Get the public/default agent card |
POST | /v1/a2a/message:send | Send a message to the public/default A2A agent |
POST | /v1/a2a/message:stream | Streaming send over SSE |
GET | /v1/a2a/tasks | List A2A tasks |
GET | /v1/a2a/tasks/:task_id | Get task status |
POST | /v1/a2a/tasks/:task_id:cancel | Cancel a task |
POST | /v1/a2a/tasks/:task_id:subscribe | Subscribe to task updates over SSE |
POST | /v1/a2a/tasks/:task_id/pushNotificationConfigs | Create a push notification config |
GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs | List push notification configs |
GET | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Get a push notification config |
DELETE | /v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | Delete a push notification config |
GET | /v1/a2a/extendedAgentCard | Get the extended agent card; returns 501 unless enabled |
POST | /v1/a2a/:tenant/message:send | Send a message to a tenant-scoped agent |
POST | /v1/a2a/:tenant/message:stream | Tenant-scoped streaming send |
GET | /v1/a2a/:tenant/tasks | List tasks for a tenant-scoped agent |
GET | /v1/a2a/:tenant/tasks/:task_id | Get tenant-scoped task status |
POST | /v1/a2a/:tenant/tasks/:task_id:cancel | Cancel a tenant-scoped task |
POST | /v1/a2a/:tenant/tasks/:task_id:subscribe | Subscribe to tenant-scoped task updates |
GET | /v1/a2a/:tenant/extendedAgentCard | Get the tenant-scoped extended agent card |
MCP HTTP routes
| Method | Path | Description |
|---|---|---|
POST | /v1/mcp | MCP JSON-RPC request/response endpoint |
GET | /v1/mcp | Reserved for MCP server-initiated SSE; currently returns 405 |
Common query parameters
- offset — number of items to skip
- limit — maximum items to return, clamped to 1..=200
- cursor — message-history pagination cursor; when provided it takes precedence over offset, and history responses return next_cursor
- status — run filter: running, waiting, or done
- visibility — message filter: omit for external-only, set to all to include internal messages
Error format
Most route groups return:
{ "error": "human-readable message" }
MCP routes return JSON-RPC error objects instead of the generic shape above.
Config
AgentSpec
The serializable agent definition. Can be loaded from JSON/YAML or constructed programmatically via builder methods.
pub struct AgentSpec {
pub id: String,
pub model: String,
pub system_prompt: String,
pub max_rounds: usize, // default: 16
pub max_continuation_retries: usize, // default: 2
pub context_policy: Option<ContextWindowPolicy>,
pub plugin_ids: Vec<String>,
pub active_hook_filter: HashSet<String>,
pub allowed_tools: Option<Vec<String>>,
pub excluded_tools: Option<Vec<String>>,
pub endpoint: Option<RemoteEndpoint>,
pub delegates: Vec<String>,
pub sections: HashMap<String, Value>,
pub registry: Option<String>,
}
Crate path: awaken::registry_spec::AgentSpec (re-exported at awaken::AgentSpec)
Builder methods
AgentSpec::new(id) -> Self
.with_model(model) -> Self
.with_system_prompt(prompt) -> Self
.with_max_rounds(n) -> Self
.with_hook_filter(plugin_id) -> Self
.with_config::<K>(config) -> Result<Self, StateError>
.with_delegate(agent_id) -> Self
.with_endpoint(endpoint) -> Self
.with_section(key, value: Value) -> Self
Typed config access
/// Read a typed plugin config section. Returns default if missing.
fn config<K: PluginConfigKey>(&self) -> Result<K::Config, StateError>
/// Set a typed plugin config section.
fn set_config<K: PluginConfigKey>(&mut self, config: K::Config) -> Result<(), StateError>
ContextWindowPolicy
Controls context window management and auto-compaction.
#![allow(unused)]
fn main() {
#[derive(Default)] pub enum ContextCompactionMode { #[default] KeepRecentRawSuffix, CompactToSafeFrontier }
pub struct ContextWindowPolicy {
pub max_context_tokens: usize, // default: 200_000
pub max_output_tokens: usize, // default: 16_384
pub min_recent_messages: usize, // default: 10
pub enable_prompt_cache: bool, // default: true
pub autocompact_threshold: Option<usize>, // default: None
pub compaction_mode: ContextCompactionMode, // default: KeepRecentRawSuffix
pub compaction_raw_suffix_messages: usize, // default: 2
}
}
ContextCompactionMode
#![allow(unused)]
fn main() {
pub enum ContextCompactionMode {
KeepRecentRawSuffix, // Keep N recent messages raw, compact the rest
CompactToSafeFrontier, // Compact everything up to safe frontier
}
}
InferenceOverride
Per-inference parameter override. All fields are Option; None means “use
agent-level default”. Multiple plugins can emit overrides; fields merge with
last-wins semantics.
#![allow(unused)]
fn main() {
pub enum ReasoningEffort { None, Low, Medium, High, Max, Budget(u32) }
pub struct InferenceOverride {
pub model: Option<String>,
pub fallback_models: Option<Vec<String>>,
pub temperature: Option<f64>,
pub max_tokens: Option<u32>,
pub top_p: Option<f64>,
pub reasoning_effort: Option<ReasoningEffort>,
}
}
Methods
fn is_empty(&self) -> bool
fn merge(&mut self, other: InferenceOverride)
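The last-wins merge semantics can be sketched with a reduced two-field stand-in (the real InferenceOverride has six fields; this mirror is illustration only, not the shipped implementation):

```rust
/// Reduced stand-in for InferenceOverride (illustration only).
#[derive(Default, Clone, Debug, PartialEq)]
struct Override {
    model: Option<String>,
    temperature: Option<f64>,
}

impl Override {
    /// Fields set in `other` win; `None` fields leave `self` untouched.
    fn merge(&mut self, other: Override) {
        if other.model.is_some() {
            self.model = other.model;
        }
        if other.temperature.is_some() {
            self.temperature = other.temperature;
        }
    }
}

fn main() {
    let mut base = Override { model: Some("primary".into()), temperature: None };
    let later = Override { model: None, temperature: Some(0.2) };
    base.merge(later);
    // model survives; temperature is filled in by the later override
    assert_eq!(base.model.as_deref(), Some("primary"));
    assert_eq!(base.temperature, Some(0.2));
}
```

When several plugins emit overrides, applying merge in registration order yields the documented last-wins result per field.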
ReasoningEffort
#![allow(unused)]
fn main() {
pub enum ReasoningEffort {
None,
Low,
Medium,
High,
Max,
Budget(u32),
}
}
PluginConfigKey trait
Binds a string key to a typed configuration struct at compile time.
pub trait PluginConfigKey: 'static + Send + Sync {
const KEY: &'static str;
type Config: Default + Clone + Serialize + DeserializeOwned
+ schemars::JsonSchema + Send + Sync + 'static;
}
Implementations register typed sections in AgentSpec::sections. Plugins read
their configuration via agent_spec.config::<MyConfigKey>().
RemoteEndpoint
Configuration for agents running on external backends. Today Awaken ships the
"a2a" backend; backend-specific settings live under options.
pub struct RemoteEndpoint {
pub backend: String,
pub base_url: String,
pub auth: Option<RemoteAuth>,
pub target: Option<String>,
pub timeout_ms: u64, // default: 300_000
pub options: BTreeMap<String, Value>,
}
pub struct RemoteAuth {
pub r#type: String,
// backend-specific auth fields, e.g. { "token": "..." } for bearer
}
ServerConfig
HTTP server configuration. Used when the server feature is enabled.
pub struct ServerConfig {
pub address: String, // default: "0.0.0.0:3000"
pub sse_buffer_size: usize, // default: 64
pub replay_buffer_capacity: usize, // default: 1024
pub shutdown: ShutdownConfig,
pub max_concurrent_requests: usize, // default: 100
pub a2a_extended_card_bearer_token: Option<String>,
}
pub struct ShutdownConfig {
pub timeout_secs: u64, // default: 30
}
Crate path: awaken_server::app::ServerConfig
| Field | Type | Default | Description |
|---|---|---|---|
address | String | "0.0.0.0:3000" | Socket address the server binds to |
sse_buffer_size | usize | 64 | Maximum SSE channel buffer size per connection |
replay_buffer_capacity | usize | 1024 | Maximum SSE frames buffered per run for reconnection replay |
max_concurrent_requests | usize | 100 | Maximum in-flight requests; excess requests receive 503 |
a2a_extended_card_bearer_token | Option<String> | None | Enables authenticated GET /v1/a2a/extendedAgentCard when set |
shutdown.timeout_secs | u64 | 30 | Seconds to wait for in-flight requests to drain before force-exiting |
MailboxConfig
Configuration for the persistent run queue (mailbox). Controls lease timing, sweep/GC intervals, and retry behavior for failed jobs.
pub struct MailboxConfig {
pub lease_ms: u64, // default: 30_000
pub suspended_lease_ms: u64, // default: 600_000
pub lease_renewal_interval: Duration, // default: 10s
pub sweep_interval: Duration, // default: 30s
pub gc_interval: Duration, // default: 60s
pub gc_ttl: Duration, // default: 24h
pub default_max_attempts: u32, // default: 5
pub default_retry_delay_ms: u64, // default: 250
pub max_retry_delay_ms: u64, // default: 30_000
}
Crate path: awaken_server::mailbox::MailboxConfig
| Field | Type | Default | Description |
|---|---|---|---|
lease_ms | u64 | 30_000 | Lease duration in milliseconds for active runs |
suspended_lease_ms | u64 | 600_000 | Lease duration in milliseconds for suspended runs awaiting human input |
lease_renewal_interval | Duration | 10s | How often the worker renews its lease on a running job |
sweep_interval | Duration | 30s | How often to scan for expired leases and reclaim orphaned jobs |
gc_interval | Duration | 60s | How often to run garbage collection for terminal (completed/failed) jobs |
gc_ttl | Duration | 24h | How long terminal jobs are retained before purging |
default_max_attempts | u32 | 5 | Maximum delivery attempts before a job is dead-lettered |
default_retry_delay_ms | u64 | 250 | Base retry delay in milliseconds between attempts |
max_retry_delay_ms | u64 | 30_000 | Maximum retry delay in milliseconds for exponential backoff |
LlmRetryPolicy
Policy for retrying failed LLM inference calls with exponential backoff and
optional model fallback. Can be set per-agent via the "retry" section in
AgentSpec.
pub struct LlmRetryPolicy {
pub max_retries: u32, // default: 2
pub fallback_models: Vec<String>, // default: []
pub backoff_base_ms: u64, // default: 500
}
Crate path: awaken_runtime::engine::retry::LlmRetryPolicy
| Field | Type | Default | Description |
|---|---|---|---|
max_retries | u32 | 2 | Maximum retry attempts after the initial call (0 = no retry) |
fallback_models | Vec<String> | [] | Model names to try in order after the primary model exhausts retries |
backoff_base_ms | u64 | 500 | Base delay in milliseconds for exponential backoff; actual delay = min(base * 2^attempt, 8000ms). Set to 0 to disable backoff |
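The delay formula from the table (min(base * 2^attempt, 8000ms), with 0 disabling backoff) can be sketched as a small helper; the function name is hypothetical:

```rust
/// Sketch of the documented backoff: delay = min(base * 2^attempt, 8000ms).
/// A base of 0 disables backoff entirely.
fn backoff_delay_ms(base_ms: u64, attempt: u32) -> u64 {
    if base_ms == 0 {
        return 0;
    }
    // Clamp the shift so large attempt counts cannot overflow the u64.
    base_ms.saturating_mul(1u64 << attempt.min(32)).min(8_000)
}

fn main() {
    assert_eq!(backoff_delay_ms(500, 0), 500);   // first retry
    assert_eq!(backoff_delay_ms(500, 3), 4_000); // doubling
    assert_eq!(backoff_delay_ms(500, 4), 8_000); // capped at 8000ms
    assert_eq!(backoff_delay_ms(0, 3), 0);       // backoff disabled
}
```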
AgentSpec integration
Register via the RetryConfigKey plugin config key ("retry" section):
use awaken_runtime::engine::retry::RetryConfigKey;
let spec = AgentSpec::new("my-agent")
.with_config::<RetryConfigKey>(LlmRetryPolicy {
max_retries: 3,
fallback_models: vec!["claude-sonnet-4-20250514".into()],
backoff_base_ms: 1000,
})?;
CircuitBreakerConfig
Per-model circuit breaker configuration. Prevents cascading failures by short-circuiting requests to models that have experienced repeated consecutive failures. After a cooldown the circuit transitions to half-open, allowing limited probe requests before fully closing on success.
pub struct CircuitBreakerConfig {
pub failure_threshold: u32, // default: 5
pub cooldown: Duration, // default: 30s
pub half_open_max: u32, // default: 1
}
Crate path: awaken_runtime::engine::circuit_breaker::CircuitBreakerConfig
| Field | Type | Default | Description |
|---|---|---|---|
failure_threshold | u32 | 5 | Consecutive failures before the circuit opens and rejects requests |
cooldown | Duration | 30s | How long the circuit stays open before transitioning to half-open |
half_open_max | u32 | 1 | Maximum probe requests allowed in the half-open state before the circuit reopens on failure or closes on success |
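The open / half-open / closed behavior described above can be sketched as a minimal state machine; this is an illustrative mirror under the stated semantics, not the implementation in awaken_runtime::engine::circuit_breaker:

```rust
use std::time::{Duration, Instant};

#[derive(Clone, Copy)]
enum State {
    Closed,
    Open(Instant),
    HalfOpen(u32), // probes issued so far
}

struct Breaker {
    state: State,
    consecutive_failures: u32,
    failure_threshold: u32, // default 5
    cooldown: Duration,     // default 30s
    half_open_max: u32,     // default 1
}

impl Breaker {
    /// Should the next request be allowed through?
    fn allow(&mut self) -> bool {
        match self.state {
            State::Closed => true,
            State::Open(since) => {
                if since.elapsed() >= self.cooldown {
                    // Cooldown elapsed: transition to half-open and probe.
                    self.state = State::HalfOpen(1);
                    true
                } else {
                    false // short-circuit while open
                }
            }
            State::HalfOpen(probes) => {
                if probes < self.half_open_max {
                    self.state = State::HalfOpen(probes + 1);
                    true
                } else {
                    false
                }
            }
        }
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = State::Closed; // probe succeeded: close fully
    }

    fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        if matches!(self.state, State::HalfOpen(_))
            || self.consecutive_failures >= self.failure_threshold
        {
            self.state = State::Open(Instant::now()); // (re)open the circuit
        }
    }
}

fn main() {
    let mut b = Breaker {
        state: State::Closed,
        consecutive_failures: 0,
        failure_threshold: 2,
        cooldown: Duration::from_secs(60),
        half_open_max: 1,
    };
    assert!(b.allow());
    b.record_failure();
    assert!(b.allow());
    b.record_failure(); // threshold reached: circuit opens
    assert!(!b.allow());
}
```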
Feature flags and their effects
| Flag | Runtime behavior |
|---|---|
permission | Registers the permission plugin; tools can be gated with HITL approval |
observability | Registers the observability plugin; emits traces and metrics |
mcp | Enables MCP tool bridge; tools from MCP servers are auto-registered |
skills | Enables the skills subsystem for reusable agent capabilities |
server | Builds the HTTP server with SSE streaming and protocol adapters |
generative-ui | Enables generative UI component streaming to frontends |
Custom plugin configuration
Plugins declare typed configuration sections using the PluginConfigKey trait,
which binds a string key to a Rust struct at compile time:
pub trait PluginConfigKey: 'static + Send + Sync {
const KEY: &'static str; // section name in AgentSpec.sections
type Config: Default + Clone + Serialize + DeserializeOwned
+ schemars::JsonSchema + Send + Sync + 'static;
}
Declaring schemas for validation
Plugins override config_schemas() to return JSON Schemas generated from
their config structs. The resolve pipeline (Stage 2) validates every
AgentSpec.sections entry against these schemas before any hook runs.
fn config_schemas(&self) -> Vec<ConfigSchema> {
vec![ConfigSchema {
key: RateLimitConfigKey::KEY,
json_schema: schemars::schema_for!(RateLimitConfig),
}]
}
Reading config at runtime
Plugins read their typed config via agent_spec.config::<K>(). If the section
is absent, the Default impl is returned.
let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;
Worked example
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use awaken::PluginConfigKey;
#[derive(Debug, Clone, Default, Serialize, Deserialize, JsonSchema)]
pub struct RateLimitConfig {
pub max_calls_per_step: u32, // default: 0 (unlimited)
pub cooldown_ms: u64, // default: 0
}
pub struct RateLimitConfigKey;
impl PluginConfigKey for RateLimitConfigKey {
const KEY: &'static str = "rate_limit";
type Config = RateLimitConfig;
}
// In plugin register():
fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
r.register_hook(Phase::BeforeToolExecute, RateLimitHook);
Ok(())
}
fn config_schemas(&self) -> Vec<ConfigSchema> {
vec![ConfigSchema {
key: RateLimitConfigKey::KEY,
json_schema: schemars::schema_for!(RateLimitConfig),
}]
}
// In a hook:
let cfg = ctx.agent_spec().config::<RateLimitConfigKey>()?;
if cfg.max_calls_per_step > 0 { /* enforce limit */ }
Validation behavior
- Section present but invalid: resolve fails with a schema validation error.
- Section present but unclaimed: a warning is logged (possible typo or removed plugin).
- Section absent: allowed; the plugin receives Config::default().
Errors
All error types use thiserror derives and implement std::error::Error +
Display.
StateError
Errors from state management operations. Defined in awaken-contract.
#![allow(unused)]
fn main() {
use awaken::Phase;
pub enum StateError {
RevisionConflict { expected: u64, actual: u64 },
MutationBaseRevisionMismatch { left: u64, right: u64 },
PluginAlreadyInstalled { name: String },
PluginNotInstalled { type_name: &'static str },
KeyAlreadyRegistered { key: String },
UnknownKey { key: String },
KeyDecode { key: String, message: String },
KeyEncode { key: String, message: String },
HandlerAlreadyRegistered { key: String },
EffectHandlerAlreadyRegistered { key: String },
PhaseRunLoopExceeded { phase: Phase, max_rounds: usize },
UnknownScheduledActionHandler { key: String },
UnknownEffectHandler { key: String },
ParallelMergeConflict { key: String },
ToolAlreadyRegistered { tool_id: String },
Cancelled,
}
}
Crate path: awaken::StateError
StateError implements Clone and PartialEq.
ToolError
Errors returned from Tool::validate_args or Tool::execute. A ToolError
aborts the tool call entirely (as opposed to ToolResult::error, which sends
the failure back to the LLM).
#![allow(unused)]
fn main() {
pub enum ToolError {
InvalidArguments(String),
ExecutionFailed(String),
Denied(String),
NotFound(String),
Internal(String),
}
}
Crate path: awaken::contract::tool::ToolError
BuildError
Errors from AgentRuntimeBuilder::build().
#![allow(unused)]
fn main() {
use awaken::StateError;
struct DiscoveryError;
pub enum BuildError {
State(StateError),
AgentRegistryConflict(String),
ToolRegistryConflict(String),
ModelRegistryConflict(String),
ProviderRegistryConflict(String),
PluginRegistryConflict(String),
ValidationFailed(String),
DiscoveryFailed(DiscoveryError), // requires feature "a2a"
}
}
Crate path: awaken::BuildError
BuildError converts from StateError via From.
RuntimeError
Errors from agent runtime operations (resolving agents, starting runs).
#![allow(unused)]
fn main() {
use awaken::StateError;
pub enum RuntimeError {
State(StateError),
ThreadAlreadyRunning { thread_id: String },
AgentNotFound { agent_id: String },
ResolveFailed { message: String },
}
}
Crate path: awaken::RuntimeError
RuntimeError converts from StateError via From. Implements Clone and
PartialEq.
InferenceExecutionError
Errors from the LLM execution layer.
#![allow(unused)]
fn main() {
pub enum InferenceExecutionError {
Provider(String),
RateLimited(String),
Timeout(String),
Cancelled,
}
}
Crate path: awaken::contract::executor::InferenceExecutionError
StorageError
Errors returned by ThreadStore, RunStore, and ThreadRunStore operations.
#![allow(unused)]
fn main() {
pub enum StorageError {
NotFound(String),
AlreadyExists(String),
VersionConflict { expected: u64, actual: u64 },
Io(String),
Serialization(String),
}
}
Crate path: awaken::contract::storage::StorageError
ResolveError
Errors from the agent resolution pipeline (resolving AgentSpec to a runnable
ResolvedAgent).
#![allow(unused)]
fn main() {
use awaken::StateError;
pub enum ResolveError {
AgentNotFound(String),
ModelNotFound(String),
ProviderNotFound(String),
PluginNotFound(String),
InvalidPluginConfig { plugin: String, key: String, message: String },
RemoteAgentNotDirectlyRunnable(String),
ToolIdConflict { tool_id: String, source_a: String, source_b: String },
EnvBuild(StateError),
}
}
Crate path: awaken::registry::resolve::ResolveError
UnknownKeyPolicy
Controls behavior when encountering an unknown state key during deserialization.
#![allow(unused)]
fn main() {
pub enum UnknownKeyPolicy {
Error,
Skip,
}
}
Crate path: awaken::UnknownKeyPolicy
AI SDK v6 Protocol
The AI SDK v6 adapter translates Awaken’s internal AgentEvent stream into the Vercel AI SDK v6 UI Message Stream format. This allows any AI SDK-compatible frontend (useChat, useAssistant) to consume agent output without custom parsing.
Endpoint
POST /v1/ai-sdk/chat
Request Body
{
"messages": [{ "role": "user", "content": "Hello" }],
"threadId": "optional-thread-id",
"agentId": "optional-agent-id"
}
| Field | Type | Required | Description |
|---|---|---|---|
messages | AiSdkMessage[] | yes | Chat messages. Content may be a string or an array of content parts. |
threadId | string | no | Existing thread to continue. Omit to create a new thread. |
agentId | string | no | Target agent. Uses the default agent when omitted. |
Response
SSE stream (text/event-stream). Each line is a JSON-encoded UIStreamEvent.
Auxiliary Routes
| Route | Method | Description |
|---|---|---|
/v1/ai-sdk/streams/:run_id | GET | Reconnect to an active run’s SSE stream. |
/v1/ai-sdk/runs/:run_id/stream | GET | Alias for stream reconnect. |
/v1/ai-sdk/threads/:id/messages | GET | Retrieve thread message history. |
Event Mapping
The AiSdkEncoder is a stateful transcoder that converts AgentEvent variants into UIStreamEvent variants. It tracks open text blocks and reasoning blocks across tool-call boundaries.
| AgentEvent | UIStreamEvent(s) |
|---|---|
RunStart | MessageStart + Data("run-info", ...) |
TextDelta | TextStart (if block not open) + TextDelta |
ReasoningDelta | ReasoningStart (if block not open) + ReasoningDelta |
ReasoningEncryptedValue | ReasoningStart (if not open) + ReasoningDelta |
ToolCallStart | Close open text/reasoning blocks, then ToolCallStart |
ToolCallDelta | ToolCallDelta |
ToolCallDone | ToolCallEnd |
StepStart | (no direct mapping) |
StepEnd | (no direct mapping) |
InferenceComplete | Data("inference-complete", ...) |
MessagesSnapshot | Data("messages-snapshot", ...) |
StateSnapshot | Data("state-snapshot", ...) |
StateDelta | Data("state-delta", ...) |
ActivitySnapshot | Data("activity-snapshot", ...) |
ActivityDelta | Data("activity-delta", ...) |
RunFinish | Close open blocks, Data("finish", ...), Finish |
UIStreamEvent Types
The wire format uses the type field as a discriminant, serialized in kebab-case:
- start – message lifecycle start, carries optional messageId and messageMetadata
- text-start, text-delta, text-end – text block lifecycle with content ID
- reasoning-start, reasoning-delta, reasoning-end – reasoning block lifecycle
- tool-call-start, tool-call-delta, tool-call-end – tool call lifecycle
- data – arbitrary named data events (state snapshots, activity, inference metadata)
- finish – terminal event with finish reason and usage summary
Text Block Lifecycle
The encoder automatically manages text block open/close boundaries:
- First TextDelta opens a text block (TextStart).
- Subsequent deltas append to the open block.
- When a ToolCallStart arrives, the encoder closes any open text or reasoning block before emitting tool events.
- After tool execution completes, new text deltas open a fresh block with an incremented ID.
This ensures the frontend receives well-formed block boundaries even though the runtime emits flat delta events.
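The block-tracking rule can be sketched as a tiny state machine; the event names here are simplified stand-ins for the real UIStreamEvent variants, and the structure is an illustration of the lifecycle described above, not the shipped encoder:

```rust
/// Simplified stand-ins for the real UIStreamEvent variants.
#[derive(Debug, PartialEq)]
enum Out {
    TextStart(u32),
    TextDelta(u32, String),
    TextEnd(u32),
    ToolCallStart,
}

/// Minimal sketch: open a block on the first delta, close it before
/// tool events, reopen with a fresh ID afterwards.
#[derive(Default)]
struct Encoder {
    open_block: Option<u32>,
    next_id: u32,
    out: Vec<Out>,
}

impl Encoder {
    fn on_text_delta(&mut self, delta: &str) {
        let id = match self.open_block {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.open_block = Some(id);
                self.out.push(Out::TextStart(id)); // first delta opens a block
                id
            }
        };
        self.out.push(Out::TextDelta(id, delta.to_string()));
    }

    fn on_tool_call_start(&mut self) {
        if let Some(id) = self.open_block.take() {
            self.out.push(Out::TextEnd(id)); // close before tool events
        }
        self.out.push(Out::ToolCallStart);
    }
}

fn main() {
    let mut enc = Encoder::default();
    enc.on_text_delta("Hel");
    enc.on_text_delta("lo");
    enc.on_tool_call_start();
    enc.on_text_delta("done");
    assert_eq!(enc.out[0], Out::TextStart(0));
    assert_eq!(enc.out[3], Out::TextEnd(0));
    assert_eq!(enc.out[4], Out::ToolCallStart);
    assert_eq!(enc.out[5], Out::TextStart(1)); // fresh block, incremented ID
}
```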
Related
- Events – full AgentEvent enum
- HTTP API – server configuration
- Integrate AI SDK Frontend – frontend integration guide
AG-UI Protocol
The AG-UI adapter translates Awaken’s AgentEvent stream into the AG-UI (CopilotKit) event format. This enables CopilotKit frontends to drive Awaken agents with no custom adapter code.
Endpoint
POST /v1/ag-ui/run
Request Body
{
"threadId": "optional-thread-id",
"agentId": "optional-agent-id",
"messages": [{ "role": "user", "content": "Hello" }],
"context": {}
}
| Field | Type | Required | Description |
|---|---|---|---|
messages | AgUiMessage[] | yes | Chat messages with role and content strings. |
threadId | string | no | Existing thread. Omit to create a new thread. |
agentId | string | no | Target agent. Uses the default agent when omitted. |
context | object | no | CopilotKit context forwarding (reserved). |
Response
SSE stream (text/event-stream). Each frame is a JSON-encoded AG-UI Event.
Auxiliary Routes
| Route | Method | Description |
|---|---|---|
/v1/ag-ui/threads/:id/messages | GET | Retrieve thread message history. |
Event Mapping
The AgUiEncoder is a stateful transcoder that manages text message and step lifecycles.
| AgentEvent | AG-UI Event(s) |
|---|---|
RunStart | RUN_STARTED |
TextDelta | TEXT_MESSAGE_START (if not open) + TEXT_MESSAGE_CONTENT |
ReasoningDelta | REASONING_MESSAGE_START (if not open) + REASONING_MESSAGE_CONTENT |
ToolCallStart | Close text/reasoning, STEP_STARTED, TOOL_CALL_START |
ToolCallDelta | TOOL_CALL_ARGS |
ToolCallDone | TOOL_CALL_END, STEP_FINISHED |
StateSnapshot | STATE_SNAPSHOT |
StateDelta | STATE_DELTA |
RunFinish (success) | Close text/reasoning, RUN_FINISHED |
RunFinish (error) | Close text/reasoning, RUN_ERROR |
AG-UI Event Types
Events use an uppercase type discriminant:
- RUN_STARTED / RUN_FINISHED / RUN_ERROR – run lifecycle
- TEXT_MESSAGE_START / TEXT_MESSAGE_CONTENT / TEXT_MESSAGE_END – assistant text
- REASONING_MESSAGE_START / REASONING_MESSAGE_CONTENT / REASONING_MESSAGE_END – reasoning trace
- STEP_STARTED / STEP_FINISHED – step boundaries (wrapping tool calls)
- TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END – tool call lifecycle
- STATE_SNAPSHOT / STATE_DELTA – shared state synchronization
- MESSAGES_SNAPSHOT – full thread message history
All events carry a BaseEvent with optional timestamp and rawEvent fields.
Roles
AG-UI messages use lowercase role strings: system, user, assistant, tool.
Text Message Lifecycle
- First TextDelta emits TEXT_MESSAGE_START followed by TEXT_MESSAGE_CONTENT.
- Subsequent deltas emit only TEXT_MESSAGE_CONTENT.
- A ToolCallStart or RunFinish closes the open message with TEXT_MESSAGE_END.
Reasoning messages follow the same pattern with REASONING_MESSAGE_* events.
Related
- Events – full AgentEvent enum
- Integrate CopilotKit (AG-UI) – frontend integration guide
A2A Protocol
The Agent-to-Agent (A2A) adapter implements the A2A protocol for remote agent discovery, task delegation, and inter-agent communication.
Feature gate: server
Endpoints
| Route | Method | Description |
|---|---|---|
/.well-known/agent-card.json | GET | Discovery endpoint for the public/default agent card. |
/v1/a2a/message:send | POST | Send a message to the public/default A2A agent. Returns a task wrapper. |
/v1/a2a/message:stream | POST | Streaming send over SSE. |
/v1/a2a/tasks | GET | List A2A tasks. |
/v1/a2a/tasks/:task_id | GET | Poll task status by ID. |
/v1/a2a/tasks/:task_id:cancel | POST | Cancel a running task. |
/v1/a2a/tasks/:task_id:subscribe | POST | Subscribe to task updates over SSE. |
/v1/a2a/tasks/:task_id/pushNotificationConfigs | POST | Create a push notification config. |
/v1/a2a/tasks/:task_id/pushNotificationConfigs | GET | List push notification configs. |
/v1/a2a/tasks/:task_id/pushNotificationConfigs/:config_id | GET / DELETE | Read or delete a push notification config. |
/v1/a2a/extendedAgentCard | GET | Extended agent card. Returns 501 unless capabilities.extendedAgentCard=true. |
Tenant-scoped variants mirror the same interface under /v1/a2a/:tenant/..., for example /v1/a2a/research/message:send and /v1/a2a/research/tasks/:task_id.
Agent Card
The discovery endpoint returns an AgentCard describing the exposed interface and capabilities:
{
"name": "My Agent",
"description": "A helpful assistant",
"supportedInterfaces": [
{
"url": "https://example.com/v1/a2a",
"protocolBinding": "HTTP+JSON",
"protocolVersion": "1.0"
}
],
"version": "1.0.0",
"capabilities": {
"streaming": true,
"pushNotifications": true,
"stateTransitionHistory": false,
"extendedAgentCard": false
},
"defaultInputModes": ["text/plain"],
"defaultOutputModes": ["text/plain"],
"skills": [
{
"id": "general",
"name": "General Q&A",
"description": "Answer general questions",
"tags": ["qa"],
"inputModes": ["text/plain"],
"outputModes": ["text/plain"]
}
]
}
Agent cards are derived from registered AgentSpec entries. The top-level legacy url/id fields are not emitted.
Message Send
{
"message": {
"taskId": "optional-client-provided-id",
"contextId": "optional-client-provided-id",
"messageId": "msg-123",
"role": "ROLE_USER",
"parts": [{ "text": "Summarize this document" }]
},
"configuration": {
"returnImmediately": true
}
}
The server maps A2A tasks to Awaken thread/mailbox execution. The response uses the v1 task wrapper shape:
{
"task": {
"id": "optional-client-provided-id",
"contextId": "optional-client-provided-id",
"status": {
"state": "TASK_STATE_SUBMITTED"
}
}
}
If returnImmediately is omitted or false, the adapter waits for a terminal/interrupted task state before responding.
Task Status
GET /v1/a2a/tasks/:task_id returns a Task resource:
{
"id": "abc-123",
"contextId": "abc-123",
"status": {
"state": "TASK_STATE_COMPLETED",
"message": {
"messageId": "msg-response",
"role": "ROLE_AGENT",
"parts": [{ "text": "..." }]
}
},
"history": []
}
Task states follow the v1 enum names such as TASK_STATE_SUBMITTED, TASK_STATE_WORKING, TASK_STATE_COMPLETED, TASK_STATE_FAILED, and TASK_STATE_CANCELED.
Optional capability defaults
Awaken currently enables these A2A capabilities by default:
- streaming = true
- pushNotifications = true
extendedAgentCard remains opt-in and is enabled only when ServerConfig.a2a_extended_card_bearer_token is configured. When disabled, the extended card endpoints return spec-shaped unsupported errors.
Remote Agent Delegation
Awaken agents can delegate to remote A2A agents via AgentTool::remote(). The A2aBackend sends a message:send request to the remote endpoint, reads the returned task.id, then polls /tasks/:task_id for completion. From the LLM’s perspective, this is a regular tool call — the A2A transport is transparent.
Configuration for remote agents is declared in AgentSpec. RemoteEndpoint is generic, and A2A uses backend: "a2a":
{
"id": "remote-researcher",
"endpoint": {
"backend": "a2a",
"base_url": "https://remote-agent.example.com/v1/a2a",
"auth": { "type": "bearer", "token": "..." },
"target": "researcher",
"timeout_ms": 300000,
"options": {
"poll_interval_ms": 1000
}
}
}
Agents with an endpoint field are resolved as remote backend agents. Today the built-in delegate backend is A2A. Agents without endpoint run locally.
Related
- Multi-Agent Patterns — delegation and handoff design
- A2A Specification — official protocol reference
Cancellation
Cooperative cancellation token used to interrupt agent runs, streaming loops, and long-running operations.
CancellationToken
A cloneable handle backed by a shared AtomicBool and a tokio::sync::Notify.
All clones share the same cancellation state – cancelling any clone cancels all of them.
use awaken::CancellationToken;
let token = CancellationToken::new();
Crate path: awaken_contract::cancellation::CancellationToken
Methods
/// Create a new uncancelled token.
pub fn new() -> Self
/// Signal cancellation. Wakes all async waiters immediately.
pub fn cancel(&self)
/// Check if cancellation has been requested (synchronous poll).
pub fn is_cancelled(&self) -> bool
/// Wait until cancellation is signalled. Resolves immediately if already cancelled.
pub async fn cancelled(&self)
Traits
- Clone – clones share the same underlying state
- Default – creates an uncancelled token (equivalent to new())
Synchronous polling
Use is_cancelled() to check cancellation from synchronous code or tight loops:
let token = CancellationToken::new();
while !token.is_cancelled() {
// do work
}
Async waiting with tokio::select!
Use cancelled() in a tokio::select! branch to interrupt async operations
without polling:
let token = CancellationToken::new();
tokio::select! {
result = some_async_work() => {
// work completed before cancellation
}
_ = token.cancelled() => {
// cancellation was signalled
}
}
The cancelled() future resolves immediately if the token is already cancelled,
so there is no race between checking and waiting.
Cooperative semantics
Cancellation is cooperative. Calling cancel() sets a flag and wakes async
waiters, but does not abort any running task. Code must check is_cancelled()
or select! on cancelled() to observe and respond to cancellation.
Key properties:
- Idempotent – calling cancel() multiple times is safe and has no additional effect.
- Shared – all clones observe the same cancellation state. Cancelling from any clone is visible to all others.
- Ordering – uses SeqCst ordering on the atomic flag, so cancellation is immediately visible across threads.
- Immediate wake – cancel() calls Notify::notify_waiters(), waking all tasks blocked on cancelled() without waiting for the next poll.
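The polling half of the token can be sketched with only the standard library. This is a stripped-down illustration, not the actual implementation – the real CancellationToken pairs the flag with tokio::sync::Notify to support async waiting via cancelled():

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

/// Minimal cooperative cancellation token: all clones share one flag.
/// (Sketch only -- the real token adds async waking via Notify.)
#[derive(Clone, Default)]
pub struct MiniToken {
    cancelled: Arc<AtomicBool>,
}

impl MiniToken {
    pub fn new() -> Self {
        Self::default()
    }

    /// Idempotent: setting the flag twice has no extra effect.
    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }

    /// Synchronous poll, safe from any thread.
    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::SeqCst)
    }
}
```

Cloning copies the Arc, so every handle observes the same flag – the property the runtime relies on when it hands clones to the loop, the executor, and external transports.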
Runtime usage
The runtime passes a CancellationToken into each agent run. It is used to:
- Interrupt streaming inference mid-response when the caller requests cancellation.
- Stop the agent loop between inference cycles.
- Propagate cancellation from external transports (HTTP, SSE) into the runtime.
A typical streaming loop checks cancellation alongside each chunk:
let token = CancellationToken::new();
let clone = token.clone();
tokio::select! {
_ = async {
while let Some(chunk) = stream.next().await {
// process chunk
}
} => {}
_ = clone.cancelled() => {
// stop processing, clean up
}
}
Auto-cancellation on new message
When a new message is submitted to a thread that already has an active run (e.g. a run suspended while waiting for tool approval), the runtime automatically cancels the old run before starting the new one.
This prevents the ThreadAlreadyRunning error and avoids infinite retry
loops. The sequence is:
1. Mailbox::submit() detects an active run on the thread.
2. It calls AgentRuntime::cancel_and_wait_by_thread(), which signals the CancellationToken and waits (up to 5 seconds) for the RunSlotGuard to drop and free the thread slot.
3. The old run emits RunFinish with TerminationReason::Cancelled.
4. Before the new run starts inference, strip_unpaired_tool_calls() removes any orphaned tool calls from the message history that were left by the cancelled run (assistant messages with tool_calls that have no matching Tool-role response).
This ensures clean handoff between runs without leaving dangling state that would confuse the LLM.
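The stripping step can be illustrated with a simplified message model. Msg and strip_unpaired below are stand-ins for the runtime's actual message types and strip_unpaired_tool_calls(), not the real code:

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Msg {
    /// Assistant turn, possibly carrying requested tool calls.
    Assistant { tool_call_ids: Vec<String> },
    /// Tool-role response answering one tool call.
    Tool { call_id: String },
}

/// Drop tool_calls on assistant messages that have no matching
/// Tool-role response (e.g. left behind by a cancelled run).
fn strip_unpaired(history: &mut [Msg]) {
    // Collect the IDs that actually received a Tool-role answer.
    let answered: HashSet<String> = history
        .iter()
        .filter_map(|m| match m {
            Msg::Tool { call_id } => Some(call_id.clone()),
            _ => None,
        })
        .collect();
    // Retain only the paired tool calls on each assistant message.
    for m in history.iter_mut() {
        if let Msg::Assistant { tool_call_ids } = m {
            tool_call_ids.retain(|id| answered.contains(id));
        }
    }
}
```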
Key Files
- crates/awaken-runtime/src/cancellation.rs – CancellationToken implementation
- crates/awaken-runtime/src/runtime/agent_runtime/active_registry.rs – run tracking with completion notification
- crates/awaken-runtime/src/runtime/agent_runtime/runner.rs – strip_unpaired_tool_calls()
Tool Execution Modes
ToolExecutionMode controls how the runtime executes tool calls that the LLM
requests in a single inference step.
Crate path: awaken::contract::executor::ToolExecutionMode
ToolExecutionMode
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub enum ToolExecutionMode {
#[default]
Sequential,
ParallelBatchApproval,
ParallelStreaming,
}
The default is Sequential.
Modes
Sequential
Executes tool calls one at a time in the order the LLM returned them.
- The state snapshot is refreshed between calls (requires_incremental_state returns true), so each tool sees the effects of the previous one.
- Stops at the first suspension. If tool call 2 of 4 suspends, tool calls 3 and 4 are not executed. Failures do not stop execution.
- Simplest mode. Use when tool calls have data dependencies or when ordering matters.
// SequentialToolExecutor runs calls in order, stopping at first suspension.
let executor = SequentialToolExecutor;
let results = executor.execute(&tools, &calls, &ctx).await?;
ParallelBatchApproval
Executes all tool calls concurrently. All tools see the same frozen state snapshot.
- Suspension decisions are replayed using DecisionReplayPolicy::BatchAllSuspended: the runtime waits until every suspended call has a decision before replaying any of them.
- Enforces parallel patch conflict checks (requires_conflict_check returns true).
- Does not stop on suspension or failure; all calls run to completion.
- Use when tool calls are independent and you want an all-or-nothing approval gate for HITL workflows.
let executor = ParallelToolExecutor::batch_approval();
// decision_replay_policy() == DecisionReplayPolicy::BatchAllSuspended
ParallelStreaming
Executes all tool calls concurrently with the same frozen state snapshot.
- Suspension decisions are replayed using DecisionReplayPolicy::Immediate: each decision is replayed as soon as it arrives, without waiting for the others.
- Enforces parallel patch conflict checks.
- Does not stop on suspension or failure; all calls run to completion.
- Use when tool calls are independent and you want the fastest end-to-end completion without batching approval.
let executor = ParallelToolExecutor::streaming();
// decision_replay_policy() == DecisionReplayPolicy::Immediate
Comparison
| Behavior | Sequential | ParallelBatchApproval | ParallelStreaming |
|---|---|---|---|
| Execution order | One at a time | All concurrently | All concurrently |
| State freshness | Refreshed between calls | Frozen snapshot | Frozen snapshot |
| Stops on suspension | Yes (first suspension) | No | No |
| Stops on failure | No | No | No |
| Decision replay | N/A | Batch (all at once) | Immediate (one by one) |
| Conflict checks | No | Yes | Yes |
Executor trait
Both SequentialToolExecutor and ParallelToolExecutor implement the
ToolExecutor trait:
#[async_trait]
pub trait ToolExecutor: Send + Sync {
/// Execute tool calls and return results.
async fn execute(
&self,
tools: &HashMap<String, Arc<dyn Tool>>,
calls: &[ToolCall],
base_ctx: &ToolCallContext,
) -> Result<Vec<ToolExecutionResult>, ToolExecutorError>;
/// Strategy name for logging.
fn name(&self) -> &'static str;
/// Whether the executor needs state refreshed between individual tool calls.
fn requires_incremental_state(&self) -> bool { false }
}
The name() values are "sequential", "parallel_batch_approval", and
"parallel_streaming".
DecisionReplayPolicy
Controls when resume decisions for suspended tool calls are replayed into the execution pipeline. Only relevant for parallel modes.
pub enum DecisionReplayPolicy {
/// Replay each resolved suspended call as soon as its decision arrives.
Immediate,
/// Replay only when all currently suspended calls have decisions.
BatchAllSuspended,
}
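The gating difference between the two policies can be sketched as a pure function over sets of call IDs. replayable below is a hypothetical helper for illustration, not the executor's actual API:

```rust
use std::collections::HashSet;

#[derive(Clone, Copy)]
enum DecisionReplayPolicy {
    Immediate,
    BatchAllSuspended,
}

/// Given the suspended calls and the subset that already have a
/// decision, return the calls that may be replayed right now.
fn replayable(
    policy: DecisionReplayPolicy,
    suspended: &HashSet<&str>,
    decided: &HashSet<&str>,
) -> Vec<String> {
    let mut ready: Vec<String> = match policy {
        // Replay each decided call as soon as its decision arrives.
        DecisionReplayPolicy::Immediate => {
            decided.iter().map(|s| s.to_string()).collect()
        }
        // Replay nothing until every suspended call has a decision.
        DecisionReplayPolicy::BatchAllSuspended => {
            if suspended.is_subset(decided) {
                suspended.iter().map(|s| s.to_string()).collect()
            } else {
                Vec::new()
            }
        }
    };
    ready.sort(); // deterministic output for callers
    ready
}
```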
Key Files
- crates/awaken-contract/src/contract/executor.rs – ToolExecutionMode enum
- crates/awaken-runtime/src/execution/executor.rs – SequentialToolExecutor, ParallelToolExecutor, ToolExecutor trait
Related
Architecture
Awaken is organized around one runtime core plus three surrounding surfaces: contract types, server/storage adapters, and optional extensions. The important distinction is not just crate boundaries, but where decisions are made.
Application assembly
register tools / models / providers / plugins / AgentSpec
|
v
AgentRuntime
resolve AgentSpec -> ResolvedAgent
build ExecutionEnv from plugins
run the phase loop
expose cancel / decision control for active runs
|
v
Server + storage surfaces
HTTP routes, mailbox, SSE replay, protocol adapters,
thread/run persistence, profile storage
Contract layer – awaken-contract defines the shared types used everywhere: AgentSpec, ModelSpec, ProviderSpec, Tool, AgentEvent, transport traits, and the typed state model. This is the vocabulary that the rest of the system speaks.
Runtime core – awaken-runtime is the orchestration layer. It resolves agent IDs to fully wired configurations (ResolvedAgent), builds an ExecutionEnv from plugins, manages active runs, and delegates execution to the loop runner plus phase engine.
Server and persistence surfaces – awaken-server turns the runtime into HTTP and SSE endpoints, mailbox-backed background execution, and protocol adapters. awaken-stores provides concrete persistence backends for threads and runs. awaken-ext-* crates extend the runtime at phase and tool boundaries without changing the core loop.
Request Sequence
The following diagram shows a representative request flowing through the system:
sequenceDiagram
participant Client
participant Server
participant Runtime
participant LLM
participant Tool
Client->>Server: POST /v1/ai-sdk/chat
Server->>Runtime: RunRequest (agent_id, thread_id, messages)
Runtime->>Runtime: Resolve agent (AgentSpec -> ResolvedAgent)
Runtime->>Runtime: Load thread history
Runtime->>Runtime: RunStart phase
loop Step loop
Runtime->>Runtime: StepStart phase
Runtime->>Runtime: BeforeInference phase
Runtime->>LLM: Inference request (messages + tools)
LLM-->>Runtime: Response (text + tool_calls)
Runtime->>Runtime: AfterInference phase
opt Tool calls present
Runtime->>Runtime: BeforeToolExecute phase
Runtime->>Tool: execute(args, ctx)
Tool-->>Runtime: ToolResult
Runtime->>Runtime: AfterToolExecute phase
end
Runtime->>Runtime: StepEnd phase (checkpoint)
end
Runtime->>Runtime: RunEnd phase
Runtime-->>Server: AgentEvent stream
Server-->>Client: SSE events (protocol-specific encoding)
Phase-Driven Execution Loop
Every run proceeds through a fixed sequence of phases. Plugins register hooks that run at each phase boundary, giving them control over inference parameters, tool execution, state mutations, and termination logic.
RunStart -> [StepStart -> BeforeInference -> AfterInference
-> BeforeToolExecute -> AfterToolExecute -> StepEnd]* -> RunEnd
The step loop repeats until one of these conditions fires:
- The LLM returns a response with no tool calls (NaturalEnd).
- A plugin or stop condition requests termination (Stopped, BehaviorRequested).
- A tool call suspends waiting for external input (Suspended).
- The run is cancelled externally (Cancelled).
- An error occurs (Error).
At each phase boundary, the loop checks the cancellation token and the run lifecycle state before proceeding.
Repository Map
awaken
├─ awaken-contract
│ ├─ registry specs
│ ├─ tool / executor / event / transport contracts
│ └─ state model
├─ awaken-runtime
│ ├─ builder + registries + resolve pipeline
│ ├─ AgentRuntime control plane
│ ├─ loop_runner + phase engine
│ ├─ execution / context / policies / profile
│ └─ runtime extensions (handoff, local A2A, background)
├─ awaken-server
│ ├─ routes + config API + mailbox + services
│ ├─ protocols: ai_sdk_v6 / ag_ui / a2a / mcp / acp-stdio
│ └─ transport: SSE relay / replay buffer / transcoder
├─ awaken-stores
└─ awaken-ext-*
Design Intent
Three principles guide the architecture:
Snapshot isolation – Phase hooks never see partially applied state. They read from an immutable snapshot and write to a MutationBatch. The batch is applied atomically after all hooks for a phase have converged. This eliminates data races between concurrent hooks and makes hook execution order irrelevant for correctness.
Append-style persistence – Thread messages are append-only. State is checkpointed at step boundaries. This makes it possible to replay a run from any checkpoint and produces a deterministic audit trail.
Transport independence – The runtime emits AgentEvent values through an EventSink trait. Protocol adapters (AiSdkEncoder, AgUiEncoder) transcode these events into wire formats. The runtime has no knowledge of HTTP, SSE, or any specific protocol. Adding a new protocol means implementing a new encoder – the runtime does not change.
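A minimal sketch of that boundary (stand-in types – the real AgentEvent enum and the AiSdkEncoder/AgUiEncoder transcoders are far richer):

```rust
/// Simplified stand-in for the runtime's event type.
enum AgentEvent {
    TextDelta(String),
    RunFinish,
}

/// The runtime only knows this boundary; wire formats live behind it.
trait EventEncoder {
    fn encode(&self, event: &AgentEvent) -> String;
}

/// One hypothetical adapter using AG-UI-style uppercase discriminants.
struct UppercaseEncoder;

impl EventEncoder for UppercaseEncoder {
    fn encode(&self, event: &AgentEvent) -> String {
        match event {
            AgentEvent::TextDelta(delta) => format!("TEXT_MESSAGE_CONTENT:{delta}"),
            AgentEvent::RunFinish => "RUN_FINISHED".to_string(),
        }
    }
}
```

A second protocol would add another EventEncoder impl; nothing on the emitting side changes.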
See Also
- Run Lifecycle and Phases – phase execution model
- State and Snapshot Model – snapshot isolation details
- Design Tradeoffs – rationale for key architectural decisions
- Tool and Plugin Boundary – plugin vs tool design
Agent Resolution
When runtime.run(request) is called, the agent_id in the request must be resolved into a fully wired ResolvedAgent – a struct that holds live references to an LLM executor, tools, plugins, and an execution environment. This resolution happens on every resolve() call; nothing is cached or shared between runs. This page describes the three-stage resolution pipeline and the builder that feeds it.
Pipeline Overview
Resolution is a pure function: (RegistrySet, agent_id) -> ResolvedAgent. It proceeds through three sequential stages:
flowchart LR
subgraph Stage1["Stage 1: Lookup"]
A1[Fetch AgentSpec] --> A2[Resolve ModelSpec]
A2 --> A3[Get LlmExecutor]
A3 --> A4{Retry config?}
A4 -- yes --> A5[Wrap in RetryingExecutor]
A4 -- no --> A6[Use executor directly]
end
subgraph Stage2["Stage 2: Plugin Pipeline"]
B1[Resolve plugin IDs] --> B2[Inject default plugins]
B2 --> B3{context_policy set?}
B3 -- yes --> B4[Add CompactionPlugin + ContextTransformPlugin]
B3 -- no --> B5[Skip]
B4 --> B6[Validate config sections]
B5 --> B6
B6 --> B7[Build ExecutionEnv]
end
subgraph Stage3["Stage 3: Tool Pipeline"]
C1[Collect global tools] --> C2[Merge delegate agent tools]
C2 --> C3[Merge plugin-registered tools]
C3 --> C4[Check for ID conflicts]
C4 --> C5[Apply allowed/excluded filters]
end
Stage1 --> Stage2 --> Stage3
Any failure at any stage produces a ResolveError and aborts. The pipeline never returns a partial result.
Stage 1: Lookup
The first stage fetches the raw data from registries:
- AgentSpec – looked up from AgentSpecRegistry by agent_id. If the spec has an endpoint field (remote backend agent), resolution fails with RemoteAgentNotDirectlyRunnable – remote agents can only be used as delegates, not run directly.
- ModelSpec – the spec’s model field (a string ID like "gpt-4") is resolved through ModelRegistry into a ModelSpec, which maps it to a provider ID and an actual API model name (for example, provider "openai", model "gpt-4o").
- LlmExecutor – the provider ID from the model entry is resolved through ProviderRegistry to get a live LlmExecutor instance.
- Retry decoration – if the agent spec contains a RetryConfigKey section with max_retries > 0 or non-empty fallback_models, the executor is wrapped in a RetryingExecutor decorator.
Stage 2: Plugin Pipeline
The second stage assembles the plugin chain and builds the execution environment.
Plugin resolution
Plugins listed in AgentSpec.plugin_ids are resolved by ID from PluginSource. A missing plugin produces ResolveError::PluginNotFound.
Default plugin injection
After resolving user-declared plugins, the pipeline injects runtime-required default plugins. These are always present regardless of agent configuration:
- LoopActionHandlersPlugin – registers the core action handlers that the runtime loop uses to process tool calls, emit events, and manage step transitions. Without this plugin, the loop cannot function.
- MaxRoundsPlugin – enforces the max_rounds stop condition configured on the agent spec. Injected with the spec’s max_rounds value. Prevents runaway loops.
Conditional plugins
These plugins are added only when AgentSpec.context_policy is set:
- CompactionPlugin – manages context window compaction (summarization of old messages when the context grows too large). Created with the CompactionConfigKey section from the spec, falling back to defaults if absent.
- ContextTransformPlugin – applies context window policy transforms (token counting, truncation, prompt caching) before each inference request. Created with the context_policy value.
Building ExecutionEnv
After the plugin list is finalized, ExecutionEnv::from_plugins() calls each plugin’s register() method with a PluginRegistrar. Plugins use the registrar to declare:
- Phase hooks (per-phase callbacks)
- Scheduled action handlers
- Effect handlers
- Request transforms
- State key registrations
- Tools
The result is an ExecutionEnv – see ExecutionEnv below.
Config validation
Plugins can declare config_schemas(), returning a list of ConfigSchema entries. Each entry associates a section key with a JSON Schema. During resolution, every declared schema is validated against the corresponding entry in AgentSpec.sections:
- Section present – validated against the JSON Schema. Failure produces ResolveError::InvalidPluginConfig.
- Section absent – allowed. Plugins are expected to use sensible defaults.
- Section present but unclaimed – no plugin declared a schema for it. The pipeline logs a warning (possible typo in configuration).
Stage 3: Tool Pipeline
The third stage collects tools from all sources and produces the final tool set.
Tool sources
Tools are merged in this order:
1. Global tools – all tools registered in ToolRegistry via the builder (e.g., builder.with_tool("search", search_tool)).
2. Delegate agent tools – for each agent ID in AgentSpec.delegates, the pipeline creates an AgentTool. If the delegate has an endpoint (remote), the pipeline selects the configured remote backend. Today the built-in remote delegate backend is A2A; local delegates still use a resolver-backed local tool. Delegate tools require the a2a feature flag; without it, delegates are ignored with a warning.
3. Plugin-registered tools – tools declared by plugins during register(), stored in ExecutionEnv.tools.
Conflict detection
If a plugin-registered tool has the same ID as a global tool, resolution fails with ResolveError::ToolIdConflict. This is intentional – silent overwriting would be a source of hard-to-debug issues.
Filtering
After merging, the spec’s allowed_tools and excluded_tools fields are applied:
- allowed_tools = None – all tools are kept.
- allowed_tools = Some(list) – only tools whose ID appears in the list are kept. Everything else is dropped.
- excluded_tools – any tool whose ID appears in this list is removed, even if it was in the allow list.
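The filter semantics fit in a few lines. filter_tools is an illustrative stand-in for the resolve pipeline's internal filtering, not its actual signature:

```rust
/// Apply allowed_tools / excluded_tools to a merged tool-ID list.
/// excluded_tools wins even over an explicit allow-list entry.
fn filter_tools(
    tool_ids: Vec<&str>,
    allowed: Option<&[&str]>,
    excluded: &[&str],
) -> Vec<String> {
    tool_ids
        .into_iter()
        // None means "keep everything"; Some(list) is an allow-list.
        .filter(|id| allowed.map_or(true, |list| list.contains(id)))
        // Exclusion is applied last, so it overrides the allow-list.
        .filter(|id| !excluded.contains(id))
        .map(str::to_string)
        .collect()
}
```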
ExecutionEnv
ExecutionEnv is the per-resolve product of the plugin pipeline. It is not global or shared – each resolve() call builds a fresh one. Its contents:
| Field | Type | Purpose |
|---|---|---|
phase_hooks | HashMap<Phase, Vec<TaggedPhaseHook>> | Hooks invoked at each phase boundary |
scheduled_action_handlers | HashMap<String, ScheduledActionHandlerArc> | Named handlers for scheduled/deferred actions |
effect_handlers | HashMap<String, EffectHandlerArc> | Named handlers for side effects |
request_transforms | Vec<TaggedRequestTransform> | Transforms applied to inference requests before the LLM call |
key_registrations | Vec<KeyRegistration> | State keys to install into the state store at run start |
tools | HashMap<String, Arc<dyn Tool>> | Plugin-provided tools (merged into the main tool set in Stage 3) |
plugins | Vec<Arc<dyn Plugin>> | Plugin references for lifecycle hooks (on_activate/on_deactivate) |
Each TaggedPhaseHook and TaggedRequestTransform carries its owning plugin ID for diagnostics and filtering.
AgentRuntimeBuilder
The builder (AgentRuntimeBuilder) is the standard way to construct an AgentRuntime. It accumulates five registries:
| Registry | Builder method | Purpose |
|---|---|---|
MapAgentSpecRegistry | with_agent_spec() / with_agent_specs() | Agent definitions |
MapToolRegistry | with_tool() | Global tools |
MapModelRegistry | with_model() | Model ID to provider + model name mappings |
MapProviderRegistry | with_provider() | LLM executor instances |
MapPluginSource | with_plugin() | Plugin instances |
Error handling
The builder uses deferred error collection. Each with_* call that detects a conflict (duplicate ID) pushes a BuildError onto an internal error list instead of returning Result. The first collected error surfaces when build() or build_unchecked() is called.
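The deferred-collection pattern looks roughly like this (heavily simplified stand-in types; the real builder tracks five registries and a richer BuildError):

```rust
#[derive(Debug, PartialEq)]
enum BuildError {
    DuplicateId(String),
}

#[derive(Default)]
struct Builder {
    tool_ids: Vec<String>,
    errors: Vec<BuildError>,
}

impl Builder {
    /// Conflicts are recorded, not returned -- calls stay chainable.
    fn with_tool(mut self, id: &str) -> Self {
        if self.tool_ids.iter().any(|t| t == id) {
            self.errors.push(BuildError::DuplicateId(id.to_string()));
        } else {
            self.tool_ids.push(id.to_string());
        }
        self
    }

    /// The first collected error surfaces at build time.
    fn build(mut self) -> Result<Vec<String>, BuildError> {
        if self.errors.is_empty() {
            Ok(self.tool_ids)
        } else {
            Err(self.errors.remove(0))
        }
    }
}
```

The payoff is ergonomics: registration code reads as a flat chain of with_* calls, and all configuration mistakes are reported from one place.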
Validation
build() performs a dry-run resolve for every registered agent spec after constructing the runtime. If any agent fails to resolve (missing model, missing provider, missing plugin), the error is collected and returned as BuildError::ValidationFailed. This catches configuration errors at startup rather than at first request.
build_unchecked() skips this validation. Use it only when you need lazy resolution or when agents will be added dynamically after construction.
Remote agents (A2A)
When the a2a feature is enabled, the builder supports with_remote_agents() to register remote A2A endpoints. These are wrapped in a CompositeAgentSpecRegistry that combines local and remote agent sources. Remote agents are discovered asynchronously via build_and_discover().
See Also
- Architecture – system layers and request sequence
- Run Lifecycle and Phases – what happens after resolution
- Tool and Plugin Boundary – when to use tools vs plugins
- Design Tradeoffs – rationale for key decisions
State Management
Awaken provides four layers of state management, each designed for a different combination of scope, access pattern, and lifecycle. This page explains when and how to use each layer.
Overview
| Layer | Trait | Scope | Access | Lifecycle | Primary Use Case |
|---|---|---|---|---|---|
| Run State | StateKey (KeyScope::Run) | Current run only | Sync (snapshot) | Cleared at run start | Transient counters, flags, step state |
| Thread State | StateKey (KeyScope::Thread) | Same thread, cross-run | Sync (snapshot) | Auto-exported/restored across runs | Tool call state, active agent, permissions |
| Shared State | ProfileKey + StateScope | Dynamic (global, parent thread, agent type, custom) | Async (ProfileAccess) | Persistent in ProfileStore | Cross-boundary sharing, global config |
| Profile State | ProfileKey + key: &str | Per-key (agent/system) | Async (ProfileAccess) | Persistent in ProfileStore | User/agent preferences, locale |
Run State
Run state is the most transient layer. It lives entirely in memory, is accessed synchronously through a Snapshot, and is cleared to its default value when a new run begins.
Writes happen through MutationBatch, which collects updates produced by phase hooks. When multiple hooks run in parallel, the runtime uses MergeStrategy to determine whether concurrent writes to the same key can be safely merged (Commutative) or must be serialized (Exclusive). This makes run state the only layer that participates in the transactional merge protocol.
Typical examples include RunLifecycle (tracking the current run phase), PendingWorkKey (counting outstanding async work), and ContextThrottleState (rate-limiting context injection).
When to use
- Per-inference temporary state that does not need to survive beyond the current run
- State that must participate in parallel merge (MutationBatch with MergeStrategy)
- Counters, flags, and step-tracking metadata
Example
struct StepCounter;
impl StateKey for StepCounter {
const KEY: &'static str = "step_counter";
type Value = usize;
type Update = usize;
fn apply(value: &mut Self::Value, update: Self::Update) {
*value += update;
}
}
// Register in plugin
r.register_key::<StepCounter>(StateKeyOptions::default())?;
// Read via snapshot
let count = ctx.snapshot.get::<StepCounter>().copied().unwrap_or(0);
// Write via MutationBatch
cmd.update::<StepCounter>(1);
Thread State
Thread state shares the same access model as run state – sync reads via Snapshot, transactional writes via MutationBatch, merge-safe through MergeStrategy. The difference is lifecycle: thread-scoped keys persist across runs on the same thread.
The runtime handles this transparently. At the end of a run, thread-scoped keys are exported (serialized). When the next run starts on the same thread, they are restored to their previous values instead of being reset to defaults. From the hook author’s perspective, the key simply “remembers” its value between runs.
Typical examples include ToolCallStates (for resuming suspended tool calls across runs) and ActiveAgentKey (for persisting agent handoff state).
When to use
- State that must survive across runs on the same thread
- State that needs sync access and transactional merge guarantees within each run
- State whose lifecycle should be managed automatically by the runtime
Example
r.register_key::<ToolCallStates>(StateKeyOptions {
scope: KeyScope::Thread,
persistent: true,
..StateKeyOptions::default()
})?;
Shared State
Shared state is a persistent, async layer built on the ProfileStore backend. It is designed for data that must cross thread and agent boundaries – something neither run state nor thread state can do.
Shared state uses ProfileKey to bind a compile-time namespace to a value type, and a key: &str parameter to identify the runtime instance. Together, (ProfileKey::KEY, key) uniquely identifies a shared state entry. Different agents and threads can read and write the same entry if they use the same key string. The same ProfileAccess methods (read, write, delete) serve both shared and profile state — they all take key: &str.
Because shared state bypasses the snapshot/mutation-batch workflow, it does not participate in transactional merge. Concurrent writes follow last-write-wins semantics. Access is async through ProfileAccess, available in PhaseContext.
StateScope – convenience key builder
| Constructor | Key String | Scenario |
|---|---|---|
StateScope::global() | "global" | All agents share one instance |
StateScope::parent_thread(id) | "parent_thread::{id}" | Parent and child agents share within a delegation tree |
StateScope::agent_type(name) | "agent_type::{name}" | All instances of an agent type share |
StateScope::thread(id) | "thread::{id}" | Thread-local persistent state |
StateScope::new(s) | "{s}" | Arbitrary grouping (tenant, region, etc.) |
The key is a plain &str – fully extensible without code changes. StateScope is an optional convenience; any raw string works.
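A minimal implementation consistent with the documented key formats (a sketch for illustration, not the crate's actual code):

```rust
/// Thin key-string builder mirroring the documented StateScope formats.
struct StateScope(String);

impl StateScope {
    fn global() -> Self {
        Self("global".to_string())
    }
    fn parent_thread(id: &str) -> Self {
        Self(format!("parent_thread::{id}"))
    }
    fn agent_type(name: &str) -> Self {
        Self(format!("agent_type::{name}"))
    }
    fn thread(id: &str) -> Self {
        Self(format!("thread::{id}"))
    }
    /// Escape hatch: any raw string is a valid scope key.
    fn new(s: &str) -> Self {
        Self(s.to_string())
    }
    fn as_str(&self) -> &str {
        &self.0
    }
}
```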
When to use
- State shared across thread boundaries
- State shared across agent boundaries
- Dynamic scoping that cannot be determined at compile time
- Data that serves as database-like indexed storage
Example
use awaken_contract::ProfileKey;
struct TeamContextKey;
impl ProfileKey for TeamContextKey {
const KEY: &'static str = "team_context";
type Value = TeamData;
}
// In a hook -- share context with child agents
let scope = StateScope::parent_thread(&ctx.run_identity.parent_thread_id.unwrap());
let mut team = access.read::<TeamContextKey>(scope.as_str()).await?;
team.goals.push("new goal".into());
access.write::<TeamContextKey>(scope.as_str(), &team).await?;
Profile State
Profile state is a persistent, async layer for per-entity preferences. Like shared state, it uses the ProfileStore backend and is accessed through ProfileAccess. The difference is the key convention: instead of a StateScope string, profile state typically uses an agent name or "system" as the key.
A ProfileKey binds a static namespace string to a value type. The key parameter identifies which agent or system entity the data belongs to.
When to use
- Per-agent persistent preferences (locale, display name, custom settings)
- System-level configuration shared across all agents
- Data belonging to a specific agent identity rather than a dynamic group
Example
struct Locale;
impl ProfileKey for Locale {
    const KEY: &'static str = "locale";
    type Value = String;
}
// Read alice's locale; write the system-wide default.
let locale = access.read::<Locale>("alice").await?;
access.write::<Locale>("system", &"en-US".to_string()).await?;
Decision Guide
Need state during a single run?
+-- Yes, sync + transactional --> Run State (StateKey, KeyScope::Run)
+-- No, needs to persist
    +-- Same thread only, sync + transactional --> Thread State (StateKey, KeyScope::Thread)
    +-- Cross-boundary, dynamic key --> Shared State (ProfileKey + StateScope)
    +-- Per-agent/user preference --> Profile State (ProfileKey + agent/system key)
Comparison: Shared State vs Thread State
Both persist data across runs. The key differences:
| Aspect | Thread State | Shared State |
|---|---|---|
| Access | Sync (snapshot) | Async (ProfileAccess) |
| Scope | Fixed to current thread | Dynamic (any string) |
| Merge safety | MutationBatch + strategy | Last-write-wins |
| Cross-boundary | No | Yes |
| Lifecycle | Auto export/restore | Always persistent |
Use Thread State when you need sync access and transactional guarantees within a run. Use Shared State when you need cross-boundary sharing or dynamic scoping.
See Also
- State and Snapshot Model – internal architecture
- State Keys – API reference
- Use Shared State – practical how-to
State and Snapshot Model
Awaken uses a typed state engine with snapshot isolation. This page explains the state primitives, scoping rules, merge strategies, and the mutation lifecycle.
StateKey Trait
Every piece of runtime state is declared as a type implementing StateKey:
pub trait StateKey: 'static + Send + Sync {
const KEY: &'static str;
const MERGE: MergeStrategy = MergeStrategy::Exclusive;
const SCOPE: KeyScope = KeyScope::Run;
type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
type Update: Send + 'static;
fn apply(value: &mut Self::Value, update: Self::Update);
}
A StateKey is a zero-sized type that associates a string key, a value type, an update type, and a merge function. The apply method defines how an update transforms the current value. Serialization is handled through encode/decode with JSON as the interchange format.
Plugins register their state keys through PluginRegistrar::register_key::<K>() during the Plugin::register callback.
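To make the trait concrete, here is a self-contained sketch: minimal stand-ins for the contract types plus a hypothetical commutative counter key. The real trait also carries serde bounds and encode/decode handling, omitted here for brevity:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum MergeStrategy { Exclusive, Commutative }

#[derive(Clone, Copy, PartialEq, Debug)]
enum KeyScope { Run, Thread }

// Simplified stand-in for awaken-contract's StateKey (serde bounds omitted).
trait StateKey: 'static {
    const KEY: &'static str;
    const MERGE: MergeStrategy = MergeStrategy::Exclusive;
    const SCOPE: KeyScope = KeyScope::Run;
    type Value: Clone + Default;
    type Update;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

// Hypothetical run-scoped counter. Commutative: updates add, so any
// application order yields the same total.
struct ToolCallCount;

impl StateKey for ToolCallCount {
    const KEY: &'static str = "tool_call_count";
    const MERGE: MergeStrategy = MergeStrategy::Commutative;
    type Value = u64;
    type Update = u64; // amount to add

    fn apply(value: &mut u64, update: u64) {
        *value += update;
    }
}
```

Because `apply` is pure addition, two hooks incrementing the counter in the same phase can have their batches merged regardless of order — exactly what `MergeStrategy::Commutative` promises.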
KeyScope
pub enum KeyScope {
Run, // Cleared at run start (default)
Thread, // Persists across runs on the same thread
}
Run-scoped keys are reset to their default value when a new run begins. Use this for per-run counters, step state, and transient execution metadata.
Thread-scoped keys survive across runs on the same thread. Use this for conversation memory, accumulated context, and persistent agent preferences.
MergeStrategy
pub enum MergeStrategy {
Exclusive, // Conflict on concurrent writes (default)
Commutative, // Order-independent updates, safe to merge
}
When multiple hooks run in the same phase and produce MutationBatch values that touch the same key:
- Exclusive – the batches cannot be merged. The runtime detects the conflict and falls back to sequential execution. This is the safe default for keys where write order matters.
- Commutative – the update operations can be applied in any order and produce the same result. The runtime concatenates the operations from both batches. Use this for append-only logs, counters, and set unions.
Snapshot
A Snapshot is an immutable view of the state at a point in time. Phase hooks receive a snapshot reference and can read any registered key’s value. They cannot mutate the snapshot directly.
Phase hook receives: &Snapshot
Phase hook writes to: MutationBatch
This separation guarantees that hooks within the same phase see identical state regardless of execution order.
MutationBatch
A MutationBatch collects state updates produced by a single hook invocation:
pub struct MutationBatch {
base_revision: Option<u64>,
ops: Vec<Box<dyn MutationOp>>,
touched_keys: Vec<String>,
}
Each operation in ops is a type-erased KeyPatch<K> that carries the K::Update value. When the batch is applied, each operation calls K::apply(value, update) on the target key in the state map.
The touched_keys list enables conflict detection for Exclusive keys during parallel merge.
Mutation Lifecycle
1. Phase starts
2. Runtime takes a Snapshot (immutable clone)
3. Each hook reads from the Snapshot, produces a MutationBatch
4. All hooks complete (phase convergence)
5. Runtime checks MutationBatch key overlap:
- Exclusive keys overlap -> conflict (sequential fallback)
- Commutative keys overlap -> ops concatenated
- No overlap -> batches merged
6. Merged batch applied atomically to the live state
7. New Snapshot taken for the next phase
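Step 5 can be sketched as a pure function over the two batches' touched keys. The names here are assumptions for illustration; the real runtime works on `MutationBatch` internals:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq)]
enum MergeStrategy { Exclusive, Commutative }

#[derive(Debug, PartialEq)]
enum MergeOutcome { Merged, SequentialFallback }

// Decide whether two hooks' batches can be merged for atomic application.
fn merge_decision(
    a_touched: &HashSet<String>,
    b_touched: &HashSet<String>,
    strategies: &HashMap<String, MergeStrategy>,
) -> MergeOutcome {
    for key in a_touched.intersection(b_touched) {
        // Overlap on an Exclusive key: conflict → sequential fallback.
        if strategies.get(key) == Some(&MergeStrategy::Exclusive) {
            return MergeOutcome::SequentialFallback;
        }
        // Commutative overlap is fine: ops are simply concatenated.
    }
    MergeOutcome::Merged
}
```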
StateMap
StateMap is the runtime’s typed state container. It uses typedmap::TypedMap internally, keyed by zero-sized TypedKey<K> wrappers that hash and compare by TypeId. The get::<K>() and get_or_insert_default::<K>() methods retrieve the concrete K::Value type directly without downcasting.
State keys registered with persistent: true in StateKeyOptions are serialized during checkpoint and restored on thread load. Non-persistent keys exist only in memory for the duration of the run.
StateStore
StateStore wraps the StateMap and provides:
- Snapshot creation (cheap `Arc` clone of the inner map)
- Batch application with revision tracking
- Commit hooks (`CommitHook`) that fire after each successful state mutation
- `StateCommand` processing for programmatic state operations
Shared State
For state that must be shared across thread and agent boundaries, awaken provides a separate persistent layer built on the same ProfileStore backend used by profile data.
ProfileKey for shared state
Shared state uses the same ProfileKey trait as profile data. The KEY constant serves as the namespace, and the key: &str parameter passed to ProfileAccess methods identifies the runtime instance:
pub trait ProfileKey: 'static + Send + Sync {
const KEY: &'static str;
type Value: Clone + Default + Serialize + DeserializeOwned + Send + Sync + 'static;
}
Unlike StateKey, shared state does not participate in the MutationBatch / snapshot workflow – it is accessed asynchronously through ProfileAccess with a key: &str parameter.
StateScope
pub struct StateScope(String);
An optional convenience builder for common key string patterns. Constructors: global(), parent_thread(id), agent_type(name), thread(id), new(arbitrary). Call .as_str() to get the key string for ProfileAccess methods. Users can also pass any raw &str directly.
State Management Overview
| Layer | Scope | Access | Lifecycle | Use Case |
|---|---|---|---|---|
| Run State | Current run | Sync snapshot | Cleared at run start | Transient flags, counters |
| Thread State | Same thread | Sync snapshot | Export/restore across runs | Tool call state, active agent |
| Shared State | ProfileKey + StateScope | Async | Persistent | Cross-boundary sharing |
| Profile State | ProfileKey + key: &str | Async | Persistent | User/agent preferences |
Choose shared state when you need dynamic scoping or cross-boundary access. Choose StateKey when you need sync access or transactional merge during parallel tool execution.
Run Lifecycle and Phases
This page describes the state machines that govern run execution and tool call processing, the phase enum, termination conditions, and checkpoint triggers.
RunStatus
A run’s coarse lifecycle is captured by RunStatus:
Running --+--> Waiting --+--> Running (resume)
          |              |
          +--> Done      +--> Done
pub enum RunStatus {
Running, // Actively executing (default)
Waiting, // Paused, waiting for external decisions
Done, // Terminal -- cannot transition further
}
- `Running -> Waiting`: a tool call suspends; the run pauses for external input.
- `Waiting -> Running`: decisions arrive; the run resumes.
- `Running -> Done` or `Waiting -> Done`: terminal transition on completion, cancellation, or error.
- `Done -> *`: not allowed. Terminal state.
ToolCallStatus
Each tool call in a run has its own lifecycle:
New --> Running --+--> Succeeded (terminal)
                  +--> Failed (terminal)
                  +--> Cancelled (terminal)
                  +--> Suspended --> Resuming --+--> Running
                                                +--> Suspended (re-suspend)
                                                +--> Succeeded/Failed/Cancelled
pub enum ToolCallStatus {
New, // Created, not yet executing
Running, // Currently executing
Suspended, // Waiting for external decision
Resuming, // Decision received, about to re-execute
Succeeded, // Completed successfully (terminal)
Failed, // Completed with error (terminal)
Cancelled, // Cancelled externally (terminal)
}
Key transitions:
- `Suspended` can only move to `Resuming` or `Cancelled` – it cannot jump directly to `Running` or a success/failure state.
- `Resuming` has wide transitions: it can re-enter `Running`, re-suspend, or reach any terminal state.
- Terminal states (`Succeeded`, `Failed`, `Cancelled`) cannot transition to any non-self state.
Phase Enum
The Phase enum defines the eight execution phases in order:
pub enum Phase {
RunStart,
StepStart,
BeforeInference,
AfterInference,
BeforeToolExecute,
AfterToolExecute,
StepEnd,
RunEnd,
}
RunStart – fires once at the beginning of a run. Plugins initialize run-scoped state.
StepStart – fires at the beginning of each inference round. Step counter increments.
BeforeInference – last chance to modify the inference request (system prompt, tools, parameters). Plugins can skip inference by setting a behavior flag.
AfterInference – fires after the LLM response arrives. Plugins can inspect the response, modify tool call lists, or request termination.
BeforeToolExecute – fires before each tool call batch. Permission checks, interception, and suspension happen here.
AfterToolExecute – fires after tool results are available. Plugins can inspect results and trigger side effects.
StepEnd – fires at the end of each inference round. Checkpoint persistence happens here. Stop conditions (max rounds, token budget, loop detection) are evaluated.
RunEnd – fires once when the run terminates, regardless of reason. Cleanup and final state persistence.
TerminationReason
When a run ends, the TerminationReason records why:
pub enum TerminationReason {
NaturalEnd, // LLM returned no tool calls
BehaviorRequested, // A plugin requested inference skip
Stopped(StoppedReason), // A stop condition fired (code + optional detail)
Cancelled, // External cancellation signal
Blocked(String), // Permission checker blocked the run
Suspended, // Waiting for external tool-call resolution
Error(String), // Error path
}
TerminationReason::to_run_status() maps each variant to the appropriate RunStatus:
- `Suspended` maps to `RunStatus::Waiting` (the run can resume).
- All other variants map to `RunStatus::Done`.
Stop Conditions
Declarative stop conditions are configured per agent via StopConditionSpec:
| Variant | Trigger |
|---|---|
| `MaxRounds { rounds }` | Step count exceeds limit |
| `Timeout { seconds }` | Wall-clock time exceeds limit |
| `TokenBudget { max_total }` | Cumulative token usage exceeds budget |
| `ConsecutiveErrors { max }` | Sequential tool errors exceed threshold |
| `StopOnTool { tool_name }` | A specific tool is called |
| `ContentMatch { pattern }` | LLM output matches a regex pattern |
| `LoopDetection { window }` | Repeated identical tool calls within a sliding window |
Stop conditions are evaluated at StepEnd. When one fires, the run terminates with TerminationReason::Stopped.
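As an illustration, a hypothetical agent-spec fragment combining several conditions. The `stop_conditions` field name and the snake_case variant tags are assumptions about the serde representation, mirroring the variant names above:

```json
{
  "id": "bounded-agent",
  "model": "gpt-4o",
  "system_prompt": "You are a careful assistant.",
  "stop_conditions": [
    { "type": "max_rounds", "rounds": 10 },
    { "type": "token_budget", "max_total": 50000 },
    { "type": "loop_detection", "window": 4 }
  ]
}
```

Whichever condition fires first at StepEnd ends the run with `TerminationReason::Stopped`.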
Checkpoint Triggers
State is persisted at StepEnd after each inference round. The checkpoint includes:
- Thread messages (append-only)
- Run lifecycle state (`RunStatus`, step count, termination reason)
- Persistent state keys (those registered with `persistent: true`)
- Tool call states for suspended calls
Checkpoints enable resume from the last completed step after a crash or intentional suspension.
RunStatus Derived from ToolCall States
A run’s status is a projection of all its tool call states. Each tool call has an independent lifecycle; the run status is the aggregate:
fn derive_run_status(calls: &HashMap<String, ToolCallState>) -> RunStatus {
let mut has_suspended = false;
for state in calls.values() {
match state.status {
// Any Running or Resuming call → Run is still executing
ToolCallStatus::Running | ToolCallStatus::Resuming => {
return RunStatus::Running;
}
ToolCallStatus::Suspended => {
has_suspended = true;
}
// Succeeded / Failed / Cancelled are terminal — keep checking
_ => {}
}
}
if has_suspended {
RunStatus::Waiting // No executing calls, but some await decisions
} else {
RunStatus::Done // All calls in terminal state → step complete
}
}
Decision table:
| Any Running/Resuming? | Any Suspended? | Run Status | Meaning |
|---|---|---|---|
| Yes | — | Running | Tools are actively executing |
| No | Yes | Waiting | All execution done, awaiting external decisions |
| No | No | Done | All calls terminal → proceed to next step |
Parallel tool call state timeline
When an LLM returns multiple tool calls (e.g. [tool_A, tool_B, tool_C]), their
states evolve independently:
| Time | tool_A (approval-req) | tool_B (approval-req) | tool_C (normal) | Run Status | Note |
|---|---|---|---|---|---|
| t0 | Created | Created | Created | Running | Step starts |
| t1 | Suspended | Created | Running | Running | tool_A intercepted |
| t2 | Suspended | Suspended | Running | Running | tool_B intercepted, tool_C executing |
| t3 | Suspended | Suspended | Succeeded | Waiting | tool_C done, no Running calls |
| t4 | Resuming | Suspended | Succeeded | Running | tool_A decision arrives |
| t5 | Succeeded | Suspended | Succeeded | Waiting | tool_A replay done |
| t6 | Succeeded | Resuming | Succeeded | Running | tool_B decision arrives |
| t7 | Succeeded | Succeeded | Succeeded | Done | All terminal → next step |
At every transition the run status is re-derived from the aggregate of all call
states. This means a single decision arriving does not end the wait — the run
stays in Waiting until all suspended calls are resolved.
Suspension Bridges Run and Tool-Call Layers
Current execution model (serial phases)
Tool execution is split into two serial phases inside
execute_tools_with_interception:
Phase 1 — Intercept (serial, per-call):
for each call:
BeforeToolExecute hooks → check for intercept actions
Suspend? → mark Suspended, set suspended=true, continue
Block? → mark Failed, return immediately
SetResult → mark with provided result, continue
None → add to allowed_calls
Phase 2 — Execute (allowed_calls only):
Sequential mode: one by one, break on first suspension
Parallel mode: batch execute, collect all results
After both phases, if suspended == true, the step returns
StepOutcome::Suspended. The orchestrator then:
- Persists checkpoint (messages, tool call states)
- Emits
RunFinish(Suspended)to protocol encoders - Enters
wait_for_resume_or_cancelloop
wait_for_resume_or_cancel loop
loop {
let decisions = decision_rx.next().await; // block until decisions arrive
emit_decision_events_and_messages(decisions);
prepare_resume(decisions); // Suspended → Resuming
detect_and_replay_resume(); // re-execute Resuming calls
if !has_suspended_calls() {
return WaitOutcome::Resumed; // all resolved → exit wait
}
// Some calls still Suspended → continue waiting
}
Key properties:
- The loop handles partial resume: if only tool_A’s decision arrives but tool_B is still suspended, tool_A is replayed immediately and the loop continues waiting for tool_B.
- Decisions can arrive in batches or one at a time.
- On
WaitOutcome::Resumed, the orchestrator re-enters the step loop for the next LLM inference round.
Resume replay
detect_and_replay_resume scans ToolCallStates for calls with
status == Resuming and re-executes them through the standard tool pipeline.
The arguments field already reflects the resume mode (set by
prepare_resume):
| Resume Mode | Arguments on Replay | Behavior |
|---|---|---|
| `ReplayToolCall` | Original arguments | Full re-execution |
| `UseDecisionAsToolResult` | Decision result | FrontendToolPlugin intercepts in BeforeToolExecute, returns SetResult |
| `PassDecisionToTool` | Decision result | Tool receives decision as arguments |
Already-completed calls (Succeeded, Failed, Cancelled) are skipped.
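The table reduces to a small decision over the resume mode. A hedged sketch, with `&str` standing in for the real JSON argument values:

```rust
#[derive(Clone, Copy)]
enum ToolCallResumeMode {
    ReplayToolCall,
    UseDecisionAsToolResult,
    PassDecisionToTool,
}

// What prepare_resume leaves in the arguments field before replay
// (simplified: &str stands in for the real JSON Value).
fn replay_arguments<'a>(
    mode: ToolCallResumeMode,
    original: &'a str,
    decision: &'a str,
) -> &'a str {
    match mode {
        // Full re-execution with the original arguments.
        ToolCallResumeMode::ReplayToolCall => original,
        // Decision payload carried in arguments; FrontendToolPlugin
        // intercepts in BeforeToolExecute and returns SetResult.
        ToolCallResumeMode::UseDecisionAsToolResult => decision,
        // Tool re-executes with the decision as its new arguments.
        ToolCallResumeMode::PassDecisionToTool => decision,
    }
}
```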
Limitation: decisions during execution
In the current serial model, decisions that arrive while Phase 2 tools are
still executing sit in the channel buffer. They are only consumed when the step
finishes and the orchestrator enters wait_for_resume_or_cancel.
This means:
- tool_A’s approval arrives at t2 (while tool_C is executing)
- tool_A is not replayed until t3 (after tool_C finishes)
- The delay equals the remaining execution time of Phase 2 tools
Concurrent Execution Model (future)
The ideal model executes suspended-tool waits and allowed-tool execution in parallel, so a decision for tool_A can trigger immediate replay even while tool_C is still running.
Architecture
Phase 1 — Intercept (same as current)
Phase 2 — Concurrent execution:
┌─ task: execute(tool_C) ─────────────────────────────────┐
├─ task: execute(tool_D) ─────────────────────────────────┤
├─ task: wait_decision(tool_A) → replay(tool_A) ──────────┤
├─ task: wait_decision(tool_B) ─────────→ replay(tool_B) ─┤
└─ barrier: all tasks reach terminal state ───────────────┘
Per-call decision routing
The shared decision_rx channel carries batches of decisions for multiple
tool calls. A dispatcher task demuxes decisions to per-call notification
channels:
struct ToolCallWaiter {
waiters: HashMap<String, oneshot::Sender<ToolCallResume>>,
}
impl ToolCallWaiter {
async fn dispatch_loop(&mut self, decision_rx: &mut UnboundedReceiver<DecisionBatch>) {
while let Some(batch) = decision_rx.next().await {
for (call_id, resume) in batch {
if let Some(tx) = self.waiters.remove(&call_id) {
let _ = tx.send(resume);
}
}
if self.waiters.is_empty() { break; }
}
}
}
Each suspended tool call gets a oneshot::Receiver. When its decision arrives,
the receiver wakes the task, which runs the replay immediately — concurrently
with any still-executing allowed tools.
State transition timing
With the concurrent model, state transitions happen as events occur rather than in batches:
t0: tool_C starts executing → RunStatus: Running
t1: tool_A decision arrives, replay → RunStatus: Running (tool_A Resuming)
t2: tool_A replay completes → RunStatus: Running (tool_C still Running)
t3: tool_C completes → RunStatus: Waiting (tool_B still Suspended)
t4: tool_B decision arrives, replay → RunStatus: Running (tool_B Resuming)
t5: tool_B replay completes → RunStatus: Done (all terminal)
No artificial delay — each tool call progresses as fast as its external dependency allows.
Protocol Adapter: SSE Reconnection
A backend run may span multiple frontend SSE connections. This is especially relevant for the AI SDK v6 protocol, where each HTTP request corresponds to one “turn” and produces one SSE stream.
Problem
Turn 1 (user message):
HTTP POST → SSE stream 1 → events flow → tool suspends
→ RunFinish(Suspended) → finish event → SSE stream 1 closes
The run is still alive in wait_for_resume_or_cancel.
But the event_tx channel from SSE stream 1 has been dropped.
Turn 2 (tool output / approval):
HTTP POST → SSE stream 2 → decision delivered to orchestrator
→ orchestrator resumes → emits events
→ events go to... the dropped event_tx? Lost.
Solution: ReconnectableEventSink
Replace the fixed ChannelEventSink with a reconnectable wrapper that allows
swapping the underlying channel sender:
struct ReconnectableEventSink {
inner: Arc<tokio::sync::Mutex<mpsc::UnboundedSender<AgentEvent>>>,
}
impl ReconnectableEventSink {
fn new(tx: mpsc::UnboundedSender<AgentEvent>) -> Self {
Self { inner: Arc::new(tokio::sync::Mutex::new(tx)) }
}
/// Replace the underlying channel. Called when a new SSE connection
/// arrives for an existing suspended run.
async fn reconnect(&self, new_tx: mpsc::UnboundedSender<AgentEvent>) {
*self.inner.lock().await = new_tx;
}
}
#[async_trait]
impl EventSink for ReconnectableEventSink {
async fn emit(&self, event: AgentEvent) {
let _ = self.inner.lock().await.send(event);
}
}
Reconnection flow
Turn 1:
submit() → create (event_tx1, event_rx1)
→ ReconnectableEventSink(event_tx1)
→ spawn_execution (run starts)
events → event_tx1 → event_rx1 → SSE stream 1
tool suspends → finish(tool-calls) → SSE stream 1 closes
event_tx1 still held by ReconnectableEventSink (sends fail silently)
run alive in wait_for_resume_or_cancel
Turn 2:
new HTTP request with decisions arrives
create (event_tx2, event_rx2)
sink.reconnect(event_tx2) ← swap channel
send_decision → decision channel → orchestrator resumes
events → ReconnectableEventSink → event_tx2 → event_rx2 → SSE stream 2
run completes → RunFinish(NaturalEnd) → SSE stream 2 closes
No events are lost between SSE connections because:
- During suspend, the orchestrator is blocked in `wait_for_resume_or_cancel` and emits no events.
- `reconnect()` completes before `send_decision()`, so the first resume event (`RunStart`) goes to the new channel.
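The channel-swap semantics can be demonstrated with a simplified synchronous analogue using std channels. The real sink is async over tokio; this stand-in only shows why sends fail silently on a closed stream and succeed after reconnect:

```rust
use std::sync::{mpsc, Arc, Mutex};

// Synchronous stand-in for ReconnectableEventSink (the real one is async).
struct ReconnectableSink {
    inner: Arc<Mutex<mpsc::Sender<String>>>,
}

impl ReconnectableSink {
    fn new(tx: mpsc::Sender<String>) -> Self {
        Self { inner: Arc::new(Mutex::new(tx)) }
    }

    /// Swap the underlying channel when a new SSE connection arrives.
    fn reconnect(&self, new_tx: mpsc::Sender<String>) {
        *self.inner.lock().unwrap() = new_tx;
    }

    /// Sends to a dropped receiver fail silently, like the suspended turn.
    fn emit(&self, event: &str) {
        let _ = self.inner.lock().unwrap().send(event.to_string());
    }
}
```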
Protocol-specific behavior
| Protocol | Suspend Signal | Resume Mechanism |
|---|---|---|
| AI SDK v6 | finish(finishReason: "tool-calls") | New HTTP POST → reconnect → send_decision |
| AG-UI | RUN_FINISHED(outcome: "interrupt") | New HTTP POST → reconnect → send_decision |
| CopilotKit | renderAndWaitForResponse UI | Same SSE or new request via AG-UI |
See Also
- HITL and Mailbox – suspension, resume, and decision handling
- Tool Execution Modes – Sequential vs Parallel execution
- State and Snapshot Model – how state is read and written during phases
- Architecture – three-layer overview
- Cancellation – auto-cancellation on new message
HITL and Mailbox
This page explains how Awaken handles human-in-the-loop (HITL) interactions through tool call suspension and the mailbox queue that manages run requests.
SuspendTicket
When a tool call needs external approval or input, it produces a SuspendTicket:
pub struct SuspendTicket {
pub suspension: Suspension,
pub pending: PendingToolCall,
pub resume_mode: ToolCallResumeMode,
}
suspension – the external-facing payload describing what input is needed:
pub struct Suspension {
pub id: String, // Unique suspension ID
pub action: String, // Action identifier (e.g., "confirm", "approve")
pub message: String, // Human-readable prompt
pub parameters: Value, // Action-specific parameters
pub response_schema: Option<Value>, // JSON Schema for expected response
}
pending – the tool call projection emitted to the event stream so the frontend knows which call is waiting:
pub struct PendingToolCall {
pub id: String, // Tool call ID
pub name: String, // Tool name
pub arguments: Value, // Original arguments
}
resume_mode – how the agent loop should handle the decision when it arrives.
ToolCallResumeMode
pub enum ToolCallResumeMode {
ReplayToolCall, // Re-execute the original tool call
UseDecisionAsToolResult, // Use the decision payload as the tool result directly
PassDecisionToTool, // Pass the decision payload into the tool as new arguments
}
ReplayToolCall is the default. The original tool call is re-executed after the decision arrives. Use this when the decision unlocks execution (e.g., permission granted, now run the tool).
UseDecisionAsToolResult skips re-execution entirely. The external decision payload becomes the tool result. Use this when a human provides the answer directly (e.g., “the correct value is X”).
PassDecisionToTool re-executes the tool but injects the decision payload into the arguments. Use this when the decision modifies how the tool should run (e.g., “use this alternative path instead”).
ResumeDecisionAction
pub enum ResumeDecisionAction {
Resume, // Proceed with the tool call
Cancel, // Cancel the tool call
}
Each decision carries an action. Resume continues execution according to the ToolCallResumeMode. Cancel transitions the tool call to ToolCallStatus::Cancelled.
ToolCallResume
The full resume payload:
pub struct ToolCallResume {
pub decision_id: String, // Idempotency key
pub action: ResumeDecisionAction,
pub result: Value, // Decision payload
pub reason: Option<String>, // Human-readable reason
pub updated_at: u64, // Unix millis timestamp
}
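For illustration, a hypothetical wire payload a frontend might submit for an approval. All values are made up, and the exact field casing and enum tag serialization are assumptions:

```json
{
  "decision_id": "dec-123",
  "action": "resume",
  "result": { "approved": true },
  "reason": "Reviewed and approved by operator",
  "updated_at": 1718000000000
}
```

The `decision_id` acts as an idempotency key: redelivering the same decision must not replay the tool call twice.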
Permission Plugin and Ask-Mode
The awaken-ext-permission plugin uses suspension to implement ask-mode approvals:
1. A tool call matches a permission rule with `behavior: ask`.
2. The permission checker creates a `SuspendTicket` with `ToolCallResumeMode::ReplayToolCall`.
3. The suspension payload describes the tool and its arguments.
4. The tool call transitions to `ToolCallStatus::Suspended`.
5. The run transitions to `RunStatus::Waiting`.
6. A frontend presents the approval prompt to the user.
7. The user submits a `ToolCallResume` with `ResumeDecisionAction::Resume` or `Cancel`.
8. On `Resume`, the tool call replays and executes normally.
9. On `Cancel`, the tool call is cancelled and the run may continue with remaining calls.
Mailbox Architecture
The mailbox provides a persistent queue for all run requests. Every run – streaming, background, A2A, internal – enters the system as a MailboxJob.
MailboxJob
pub struct MailboxJob {
// Identity
pub job_id: String, // UUID v7
pub mailbox_id: String, // Thread ID (routing anchor)
// Request payload
pub agent_id: String,
pub messages: Vec<Message>,
pub origin: MailboxJobOrigin,
pub sender_id: Option<String>,
pub parent_run_id: Option<String>,
pub request_extras: Option<Value>,
// Queue semantics
pub priority: u8, // 0 = highest, 255 = lowest, default 128
pub dedupe_key: Option<String>,
pub generation: u64,
// Lifecycle
pub status: MailboxJobStatus,
pub available_at: u64,
pub attempt_count: u32,
pub max_attempts: u32,
pub last_error: Option<String>,
// Lease
pub claim_token: Option<String>,
pub claimed_by: Option<String>,
pub lease_until: Option<u64>,
// Timestamps
pub created_at: u64,
pub updated_at: u64,
}
MailboxJobStatus
Queued --claim--> Claimed --ack--> Accepted (terminal)
  |                  |
  |                  +-- nack(retry) --> Queued (attempt_count++, available_at = retry_at)
  |                  |
  |                  +-- nack(permanent) --> DeadLetter (terminal)
  |
  +-- cancel --> Cancelled (terminal)
  +-- interrupt(generation bump) --> Superseded (terminal)
pub enum MailboxJobStatus {
Queued, // Waiting to be claimed
Claimed, // Claimed by a consumer, executing
Accepted, // Successfully processed (terminal)
Cancelled, // Cancelled by caller (terminal)
Superseded, // Replaced by a newer generation (terminal)
DeadLetter, // Permanently failed (terminal)
}
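The `nack(retry)` path from the diagram above can be sketched as a small pure function. Field and function names mirror the job struct but are simplified assumptions, not the store's actual API:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum MailboxJobStatus { Queued, Claimed, Accepted, Cancelled, Superseded, DeadLetter }

// Subset of MailboxJob relevant to retry handling.
struct Job {
    status: MailboxJobStatus,
    attempt_count: u32,
    max_attempts: u32,
    available_at: u64, // Unix millis; job is not claimable before this
}

// nack(retry): bump the attempt counter, requeue with a backoff,
// or dead-letter once max_attempts is exhausted.
fn nack_retry(job: &mut Job, now_ms: u64, backoff_ms: u64) {
    job.attempt_count += 1;
    if job.attempt_count >= job.max_attempts {
        job.status = MailboxJobStatus::DeadLetter; // permanently failed
    } else {
        job.status = MailboxJobStatus::Queued;
        job.available_at = now_ms + backoff_ms; // retry_at
    }
}
```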
MailboxJobOrigin
pub enum MailboxJobOrigin {
User, // HTTP API, SDK
A2A, // Agent-to-Agent protocol
Internal, // Child run notification, handoff
}
MailboxStore Trait
MailboxStore defines the persistent queue interface:
- enqueue – persist a job, auto-assign generation, reject duplicate `dedupe_key`
- claim – atomically claim up to N `Queued` jobs for a mailbox (lease-based)
- claim_job – claim a specific job by ID (for inline streaming)
- ack – mark a job as `Accepted` (validates claim token)
- nack – return a job to `Queued` for retry, or `DeadLetter` if max attempts exceeded
- cancel – cancel a `Queued` job
- extend_lease – heartbeat to extend an active claim
- interrupt – atomically bump generation, supersede stale `Queued` jobs, return the active `Claimed` job for cancellation
Implementations must guarantee: durable enqueue, atomic claim (exactly one winner), claim token validation on ack/nack, and atomic interrupt with generation bump.
MailboxInterrupt
When a new high-priority request arrives for a thread that already has queued or running work:
pub struct MailboxInterrupt {
pub new_generation: u64, // Generation after bump
pub active_job: Option<MailboxJob>, // Currently running job to cancel
pub superseded_count: usize, // Queued jobs superseded
}
The caller cancels the active_job’s runtime run if present, ensuring the new request takes priority.
See Also
- Run Lifecycle and Phases – how suspension bridges run and tool-call layers
- Enable Tool Permission HITL – practical setup guide
Multi-Agent Patterns
Awaken supports multiple patterns for composing agents. This page describes delegation, remote agents, sub-agent execution, and handoff.
Agent Delegation via AgentSpec.delegates
An agent can declare sub-agents it is allowed to delegate to:
{
"id": "orchestrator",
"model": "gpt-4o",
"system_prompt": "You coordinate tasks across specialized agents.",
"delegates": ["researcher", "writer", "reviewer"]
}
Each ID in delegates must be a registered agent in the AgentSpecRegistry. During resolution, the runtime creates an AgentTool for each delegate. From the LLM’s perspective, each sub-agent appears as a regular tool named agent_run_{delegate_id}.
When the LLM calls a delegate tool, the AgentTool dispatches to the appropriate backend:
- Local agents (no `endpoint` field) use `LocalBackend`, which resolves and executes the sub-agent inline within the same runtime.
- Remote agents (with an `endpoint` field) use `A2aBackend`, which sends an A2A `message:send` request and polls the resulting task for completion.
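The naming convention is plain string concatenation over the delegate ID:

```rust
// Tool name the LLM sees for each entry in `delegates`.
fn delegate_tool_name(delegate_id: &str) -> String {
    format!("agent_run_{delegate_id}")
}
```

So `"delegates": ["researcher"]` surfaces a tool named `agent_run_researcher` to the orchestrator's LLM.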
Remote Agents via A2A
Remote agents are declared with an endpoint in AgentSpec:
{
"id": "remote-analyst",
"model": "unused-for-remote",
"system_prompt": "",
"endpoint": {
"backend": "a2a",
"base_url": "https://analyst.example.com/v1/a2a",
"auth": { "type": "bearer", "token": "token-abc" },
"target": "analyst",
"timeout_ms": 300000,
"options": {
"poll_interval_ms": 1000
}
}
}
The A2aBackend handles the A2A protocol lifecycle:
1. Sends a `message:send` request with the user message.
2. Receives a task wrapper, extracts `task.id`, and polls `/tasks/:task_id` at the configured interval.
3. Returns the completed response as a `DelegateRunResult`.
4. The result is formatted as a `ToolResult` and returned to the parent agent's LLM context.
If the remote agent times out or fails, the DelegateRunStatus reflects the failure and the parent agent receives an error tool result.
Sub-Agent Patterns
Sequential Delegation
The orchestrator calls sub-agents one at a time, using each result to decide the next step:
Orchestrator -> researcher (tool call) -> result
-> writer (tool call, using researcher output) -> result
-> reviewer (tool call, using writer output) -> result
Each delegation is a tool call within the orchestrator’s step loop. The orchestrator sees tool results and decides whether to delegate further or respond directly.
Parallel Delegation
When the LLM emits multiple tool calls in a single inference response, sub-agent delegations can execute concurrently. The runtime processes tool calls in parallel by default – if the LLM calls both agent_run_researcher and agent_run_writer in the same response, both execute simultaneously.
Nested Delegation
Sub-agents can themselves have delegates, creating hierarchies:
orchestrator
-> team_lead (delegates: [dev_a, dev_b])
-> dev_a
-> dev_b
Each level resolves independently through the AgentResolver. There is no hard depth limit, but each level adds latency and token cost.
Agent Handoff
Handoff transfers control from one agent to another mid-run without stopping the loop. The mechanism:
1. A plugin (or the handoff extension) writes a new agent ID to the `ActiveAgentKey` state key.
2. At the next step boundary, the loop runner detects the changed key.
3. The loop re-resolves the agent from the `AgentResolver` – new config, new model, new tools, new system prompt.
4. Execution continues in the same run with the new agent's configuration.
Handoff is a re-resolve, not a loop restart. Thread history is preserved. The new agent sees all prior messages and can continue the conversation seamlessly.
Handoff vs Delegation
| Aspect | Delegation | Handoff |
|---|---|---|
| Control flow | Parent calls sub-agent as tool, gets result back | Control transfers entirely to new agent |
| Thread continuity | Sub-agent may use a separate thread context | Same thread, same message history |
| Return path | Result flows back to parent LLM | No return – new agent owns the run |
| Use case | Task decomposition, specialized subtasks | Role switching, escalation, routing |
AgentBackend Trait
Both local and remote delegation use the AgentBackend trait:
pub trait AgentBackend: Send + Sync {
async fn execute(
&self,
agent_id: &str,
messages: Vec<Message>,
event_sink: Arc<dyn EventSink>,
parent_run_id: Option<String>,
parent_tool_call_id: Option<String>,
) -> Result<DelegateRunResult, AgentBackendError>;
}
DelegateRunResult carries the agent ID, terminal status, optional response text, and step count. DelegateRunStatus variants: Completed, Failed(String), Cancelled, Timeout.
This trait is the extension point for custom delegation backends beyond local and A2A.
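As a sketch of what a custom backend might look like, here is a synchronous, std-only stand-in (the real `AgentBackend::execute` is async and also receives an `EventSink` and parent IDs; the trait and struct names below are simplified):

```rust
use std::collections::HashMap;

// Simplified local mirrors of the contract types.
#[derive(Debug, PartialEq)]
enum DelegateRunStatus {
    Completed,
    Failed(String),
}

#[derive(Debug)]
struct DelegateRunResult {
    agent_id: String,
    status: DelegateRunStatus,
    response: Option<String>,
    steps: u32,
}

trait BackendLike {
    fn execute(&self, agent_id: &str, messages: &[String]) -> Result<DelegateRunResult, String>;
}

/// A toy backend serving canned responses — useful in tests, or as a
/// template for routing delegation to a non-local execution substrate.
struct CannedBackend {
    answers: HashMap<String, String>,
}

impl BackendLike for CannedBackend {
    fn execute(&self, agent_id: &str, _messages: &[String]) -> Result<DelegateRunResult, String> {
        match self.answers.get(agent_id) {
            Some(text) => Ok(DelegateRunResult {
                agent_id: agent_id.to_string(),
                status: DelegateRunStatus::Completed,
                response: Some(text.clone()),
                steps: 1,
            }),
            None => Err(format!("unknown agent: {agent_id}")),
        }
    }
}
```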
See Also
- A2A Protocol Reference – wire protocol details
- Architecture – runtime and resolver layers
- Tool and Plugin Boundary – where delegation tools fit
Tool and Plugin Boundary
Awaken separates user-facing capabilities (tools) from system-level lifecycle logic (plugins). This page explains the design boundary, when to use each, and how they interact.
Tools
A tool is a capability exposed to the LLM. The LLM decides when to call it based on the tool’s descriptor.
pub trait Tool: Send + Sync {
fn descriptor(&self) -> ToolDescriptor;
fn validate_args(&self, args: &Value) -> Result<(), ToolError> { Ok(()) }
async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError>;
}
Key properties:
- Registered by ID – each tool has a unique string identifier from
ToolDescriptor. - LLM-visible – the descriptor (name, description, JSON Schema for parameters) is included in inference requests.
- Executed in tool rounds – tools run during the
BeforeToolExecute/AfterToolExecutephase window. - Isolated – a tool receives only its arguments and a
ToolCallContext. It cannot directly access other tools, the plugin system, or the phase runtime. - User-defined – application code creates tools for domain-specific actions (file operations, API calls, database queries).
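A minimal concrete example in this shape, with the JSON argument value simplified to a plain string so the sketch needs no serde (the structure mirrors the trait above; the tool itself is hypothetical):

```rust
/// A toy tool following the descriptor / validate_args / execute shape.
struct WordCountTool;

impl WordCountTool {
    fn descriptor(&self) -> (&'static str, &'static str) {
        // (id, description) — the real ToolDescriptor also carries a
        // JSON Schema describing the parameters.
        ("word_count", "Count the words in a piece of text")
    }

    fn validate_args(&self, text: &str) -> Result<(), String> {
        if text.trim().is_empty() {
            Err("text must not be empty".to_string())
        } else {
            Ok(())
        }
    }

    fn execute(&self, text: &str) -> Result<usize, String> {
        self.validate_args(text)?;
        Ok(text.split_whitespace().count())
    }
}
```

Note how the tool touches nothing but its own arguments — permissions, tracing, and reminders are layered on from outside by plugins.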
Plugins
A plugin is a system-level extension that hooks into the execution lifecycle. Plugins do not appear in the LLM’s tool list.
pub trait Plugin: Send + Sync + 'static {
fn descriptor(&self) -> PluginDescriptor;
fn register(&self, registrar: &mut PluginRegistrar) -> Result<(), StateError>;
fn config_schemas(&self) -> Vec<ConfigSchema> { vec![] }
}
Key properties:
- Registered by ID – each plugin has a unique identifier from PluginDescriptor.
- LLM-invisible – the LLM does not know plugins exist.
- Run in phases – plugin hooks fire at phase boundaries (RunStart, BeforeInference, AfterToolExecute, etc.).
- System-wide access – hooks receive PhaseContext with access to the state snapshot, event sink, and run metadata.
- Declarative registration – plugins declare their capabilities through PluginRegistrar.
PluginRegistrar
When Plugin::register is called, the plugin uses PluginRegistrar to declare everything it needs:
| Registration Method | Purpose |
|---|---|
register_key::<K>() | Declare a StateKey for the plugin’s state |
register_phase_hook() | Add a hook that runs at a specific phase |
register_tool() | Inject a tool into the runtime (plugin-provided tools) |
register_effect_handler() | Handle named effects emitted by hooks |
register_scheduled_action() | Handle named actions scheduled by hooks |
register_request_transform() | Transform inference requests before they reach the LLM |
Plugins can register tools. This is how delegation tools (AgentTool) and MCP tools enter the runtime – they are tools owned by plugins, not directly registered by user code. The LLM sees these tools identically to user-registered tools.
When to Use a Tool
Use a tool when:
- The LLM should be able to invoke the capability by name.
- The capability performs a domain-specific action (search, compute, file I/O, API call).
- The capability takes structured input and returns a result the LLM can reason about.
- The logic does not need access to runtime internals (state, phases, other tools).
When to Use a Plugin
Use a plugin when:
- The logic should run automatically at phase boundaries without LLM involvement.
- The logic needs to modify inference requests (system prompt injection, tool filtering).
- The logic needs to inspect or transform tool results before the LLM sees them.
- The logic manages cross-cutting concerns (permissions, observability, reminders, state synchronization).
- The logic needs to register state keys, effect handlers, or request transforms.
Interaction Between Tools and Plugins
Plugins can influence tool execution without the tool knowing:
- Permission plugin – intercepts tool calls at
BeforeToolExecute, blocks or suspends them based on policy rules. The tool itself has no permission logic. - Observability plugin – wraps tool execution with OpenTelemetry spans. The tool does not emit traces.
- Reminder plugin – injects context messages after specific tools execute. The tool does not know about reminders.
- Interception pipeline – plugins can modify tool arguments, replace tool results, or skip execution entirely through the tool interception mechanism.
This separation means tools remain simple and focused on their domain logic. Cross-cutting concerns are handled uniformly by plugins, applied consistently across all tools without per-tool opt-in.
Plugin-Provided Tools vs User Tools
| Aspect | User Tool | Plugin Tool |
|---|---|---|
| Registration | AgentRuntimeBuilder::with_tool() | PluginRegistrar::register_tool() |
| Lifecycle | Exists for the runtime’s lifetime | Exists while the plugin is active |
| Configuration | Direct construction | Derived from plugin config or agent spec |
| Examples | Custom business logic tools | AgentTool (delegation), MCP tools, skill tools |
Both appear identically to the LLM. The distinction is purely about ownership and lifecycle management.
Plugin System Internals
This page covers the internal mechanics of the plugin system: how plugins are registered and activated, how hooks execute and resolve conflicts, how the phase convergence loop works, and how request transforms, effects, inference overrides, and tool intercepts behave at runtime.
For the high-level boundary between tools and plugins, see Tool and Plugin Boundary. For the phase lifecycle, see Run Lifecycle and Phases.
Plugin Registration vs Activation
When a plugin is loaded, its register() method is called with a PluginRegistrar. The plugin declares two categories of components:
Structural components are always available regardless of activation state:
- State keys (register_key::<K>())
- Scheduled action handlers (register_scheduled_action::<A, H>())
- Effect handlers (register_effect::<E, H>())
Behavioral components are only active when the plugin passes the activation filter:
- Phase hooks (register_phase_hook())
- Tools (register_tool())
- Request transforms (register_request_transform())
Activation is controlled by AgentSpec.active_hook_filter:
| active_hook_filter value | Behavior |
|---|---|
| Empty (default) | All plugins’ behavioral components are active |
| Non-empty set | Only plugins whose ID is in the set contribute behavioral components |
This separation enables infrastructure plugins (state management, action handlers, effect handlers) to exist without impacting execution flow. A logging plugin that only registers an effect handler, for example, never needs to appear in active_hook_filter – its handler fires whenever any plugin emits the corresponding effect.
The filtering happens in PhaseRuntime::filter_hooks():
fn filter_hooks<'a>(env: &'a ExecutionEnv, ctx: &PhaseContext) -> Vec<&'a TaggedPhaseHook> {
let hooks = env.hooks_for_phase(ctx.phase);
let active_hook_filter = &ctx.agent_spec.active_hook_filter;
hooks
.iter()
.filter(|tagged| {
active_hook_filter.is_empty()
|| active_hook_filter.contains(&tagged.plugin_id)
})
.collect()
}
Hook Ordering and Conflict Resolution
Multiple plugins can register hooks for the same Phase. The engine runs them through a two-stage process implemented in gather_and_commit_hooks() (crates/awaken-runtime/src/phase/engine.rs).
Fast path: parallel execution, single commit
All hooks for a phase run in parallel against a frozen snapshot. Each hook receives the same snapshot and produces a StateCommand. If no hook writes to a key with MergeStrategy::Exclusive that another hook also writes to, all commands are merged and committed in a single batch.
flowchart LR
S[Frozen Snapshot] --> H1[Hook A]
S --> H2[Hook B]
S --> H3[Hook C]
H1 --> M[Merge All]
H2 --> M
H3 --> M
M --> C[Single Commit]
Conflict fallback: partition and serial retry
If two or more hooks write to the same Exclusive key, the engine detects the conflict and falls back:
- Partition – Walk commands in registration order. Greedily add each command to a “compatible batch” if its Exclusive keys do not overlap with keys already in the batch. Otherwise, defer the hook.
- Commit the batch – The compatible batch is merged and committed.
- Serial re-execution – Deferred hooks are re-run one at a time, each against a fresh snapshot that includes the results of all prior commits.
flowchart TD
PAR[Run all hooks in parallel] --> CHECK{Exclusive key overlap?}
CHECK -- No --> FAST[Merge all, commit once]
CHECK -- Yes --> PART[Partition: batch + deferred]
PART --> CBATCH[Commit compatible batch]
CBATCH --> SERIAL[Re-run deferred hooks serially]
SERIAL --> DONE[All hooks committed]
Because hooks are pure functions (frozen snapshot in, StateCommand out, no side effects), re-execution on conflict is always safe. The deferred hooks see the updated state from the batch commit, so they produce correct results even when the original parallel execution would have conflicted.
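The partition step can be sketched as a greedy pass over (hook, Exclusive keys) pairs — a simplified, std-only model of the behavior described above, not the engine's actual types:

```rust
use std::collections::HashSet;

/// Greedy partition by Exclusive-key overlap. Each entry is
/// (hook id, Exclusive keys its StateCommand writes). Hooks whose keys
/// don't collide with the batch so far are committed together; the rest
/// are deferred for serial re-execution against fresh snapshots.
fn partition<'a>(commands: &[(&'a str, Vec<&'a str>)]) -> (Vec<&'a str>, Vec<&'a str>) {
    let mut claimed: HashSet<&str> = HashSet::new();
    let mut batch = Vec::new();
    let mut deferred = Vec::new();
    for (hook, keys) in commands {
        if keys.iter().any(|k| claimed.contains(k)) {
            deferred.push(*hook); // conflicts with the batch: re-run serially
        } else {
            claimed.extend(keys.iter().copied());
            batch.push(*hook);
        }
    }
    (batch, deferred)
}
```

Registration order matters: the first hook to claim a key stays in the batch, and later claimants are deferred.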
See State and Snapshot Model for details on MergeStrategy and snapshot isolation.
Phase Convergence Loop
Each phase runs a GATHER then EXECUTE loop that converges when no new work remains.
GATHER stage
Run all active hooks in parallel (with conflict resolution as described above). Hooks produce StateCommand values that may contain:
- State mutations (key updates)
- Scheduled actions (to be processed in the EXECUTE stage)
- Effects (dispatched immediately after commit)
EXECUTE stage
Process pending scheduled actions whose Phase matches the current phase and whose action key has a registered handler. Each action handler runs against a fresh snapshot and produces its own StateCommand, which may schedule new actions for the same phase.
If new matching actions appear after processing, the loop repeats:
flowchart TD
START[Phase Entry] --> GATHER[GATHER: run hooks, commit]
GATHER --> EXEC[EXECUTE: process matching actions]
EXEC --> CHECK{New actions for this phase?}
CHECK -- Yes --> EXEC
CHECK -- No --> DONE[Phase complete]
The loop is bounded by DEFAULT_MAX_PHASE_ROUNDS (16). If the action count does not converge within this limit, the engine returns StateError::PhaseRunLoopExceeded. This prevents infinite reactive chains while allowing legitimate multi-step action cascades.
This convergence design enables reactive patterns: a permission check action can schedule a suspend action in the same BeforeToolExecute phase, and both are processed before the phase completes.
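The bounded drain loop can be sketched as follows (a simplified model — the real engine processes typed scheduled actions and StateCommands, not integers):

```rust
/// Round limit mirroring DEFAULT_MAX_PHASE_ROUNDS in the engine.
const MAX_PHASE_ROUNDS: usize = 16;

/// Drain the pending actions for a phase. Processing an action may
/// schedule new actions for the same phase; the loop repeats until the
/// queue is empty or the round limit trips.
fn run_phase_actions(
    mut pending: Vec<u32>,
    handler: impl Fn(u32) -> Vec<u32>,
) -> Result<usize, &'static str> {
    let mut rounds = 0;
    while !pending.is_empty() {
        rounds += 1;
        if rounds > MAX_PHASE_ROUNDS {
            return Err("PhaseRunLoopExceeded"); // reactive chain never converged
        }
        // Each action handler may emit follow-up actions for this phase.
        pending = pending.into_iter().flat_map(&handler).collect();
    }
    Ok(rounds)
}
```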
Request Transform Hooks
Plugins can register InferenceRequestTransform implementations via registrar.register_request_transform(). Transforms modify the InferenceRequest before it reaches the LLM executor.
Use cases:
- System prompt injection – append context, instructions, or reminders to the system message
- Tool list modification – filter, reorder, or augment the tool descriptors sent to the LLM
- Parameter overrides – adjust temperature, max tokens, or other inference parameters
Transforms run in registration order and are composable: each transform receives the request as modified by the previous transform.
Only active plugins’ transforms are applied. If active_hook_filter is non-empty, transforms from plugins not in the filter are skipped.
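The composition rule can be modeled with plain closures (the `Request` fields here are illustrative, not the real `InferenceRequest`):

```rust
/// A reduced request: just a system prompt and a temperature.
#[derive(Clone, Debug, PartialEq)]
struct Request {
    system: String,
    temperature: f64,
}

type Transform = Box<dyn Fn(Request) -> Request>;

/// Apply transforms in registration order: each one receives the request
/// as modified by the previous one.
fn apply_transforms(mut req: Request, transforms: &[Transform]) -> Request {
    for t in transforms {
        req = t(req);
    }
    req
}
```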
Effect Handlers
Effects are typed events defined via the EffectSpec trait:
pub trait EffectSpec: 'static + Send + Sync {
const KEY: &'static str;
type Payload: Serialize + DeserializeOwned + Send + Sync + 'static;
}
Hooks and action handlers emit effects by calling command.emit::<E>(payload) on their StateCommand. Unlike scheduled actions (which execute within a specific phase’s convergence loop), effects are dispatched after the command is committed to the store.
Plugins register effect handlers via registrar.register_effect::<E, H>(handler). When an effect is dispatched, the engine calls the handler with the effect payload and the current snapshot.
Key properties:
- Fire-and-forget – handler failures are logged but do not block execution or roll back the commit.
- Post-commit – effects see the state after the command that emitted them has been applied.
- Validated at submit time – if a command emits an effect with no registered handler, submit_command returns StateError::UnknownEffectHandler immediately, preventing silent drops.
Use cases: audit logging, metric emission, cross-plugin notification, external system synchronization.
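A string-keyed sketch of the submit-time validation and post-commit dispatch described above (a simplified model, not the engine's actual types):

```rust
use std::collections::HashMap;

/// Effects keyed by string, handlers as boxed closures.
struct EffectBus {
    handlers: HashMap<String, Box<dyn Fn(&str)>>,
}

impl EffectBus {
    /// Submit a command that emits `(effect key, payload)` pairs.
    fn submit(&self, emitted: &[(&str, &str)]) -> Result<(), String> {
        // 1. Validate before committing anything — prevents silent drops.
        for (key, _) in emitted {
            if !self.handlers.contains_key(*key) {
                return Err(format!("UnknownEffectHandler: {key}"));
            }
        }
        // 2. (Commit the state command here.)
        // 3. Post-commit: fire-and-forget dispatch of each effect.
        for (key, payload) in emitted {
            if let Some(handler) = self.handlers.get(*key) {
                handler(payload);
            }
        }
        Ok(())
    }
}
```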
InferenceOverride Merging
Multiple plugins can influence inference parameters by scheduling SetInferenceOverride actions. The InferenceOverride struct uses last-wins-per-field merge semantics:
pub struct InferenceOverride {
pub model: Option<String>,
pub fallback_models: Option<Vec<String>>,
pub temperature: Option<f64>,
pub max_tokens: Option<u32>,
pub top_p: Option<f64>,
pub reasoning_effort: Option<ReasoningEffort>,
}
When two overrides are merged, each field independently takes the last non-None value:
pub fn merge(&mut self, other: InferenceOverride) {
if other.temperature.is_some() {
self.temperature = other.temperature;
}
if other.max_tokens.is_some() {
self.max_tokens = other.max_tokens;
}
// ... same for all fields
}
This allows plugins to override specific parameters without affecting others. A cost-control plugin can set max_tokens while a quality plugin sets temperature, and neither interferes with the other. If both set the same field, the last merge wins.
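The merge semantics can be exercised with a reduced two-field version of the struct:

```rust
/// Last-wins-per-field merging, cut down to two fields for illustration.
#[derive(Default, Debug, PartialEq)]
struct Override {
    temperature: Option<f64>,
    max_tokens: Option<u32>,
}

impl Override {
    fn merge(&mut self, other: Override) {
        // Each field independently takes the last non-None value.
        if other.temperature.is_some() {
            self.temperature = other.temperature;
        }
        if other.max_tokens.is_some() {
            self.max_tokens = other.max_tokens;
        }
    }
}
```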
Tool Intercept Priority
During BeforeToolExecute, plugins can schedule ToolInterceptAction to control tool execution flow. The action payload is one of three variants:
pub enum ToolInterceptPayload {
Block { reason: String },
Suspend(SuspendTicket),
SetResult(ToolResult),
}
When multiple intercepts are scheduled for the same tool call, they are resolved by implicit priority:
| Priority | Variant | Behavior |
|---|---|---|
| 3 (highest) | Block | Terminate tool execution, fail the call |
| 2 | Suspend | Pause execution, wait for external decision |
| 1 (lowest) | SetResult | Short-circuit with a predefined result |
The highest-priority intercept wins. If two intercepts have the same priority (e.g., two plugins both schedule Block), the first one processed takes effect and the conflict is logged as an error.
If no intercept is scheduled, the tool executes normally (implicit proceed).
flowchart TD
BTE[BeforeToolExecute hooks run] --> INT{Any intercepts scheduled?}
INT -- No --> EXEC[Execute tool normally]
INT -- Yes --> PRI[Resolve by priority]
PRI --> B[Block: fail the call]
PRI --> S[Suspend: wait for decision]
PRI --> R[SetResult: return predefined result]
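The priority resolution can be sketched as a single pass over the scheduled intercepts (a simplified model of the behavior described above):

```rust
/// Local mirror of the intercept variants, payloads reduced to strings.
#[derive(Clone, Debug, PartialEq)]
enum Intercept {
    Block(String),
    Suspend,
    SetResult(String),
}

fn priority(i: &Intercept) -> u8 {
    match i {
        Intercept::Block(_) => 3,
        Intercept::Suspend => 2,
        Intercept::SetResult(_) => 1,
    }
}

/// Returns None when no intercept is scheduled (implicit proceed).
fn resolve(intercepts: &[Intercept]) -> Option<Intercept> {
    let mut winner: Option<&Intercept> = None;
    for i in intercepts {
        // Strictly-greater comparison keeps the FIRST intercept on a tie.
        if winner.map_or(true, |w| priority(i) > priority(w)) {
            winner = Some(i);
        }
    }
    winner.cloned()
}
```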
See Also
- Tool and Plugin Boundary – when to use a tool vs a plugin
- Run Lifecycle and Phases – phase ordering and termination
- State and Snapshot Model – merge strategies, scoping, snapshot isolation
- Scheduled Actions Reference – action handler registration
- HITL and Mailbox – suspension and resume flow
Design Tradeoffs
This page documents key architectural decisions in Awaken and the tradeoffs they entail.
Snapshot Isolation vs Mutable State
Decision: Phase hooks read from an immutable Snapshot and write to a MutationBatch. Mutations are applied atomically after all hooks converge.
Alternative: Hooks mutate shared state directly (protected by locks or sequential execution).
| | Snapshot Isolation | Mutable State |
|---|---|---|
| Correctness | Hooks see consistent state regardless of execution order | Result depends on hook ordering and lock granularity |
| Concurrency | Hooks can run in parallel without data races | Requires careful lock management or forced sequencing |
| Complexity | Requires MutationBatch machinery, conflict detection, merge strategies | Simpler implementation, direct reads and writes |
| Debuggability | Each phase boundary is a clean state transition | State changes interleaved, harder to trace |
| Cost | Extra Arc clone per phase for snapshot creation | No snapshot overhead |
Why snapshot isolation: Hook execution order should not affect correctness. When multiple plugins touch state in the same phase, mutable state creates implicit ordering dependencies that are difficult to test and reason about. The snapshot approach makes each phase a pure function from state to mutations, which is easier to verify and replay.
Phase-Based Execution vs Event-Driven
Decision: Execution proceeds through a fixed sequence of phases (RunStart through RunEnd). Plugins register hooks at specific phases.
Alternative: Fully event-driven architecture where plugins subscribe to events and react asynchronously.
| | Phase-Based | Event-Driven |
|---|---|---|
| Predictability | Deterministic execution order within each phase | Non-deterministic ordering, race conditions possible |
| Plugin composition | Plugins interact at well-defined boundaries | Plugins interact through shared event bus, implicit coupling |
| Testability | Phase sequences are easy to unit test | Requires simulating async event flows |
| Flexibility | Adding behavior between phases requires new phases | New events can be added freely |
| Performance | Sequential phase execution adds overhead | Concurrent processing possible |
Why phase-based: Agent execution has a natural sequential structure (infer, execute tools, check termination). Phases formalize this structure and give plugins guaranteed execution points. Event-driven systems are more flexible but make it harder to reason about the order in which plugins see state and harder to implement features like “modify the inference request before it’s sent.”
Typed State Keys vs Dynamic State
Decision: State keys are Rust types implementing the StateKey trait with associated Value and Update types.
Alternative: Untyped key-value store (HashMap<String, Value>).
| | Typed Keys | Dynamic State |
|---|---|---|
| Type safety | Compile-time guarantees on value and update types | Runtime type errors |
| Merge semantics | MergeStrategy declared per key at compile time | Merge logic must be external or convention-based |
| Discoverability | Keys are types – IDE navigation, documentation | Keys are strings – grep-based discovery |
| Boilerplate | Each key requires a type definition | Just use a string |
| Extensibility | New keys require code changes and recompilation | New keys can be added dynamically at runtime |
Why typed keys: State correctness is critical in an agent runtime. A mistyped key or wrong value type causes subtle bugs that surface only during execution. Typed keys catch these at compile time. The apply function makes update semantics explicit – there is no ambiguity about how a counter is incremented or how a list is appended to.
Plugin System vs Middleware Chain
Decision: Plugins register through PluginRegistrar and declare hooks, state keys, tools, and effect handlers as separate registrations.
Alternative: Middleware chain where each layer wraps the next (like tower middleware or HTTP middleware stacks).
| | Plugin System | Middleware Chain |
|---|---|---|
| Granularity | Hooks at specific phases, tools, state keys, effects – each registered independently | Each middleware wraps the entire execution |
| Composition | Multiple plugins contribute hooks to the same phase | Middleware ordering determines behavior |
| Selective activation | active_hook_filter can enable/disable specific plugins per agent | Must restructure the chain to skip middleware |
| Complexity | More registration ceremony | Simpler mental model (wrap and delegate) |
| Cross-cutting concerns | Natural fit – each plugin handles one concern | Each middleware handles one concern but sees all traffic |
Why plugin system: Agent execution has many extension points that don’t nest cleanly. A permission check happens at BeforeToolExecute, observability spans wrap tool execution, reminders inject messages at AfterToolExecute. These are independent concerns at different phases. A middleware chain would require each middleware to understand the full lifecycle and decide when to act. The plugin system lets each plugin declare exactly which phases it cares about.
Multi-Protocol Server vs Single Protocol
Decision: The server exposes AI SDK v6, AG-UI, A2A, and MCP over HTTP, while ACP exists as a separate stdio protocol module. Each protocol adapter translates AgentEvent into its wire format.
Alternative: Support a single canonical protocol and require clients to adapt.
| | Multi-Protocol | Single Protocol |
|---|---|---|
| Frontend compatibility | Works with AI SDK, CopilotKit, A2A clients, and MCP HTTP clients out of the box | Requires custom adapter on the client side |
| Maintenance | Each protocol adapter must be kept in sync with AgentEvent changes | One adapter to maintain |
| Testing | Protocol parity tests ensure all adapters handle all events | Less test surface |
| Complexity | Multiple route sets, encoder types, and event mappings | One route set, one encoder |
| Runtime coupling | Runtime is protocol-independent – only emits AgentEvent | Runtime could be coupled to the protocol |
Why multi-protocol: The AI agent ecosystem has not converged on a single protocol. AI SDK, AG-UI, and A2A serve different use cases (chat frontends, copilot UIs, agent-to-agent communication). Supporting multiple protocols at the server layer avoids forcing protocol choices on users. The Transcoder trait and stateful encoders keep the runtime decoupled – adding a new protocol means implementing one encoder, not modifying the execution engine.
See Also
- Architecture – three-layer design
- State and Snapshot Model – snapshot isolation details
- Run Lifecycle and Phases – phase execution model
- Tool and Plugin Boundary – plugin vs tool design
Glossary
| Term | 中文 | Description |
|---|---|---|
Thread | 会话线程 | Persisted conversation + state history. |
Run | 运行 | One execution attempt over a thread. |
Phase | 阶段 | Named step in the execution loop (RunStart, StepStart, BeforeInference, AfterInference, BeforeToolExecute, AfterToolExecute, StepEnd, RunEnd). |
Snapshot | 快照 | Immutable state snapshot (struct Snapshot { revision: u64, ext: Arc<StateMap> }) seen by phase hooks. |
StateKey | 状态键 | Typed key with scope, merge strategy, and value type. |
MutationBatch | 变更批次 | Collected state mutations applied atomically after phase convergence. |
AgentRuntime | 智能体运行时 | Orchestration layer for agent resolution and run execution. |
AgentSpec | 智能体规约 | Serializable agent definition with model, prompt, tools, plugins. |
AgentEvent | 智能体事件 | Canonical runtime stream event. |
Plugin | 插件 | System-level lifecycle hook registered via PluginRegistrar. |
PluginRegistrar | 插件注册器 | Registration API for phase hooks, tools, state keys, handlers. |
PhaseHook | 阶段钩子 | Async hook executed during a specific phase. |
PhaseContext | 阶段上下文 | Immutable context passed to phase hooks. |
StateCommand | 状态命令 | Result of a phase hook: mutations + scheduled actions + effects. |
Tool | 工具 | User-facing capability with descriptor, validation, and execution. |
ToolDescriptor | 工具描述符 | Tool metadata: id, name, description, parameters schema. |
ToolResult | 工具结果 | Tool execution result: success, error, or suspended. |
ToolCallContext | 工具调用上下文 | Execution context with state access, identity, and activity reporting. |
TerminationReason | 终止原因 | Why a run ended (NaturalEnd, Stopped, Error, etc.). |
SuspendTicket | 挂起票据 | Suspension payload with pending call and resume mode. |
MailboxJob | 邮箱任务 | Durable job entry for async/HITL workflows. |
RunRequest | 运行请求 | Input to start a run: messages, thread ID, agent ID. |
MergeStrategy | 合并策略 | How parallel state mutations are reconciled: Exclusive (conflict = error) or Commutative (order-independent). |
KeyScope | 键作用域 | Lifetime of a state key: Run (per-execution) or Thread (persisted across runs). |
StateMap | 状态映射 | Type-safe heterogeneous map backing Snapshot. |
RunStatus | 运行状态 | Coarse run status: Running, Waiting, Done. |
ToolCallStatus | 工具调用状态 | Per-call status: New, Running, Suspended, Resuming, Succeeded, Failed, Cancelled. |
ResolvedAgent | 已解析智能体 | Agent fully resolved from registries with config, tools, plugins, and executor. |
AgentResolver | 智能体解析器 | Trait that resolves an agent spec ID into a ResolvedAgent. |
BuildError | 构建错误 | Error from AgentRuntimeBuilder when validation or wiring fails. |
RuntimeError | 运行时错误 | Error during agent loop execution. |
InferenceOverride | 推理覆盖 | Per-request override for model, temperature, max_tokens, reasoning_effort. |
ContextWindowPolicy | 上下文窗口策略 | Token budget and compaction rules for context management. |
StreamEvent | 流事件 | Envelope wrapping AgentEvent with sequence number and timestamp. |
TokenUsage | 令牌用量 | Token consumption report from LLM inference. |
ExecutionEnv | 执行环境 | Assembled runtime environment holding hooks, handlers, tools, and plugins. |
CommitHook | 提交钩子 | Hook invoked when state is committed to storage. |
FAQ
Which LLM providers are supported?
Any provider compatible with genai. This includes OpenAI, Anthropic, DeepSeek, Google Gemini, Ollama, and others. Configure the provider via model ID string in AgentSpec or AgentConfig. The GenaiExecutor handles provider routing based on the model prefix.
How do I add a new storage backend?
Implement the ThreadRunStore trait from awaken-contract. The trait requires methods for loading and saving threads, runs, and checkpoints. See InMemoryStore and FileStore in awaken-stores for reference implementations. Pass your store to AgentRuntime::new().with_thread_run_store(store) or AgentRuntimeBuilder::new().with_thread_run_store(store).
Can I use awaken without the server?
Yes. AgentRuntime is a standalone library type. Create a runtime, build a RunRequest, and call runtime.run(request, sink) directly. The server crate (awaken-server) is an optional HTTP/SSE gateway layered on top.
How do I run multiple agents?
Two approaches:
- Delegates: Define delegate agent IDs in the parent AgentSpec. The runtime handles handoff via ActiveAgentIdKey at step boundaries.
- A2A protocol: Register remote agents via AgentRuntimeBuilder::with_remote_agents(). Remote agents are discovered and invoked over HTTP using the Agent-to-Agent protocol.
What is the difference between Run scope and Thread scope?
- Run scope: State exists only for the duration of a single run. Cleared when the run ends. Use for transient data like step counters, token budgets, and per-run configuration.
- Thread scope: State persists across runs within the same thread. Use for conversation memory, user preferences, and accumulated context.
Scope is declared when defining a StateKey.
How do I handle tool errors?
Return ToolResult::error(tool_name, message) from your tool’s execute method. The runtime writes the error result back to the conversation as a tool response message and continues the inference loop. The LLM sees the error and can retry or adjust its approach. For fatal errors that should stop the run, return a ToolError instead.
Can tools run in parallel?
Yes. Configure ToolExecutionMode in the agent spec. When set to parallel mode, the runtime executes independent tool calls concurrently. Results are collected and merged before proceeding to the next inference step.
How do I debug a run that is stuck?
Check RunStatus in state (__runtime.run_lifecycle key). If Waiting, look at __runtime.tool_call_states for pending decisions. If Running, check if max_rounds or timeout was hit. Enable observability plugin to get per-phase tracing.
How do I test without a real LLM?
Implement LlmExecutor with canned responses. See Testing Strategy for patterns.
What happens when parallel tools write to the same state key?
If the key uses MergeStrategy::Exclusive, the merge fails and hooks fall back to serial execution. Use MergeStrategy::Commutative for keys that support concurrent writes. See State and Snapshot Model.
How do request transforms work?
Plugins register InferenceRequestTransform via the registrar. Transforms modify the inference request (system prompt, tools, parameters) before it reaches the LLM. Only active plugins’ transforms apply. See Plugin Internals.
Can I write a custom storage backend?
Yes. Implement ThreadRunStore (for state persistence) and optionally MailboxStore (for HITL). See trait definitions in awaken-stores. File and PostgreSQL implementations serve as references.
How does context compaction work?
When autocompact_threshold: Option<usize> is set in ContextWindowPolicy, the CompactionPlugin monitors token usage. When the context exceeds that threshold, it finds a safe compaction boundary (where all tool call/result pairs are complete), summarizes older messages via LLM, and replaces them with a <conversation-summary> message. See Optimize Context Window.
How do I choose between AI SDK v6, AG-UI, and A2A protocols?
- AI SDK v6: Best for React frontends using Vercel AI SDK. Supports text streaming, tool calls, and state snapshots.
- AG-UI: Best for CopilotKit frontends. Supports generative UI components and agent collaboration.
- A2A: Best for agent-to-agent communication. Used for delegate agents and inter-service orchestration.
All three run over HTTP/SSE. Choose based on your frontend framework.
Migrating from Tirea
Awaken is a ground-up rewrite of tirea, not an incremental upgrade. This guide maps tirea 0.5 concepts to their awaken equivalents for developers porting existing agents.
The tirea 0.5 source is archived on the
tirea-0.5 branch.
Crate Mapping
| tirea 0.5 | awaken | Notes |
|---|---|---|
tirea-state | awaken-contract | State engine merged into contract crate |
tirea-state-derive | — | Removed; use StateKey trait directly |
tirea-contract | awaken-contract | Merged types + traits into one crate |
tirea-agentos | awaken-runtime | Renamed; same role (execution engine) |
tirea-store-adapters | awaken-stores | Renamed |
tirea-agentos-server | awaken-server | Renamed; protocols now inline |
tirea-protocol-ai-sdk-v6 | awaken-server::protocols::ai_sdk_v6 | Merged into server crate |
tirea-protocol-ag-ui | awaken-server::protocols::ag_ui | Merged into server crate |
tirea-protocol-acp | awaken-server::protocols::acp | Merged into server crate |
tirea-extension-permission | awaken-ext-permission | Renamed |
tirea-extension-observability | awaken-ext-observability | Renamed |
tirea-extension-skills | awaken-ext-skills | Renamed |
tirea-extension-mcp | awaken-ext-mcp | Renamed |
tirea-extension-handoff | awaken-runtime::extensions::handoff | Merged into runtime |
tirea-extension-a2ui | awaken-ext-generative-ui | Renamed |
tirea (facade) | awaken | Renamed |
Key structural change: tirea had 16 crates; awaken has 13. Protocol crates
were merged into awaken-server, the state derive macro was removed, and
tirea-state + tirea-contract were unified into awaken-contract.
State Model
tirea used a derive-macro-based state system. Awaken uses a trait-based approach.
// tirea 0.5
#[derive(StateSlot)]
#[state(key = "my_plugin.counter", scope = "run")]
struct Counter(u32);
// awaken
use awaken::prelude::*;
pub struct Counter;
impl StateKey for Counter {
const KEY: &'static str = "my_plugin.counter";
const MERGE: MergeStrategy = MergeStrategy::Exclusive;
type Value = u32;
type Update = u32;
fn apply(value: &mut u32, update: u32) {
*value = update;
}
}
| tirea concept | awaken equivalent |
|---|---|
StateSlot derive | StateKey trait impl |
ScopeDomain::Run | KeyScope::Run (default) |
ScopeDomain::Thread | KeyScope::Thread |
ScopeDomain::Global | Removed (use KeyScope::Thread + ProfileStore) |
state.get::<T>() | snapshot.get::<T>() (via ctx.state::<T>() in hooks) |
state.set(value) | cmd.update::<T>(value) (via StateCommand) |
| Mutable state access | Immutable snapshots + StateCommand returns |
Key change: State is never mutated directly. Hooks read from immutable
snapshots and return StateCommand with updates. The runtime applies all
updates atomically after the gather phase.
Action System
tirea used typed action enums per phase. Awaken uses ScheduledActionSpec with
handlers.
```rust
// tirea 0.5
BeforeInferenceAction::AddContextMessage(
    ContextMessage::system("key", "content")
)
```

```rust
// awaken
use awaken::prelude::*;

let mut cmd = StateCommand::new();
cmd.schedule_action::<AddContextMessage>(
    ContextMessage::system("key", "content"),
)?;
```
| tirea action | awaken action | Notes |
|---|---|---|
BeforeInferenceAction::AddContextMessage | AddContextMessage | Same semantics |
BeforeInferenceAction::SetInferenceOverride | SetInferenceOverride | Same semantics |
BeforeInferenceAction::ExcludeTool | ExcludeTool | Same semantics |
BeforeToolExecuteAction::Block | ToolInterceptAction with Block payload | Unified intercept |
BeforeToolExecuteAction::Suspend | ToolInterceptAction with Suspend payload | Unified intercept |
AfterToolExecuteAction::AddMessage | AddContextMessage | Generalized |
See Scheduled Actions for the full list.
Plugin Trait
```rust
// tirea 0.5
impl Extension for MyPlugin {
    fn name(&self) -> &str { "my_plugin" }

    fn register(&self, ctx: &mut ExtensionContext) -> Result<()> {
        ctx.add_hook(Phase::BeforeInference, MyHook);
        Ok(())
    }
}
```

```rust
// awaken
impl Plugin for MyPlugin {
    fn descriptor(&self) -> PluginDescriptor {
        PluginDescriptor { name: "my_plugin" }
    }

    fn register(&self, r: &mut PluginRegistrar) -> Result<(), StateError> {
        r.register_phase_hook("my_plugin", Phase::BeforeInference, MyHook)?;
        r.register_key::<MyState>(StateKeyOptions::default())?;
        Ok(())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
| Extension trait | Plugin trait | Renamed |
| ExtensionContext | PluginRegistrar | More structured; explicit key/tool/hook registration |
| ctx.add_hook() | r.register_phase_hook() | Requires plugin_id parameter |
| — | r.register_key::<T>() | New: state keys must be declared at registration |
| — | r.register_tool() | New: plugin-scoped tools |
| — | r.register_scheduled_action::<A, _>(handler) | New: custom action handlers |
Tool Trait
```rust
// tirea 0.5
#[async_trait]
impl TypedTool for MyTool {
    type Args = MyArgs;

    async fn call(&self, args: Self::Args, ctx: ToolContext) -> ToolOutput {
        ToolOutput::success(json!({"result": 42}))
    }
}
```

```rust
// awaken
#[async_trait]
impl Tool for MyTool {
    fn descriptor(&self) -> ToolDescriptor { ... }

    async fn execute(&self, args: Value, ctx: &ToolCallContext) -> Result<ToolOutput, ToolError> {
        Ok(ToolResult::success("my_tool", json!({"result": 42})).into())
    }
}
```
| tirea | awaken | Notes |
|---|---|---|
TypedTool with type Args | Tool with Value args | No generic Args; validate in execute() |
ToolOutput (direct return) | ToolOutput (result + optional StateCommand) | New: tools can schedule actions |
ToolContext | ToolCallContext | Renamed; adds state(), profile_access() |
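Since awaken tools receive untyped args, the validation that the `TypedTool` derive used to do now lives inside `execute()`. The pattern can be sketched std-only; this stand-in uses a string map in place of a JSON value, and `MyArgs`, `parse_args`, and the default of 10 are all invented for illustration:

```rust
// Sketch of "validate in execute()": turn untyped args into a
// typed struct by hand, failing with a descriptive error. In a
// real tool this would deserialize a JSON value and map failures
// to the tool error type; here a string map stands in.
use std::collections::HashMap;

struct MyArgs {
    query: String,
    limit: u32,
}

fn parse_args(raw: &HashMap<String, String>) -> Result<MyArgs, String> {
    // Required field: absence is an error the caller can report.
    let query = raw
        .get("query")
        .ok_or_else(|| "missing required field: query".to_string())?
        .clone();
    // Optional field: must parse if present, otherwise a default.
    let limit = raw
        .get("limit")
        .map(|s| s.parse::<u32>().map_err(|_| "limit must be a u32".to_string()))
        .transpose()?
        .unwrap_or(10); // illustrative default
    Ok(MyArgs { query, limit })
}

fn main() {
    let mut raw = HashMap::new();
    raw.insert("query".to_string(), "rust".to_string());
    let args = parse_args(&raw).unwrap();
    assert_eq!(args.query, "rust");
    assert_eq!(args.limit, 10);
    assert!(parse_args(&HashMap::new()).is_err()); // missing query
}
```

The trade-off versus tirea's typed `Args` is that schema errors surface at call time rather than compile time, so descriptive error messages matter more.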
Runtime Builder
```rust
// tirea 0.5
let runtime = AgentOsBuilder::new()
    .agent(agent_spec)
    .tool("search", SearchTool)
    .extension(PermissionExtension::new(rules))
    .build()?;
```

```rust
// awaken
let runtime = AgentRuntimeBuilder::new()
    .with_agent_spec(agent_spec)
    .with_tool("search", Arc::new(SearchTool))
    .with_plugin("permissions", Arc::new(PermissionPlugin::new(rules)))
    .build()?;
```
| tirea | awaken | Notes |
|---|---|---|
AgentOsBuilder | AgentRuntimeBuilder | Renamed |
.agent(spec) | .with_agent_spec(spec) | Consistent with_ prefix |
.tool(id, impl) | .with_tool(id, Arc<dyn Tool>) | Explicit Arc wrapping |
.extension(ext) | .with_plugin(id, Arc<dyn Plugin>) | Renamed; requires ID |
.build() | .build() | Same |
Concepts Removed
The following tirea concepts have no awaken equivalent:
| tirea concept | Reason removed |
|---|---|
StateSlot derive macro | Trait impl is simpler and doesn’t require proc-macro |
Global scope | Thread scope + ProfileStore covers this |
RuntimeEffect | Replaced by StateCommand effects |
EffectLog / ScheduledActionLog | Replaced by tracing |
ConfigStore / ConfigSlot | Replaced by AgentSpec sections |
AgentProfile | Merged into AgentSpec |
ExtensionContext live activation | Replaced by static plugin_ids on AgentSpec |
Concepts Added
| awaken concept | Description |
|---|---|
PluginRegistrar | Structured registration (keys, tools, hooks, actions) |
ToolOutput with StateCommand | Tools can schedule actions as side-effects |
ToolInterceptAction | Unified Block/Suspend/SetResult pipeline |
CircuitBreaker | Per-model LLM failure protection |
Mailbox | Durable job queue with lease-based claim |
EventReplayBuffer | SSE reconnection with frame replay |
DeferredToolsPlugin | Lazy tool loading with probability model |
ProfileStore | Cross-session persistent state |
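To give one of the new concepts some shape: the idea behind per-model failure protection can be sketched as a minimal trip-after-N-failures breaker. This is a generic toy, not awaken's CircuitBreaker type, and the threshold and reset behavior are assumptions:

```rust
// Toy circuit breaker: after `threshold` consecutive failures the
// breaker opens and callers should stop sending traffic to that
// model; a success closes it again. Purely illustrative.
struct CircuitBreaker {
    failures: u32,
    threshold: u32,
    open: bool,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        Self { failures: 0, threshold, open: false }
    }

    fn record(&mut self, ok: bool) {
        if ok {
            self.failures = 0;
            self.open = false;
        } else {
            self.failures += 1;
            if self.failures >= self.threshold {
                self.open = true;
            }
        }
    }

    fn allow(&self) -> bool {
        !self.open
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(3);
    cb.record(false);
    cb.record(false);
    assert!(cb.allow()); // two failures: still closed
    cb.record(false);
    assert!(!cb.allow()); // third consecutive failure trips it
    cb.record(true);
    assert!(cb.allow()); // a success closes it again
}
```

Keeping one breaker per model is what lets a single flaky provider fail fast without blocking inference on healthy models.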
Quick Checklist
- Replace the `tirea` dependency with `awaken` in Cargo.toml
- Replace `use tirea::*` with `use awaken::prelude::*`
- Convert `#[derive(StateSlot)]` to `impl StateKey for ...`
- Convert `Extension` impls to `Plugin` impls with `PluginRegistrar`
- Convert `TypedTool` impls to `Tool` impls
- Replace action enum variants with `cmd.schedule_action::<ActionType>(...)`
- Replace `AgentOsBuilder` with `AgentRuntimeBuilder`
- Update store imports: `tirea_store_adapters` → `awaken_stores`
- Update server imports: protocol crates → `awaken_server::protocols::*`
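The first checklist item amounts to a Cargo.toml edit along these lines. The awaken version numbers below are placeholders, not published versions:

```toml
# Before
[dependencies]
tirea = "0.5"
tirea-store-adapters = "0.5"

# After (versions are illustrative)
[dependencies]
awaken = "0.1"
awaken-stores = "0.1"
awaken-server = "0.1"  # only if you use the protocol adapters
```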