Design Tradeoffs

This page documents key architectural decisions in Awaken and the tradeoffs they entail.

Snapshot Isolation vs Mutable State

Decision: Phase hooks read from an immutable Snapshot and write to a MutationBatch. Mutations are applied atomically after all hooks converge.

Alternative: Hooks mutate shared state directly (protected by locks or sequential execution).

| | Snapshot Isolation | Mutable State |
|---|---|---|
| Correctness | Hooks see consistent state regardless of execution order | Result depends on hook ordering and lock granularity |
| Concurrency | Hooks can run in parallel without data races | Requires careful lock management or forced sequencing |
| Complexity | Requires MutationBatch machinery, conflict detection, merge strategies | Simpler implementation, direct reads and writes |
| Debuggability | Each phase boundary is a clean state transition | State changes interleaved, harder to trace |
| Cost | Extra Arc clone per phase for snapshot creation | No snapshot overhead |

Why snapshot isolation: Hook execution order should not affect correctness. When multiple plugins touch state in the same phase, mutable state creates implicit ordering dependencies that are difficult to test and reason about. The snapshot approach makes each phase a pure function from state to mutations, which is easier to verify and replay.
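
The "pure function from state to mutations" idea can be sketched in a few lines. This is a minimal illustration, not Awaken's implementation: `State`, the additive `deltas` representation, the two example hooks, and `run_phase` are all assumptions made for the example. Only the names `Snapshot` and `MutationBatch` come from the text above.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical simplified state: named integer counters.
type State = HashMap<String, i64>;

/// Immutable view shared by every hook in a phase.
struct Snapshot(Arc<State>);

/// Writes collected from hooks; modeled here as additive deltas per key.
#[derive(Default)]
struct MutationBatch {
    deltas: Vec<(String, i64)>,
}

type Hook = fn(&Snapshot, &mut MutationBatch);

/// Example hook: always adds one step.
fn count_step(_snap: &Snapshot, batch: &mut MutationBatch) {
    batch.deltas.push(("steps".to_string(), 1));
}

/// Example hook: adds the step count it reads from the snapshot.
fn double_count(snap: &Snapshot, batch: &mut MutationBatch) {
    let seen = snap.0.get("steps").copied().unwrap_or(0);
    batch.deltas.push(("steps".to_string(), seen));
}

/// Run every hook against the same snapshot, then apply all writes together.
fn run_phase(state: State, hooks: &[Hook]) -> State {
    let snapshot = Snapshot(Arc::new(state));
    let mut batch = MutationBatch::default();
    for hook in hooks {
        // Hooks never observe each other's pending writes.
        hook(&snapshot, &mut batch);
    }
    let mut next = (*snapshot.0).clone();
    for (key, delta) in batch.deltas {
        *next.entry(key).or_insert(0) += delta;
    }
    next
}
```

Starting from `steps = 2`, both hook orders produce `steps = 5` here. If the same two hooks mutated shared state directly, running `count_step` first would give 6 and running it second would give 5, which is the implicit ordering dependency described above.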

Phase-Based Execution vs Event-Driven

Decision: Execution proceeds through a fixed sequence of phases (RunStart through RunEnd). Plugins register hooks at specific phases.

Alternative: Fully event-driven architecture where plugins subscribe to events and react asynchronously.

| | Phase-Based | Event-Driven |
|---|---|---|
| Predictability | Deterministic execution order within each phase | Non-deterministic ordering, race conditions possible |
| Plugin composition | Plugins interact at well-defined boundaries | Plugins interact through shared event bus, implicit coupling |
| Testability | Phase sequences are easy to unit test | Requires simulating async event flows |
| Flexibility | Adding behavior between phases requires new phases | New events can be added freely |
| Performance | Sequential phase execution adds overhead | Concurrent processing possible |

Why phase-based: Agent execution has a natural sequential structure (infer, execute tools, check termination). Phases formalize this structure and give plugins guaranteed execution points. Event-driven systems are more flexible but make it harder to reason about the order in which plugins see state and harder to implement features like “modify the inference request before it’s sent.”
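
A rough sketch of what "guaranteed execution points" means in practice follows. The `Phase` variants are partly assumed: `RunStart`, `RunEnd`, `BeforeToolExecute`, and `AfterToolExecute` appear on this page, while `BeforeInference`, `PhaseRunner`, and the hook functions are hypothetical names chosen for the example.

```rust
/// Reduced phase list. BeforeInference is an assumed name for illustration.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Phase {
    RunStart,
    BeforeInference,
    BeforeToolExecute,
    AfterToolExecute,
    RunEnd,
}

/// The fixed order in which phases execute.
const PHASE_ORDER: [Phase; 5] = [
    Phase::RunStart,
    Phase::BeforeInference,
    Phase::BeforeToolExecute,
    Phase::AfterToolExecute,
    Phase::RunEnd,
];

/// A hook appends a description of what it did to a trace.
type Hook = fn(&mut Vec<String>);

#[derive(Default)]
struct PhaseRunner {
    hooks: Vec<(Phase, Hook)>,
}

impl PhaseRunner {
    fn register(&mut self, phase: Phase, hook: Hook) {
        self.hooks.push((phase, hook));
    }

    /// Walk the phases in their fixed order and run the hooks of each one.
    fn run(&self) -> Vec<String> {
        let mut trace = Vec::new();
        for phase in PHASE_ORDER {
            for (_, hook) in self.hooks.iter().filter(|(p, _)| *p == phase) {
                hook(&mut trace);
            }
        }
        trace
    }
}

fn open_span(trace: &mut Vec<String>) {
    trace.push("open span".to_string());
}

fn rewrite_request(trace: &mut Vec<String>) {
    trace.push("rewrite request".to_string());
}

fn flush_metrics(trace: &mut Vec<String>) {
    trace.push("flush metrics".to_string());
}

/// Register hooks in a scrambled order; the phase sequence still decides execution.
fn demo_trace() -> Vec<String> {
    let mut runner = PhaseRunner::default();
    runner.register(Phase::RunEnd, flush_metrics);
    runner.register(Phase::BeforeInference, rewrite_request);
    runner.register(Phase::RunStart, open_span);
    runner.run()
}
```

Because the request-rewriting hook is tied to a phase that runs before inference, it cannot fire late, regardless of when the plugin was registered. An event bus would need extra coordination to give the same guarantee.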

Typed State Keys vs Dynamic State

Decision: State keys are Rust types implementing the StateKey trait with associated Value and Update types.

Alternative: Untyped key-value store (HashMap<String, Value>).

| | Typed Keys | Dynamic State |
|---|---|---|
| Type safety | Compile-time guarantees on value and update types | Runtime type errors |
| Merge semantics | MergeStrategy declared per key at compile time | Merge logic must be external or convention-based |
| Discoverability | Keys are types – IDE navigation, documentation | Keys are strings – grep-based discovery |
| Boilerplate | Each key requires a type definition | Just use a string |
| Extensibility | New keys require code changes and recompilation | New keys can be added dynamically at runtime |

Why typed keys: State correctness is critical in an agent runtime. A mistyped key or wrong value type causes subtle bugs that surface only during execution. Typed keys catch these at compile time. The apply function makes update semantics explicit – there is no ambiguity about how a counter is incremented or how a list is appended to.
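
The following sketch shows one plausible shape for such a trait. The exact signature of `StateKey` in Awaken is not given on this page, so the trait bounds, the `StateStore` type, and the `StepCount` and `Transcript` keys are assumptions made for illustration. Merge strategies are left out.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// Assumed shape of the trait: a value type, an update type, and an apply function.
trait StateKey: 'static {
    type Value: Default + 'static;
    type Update;
    fn apply(value: &mut Self::Value, update: Self::Update);
}

/// Hypothetical key: a counter that is incremented by the update amount.
struct StepCount;
impl StateKey for StepCount {
    type Value = u32;
    type Update = u32;
    fn apply(value: &mut u32, update: u32) {
        *value += update;
    }
}

/// Hypothetical key: a list that each update appends to.
struct Transcript;
impl StateKey for Transcript {
    type Value = Vec<String>;
    type Update = String;
    fn apply(value: &mut Vec<String>, update: String) {
        value.push(update);
    }
}

/// Minimal store keyed by the key type itself rather than by a string.
#[derive(Default)]
struct StateStore {
    slots: HashMap<TypeId, Box<dyn Any>>,
}

impl StateStore {
    fn update<K: StateKey>(&mut self, update: K::Update) {
        let slot = self
            .slots
            .entry(TypeId::of::<K>())
            .or_insert_with(|| Box::new(<K::Value as Default>::default()) as Box<dyn Any>);
        let value = slot
            .downcast_mut::<K::Value>()
            .expect("a slot always holds the value type of its key");
        K::apply(value, update);
    }

    fn get<K: StateKey>(&self) -> Option<&K::Value> {
        self.slots.get(&TypeId::of::<K>())?.downcast_ref::<K::Value>()
    }
}

fn demo_state() -> (u32, Vec<String>) {
    let mut store = StateStore::default();
    store.update::<StepCount>(1);
    store.update::<StepCount>(2);
    store.update::<Transcript>("hello".to_string());
    // store.update::<StepCount>("oops"); // rejected at compile time: expected u32
    (
        store.get::<StepCount>().copied().unwrap_or_default(),
        store.get::<Transcript>().cloned().unwrap_or_default(),
    )
}
```

With a `HashMap<String, Value>` store, the commented-out call would compile and fail (or silently misbehave) at runtime. Here the key type fixes both what can be written and how the write is merged.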

Plugin System vs Middleware Chain

Decision: Plugins register through PluginRegistrar and declare hooks, state keys, tools, and effect handlers as separate registrations.

Alternative: Middleware chain where each layer wraps the next (like tower middleware or HTTP middleware stacks).

| | Plugin System | Middleware Chain |
|---|---|---|
| Granularity | Hooks at specific phases, tools, state keys, effects – each registered independently | Each middleware wraps the entire execution |
| Composition | Multiple plugins contribute hooks to the same phase | Middleware ordering determines behavior |
| Selective activation | active_hook_filter can enable/disable specific plugins per agent | Must restructure the chain to skip middleware |
| Complexity | More registration ceremony | Simpler mental model (wrap and delegate) |
| Cross-cutting concerns | Natural fit – each plugin handles one concern | Each middleware handles one concern but sees all traffic |

Why plugin system: Agent execution has many extension points that don’t nest cleanly. A permission check happens at BeforeToolExecute, observability spans wrap tool execution, reminders inject messages at AfterToolExecute. These are independent concerns at different phases. A middleware chain would require each middleware to understand the full lifecycle and decide when to act. The plugin system lets each plugin declare exactly which phases it cares about.
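
To make the registration model concrete, here is a simplified sketch. The real `PluginRegistrar` API is not shown on this page, so the `Plugin` trait, the `hook` and `tool` methods, the `install` and `active_hooks` helpers, and the `set_reminder` tool name are all hypothetical. The `active_hooks` function only approximates the idea behind `active_hook_filter`.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Phase {
    BeforeToolExecute,
    AfterToolExecute,
}

/// Hypothetical registrar: hooks and tools are recorded as separate registrations.
#[derive(Default)]
struct PluginRegistrar {
    current: &'static str,
    hooks: Vec<(&'static str, Phase, &'static str)>, // (plugin, phase, action)
    tools: Vec<&'static str>,
}

impl PluginRegistrar {
    fn hook(&mut self, phase: Phase, action: &'static str) {
        self.hooks.push((self.current, phase, action));
    }

    fn tool(&mut self, name: &'static str) {
        self.tools.push(name);
    }
}

trait Plugin {
    fn name(&self) -> &'static str;
    fn register(&self, registrar: &mut PluginRegistrar);
}

struct Permissions;
impl Plugin for Permissions {
    fn name(&self) -> &'static str { "permissions" }
    fn register(&self, r: &mut PluginRegistrar) {
        r.hook(Phase::BeforeToolExecute, "check permission");
    }
}

struct Observability;
impl Plugin for Observability {
    fn name(&self) -> &'static str { "observability" }
    fn register(&self, r: &mut PluginRegistrar) {
        // One plugin may contribute hooks to several phases.
        r.hook(Phase::BeforeToolExecute, "open span");
        r.hook(Phase::AfterToolExecute, "close span");
    }
}

struct Reminders;
impl Plugin for Reminders {
    fn name(&self) -> &'static str { "reminders" }
    fn register(&self, r: &mut PluginRegistrar) {
        r.hook(Phase::AfterToolExecute, "inject reminder");
        r.tool("set_reminder"); // hypothetical tool name
    }
}

/// Let every plugin declare its registrations.
fn install(plugins: &[&dyn Plugin]) -> PluginRegistrar {
    let mut registrar = PluginRegistrar::default();
    for plugin in plugins {
        registrar.current = plugin.name();
        plugin.register(&mut registrar);
    }
    registrar
}

/// Hooks for one phase, restricted to the plugins active for a given agent.
fn active_hooks(registrar: &PluginRegistrar, phase: Phase, active: &[&str]) -> Vec<&'static str> {
    registrar
        .hooks
        .iter()
        .filter(|(plugin, p, _)| *p == phase && active.contains(plugin))
        .map(|(_, _, action)| *action)
        .collect()
}
```

Disabling a plugin for one agent is a filter over existing registrations. In a middleware chain the equivalent change would mean rebuilding the stack of wrappers.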

Multi-Protocol Server vs Single Protocol

Decision: The server exposes AI SDK v6, AG-UI, A2A, and MCP over HTTP, while ACP exists as a separate stdio protocol module. Each protocol adapter translates AgentEvent into its wire format.

Alternative: Support a single canonical protocol and require clients to adapt.

| | Multi-Protocol | Single Protocol |
|---|---|---|
| Frontend compatibility | Works with AI SDK, CopilotKit, A2A clients, and MCP HTTP clients out of the box | Requires custom adapter on the client side |
| Maintenance | Each protocol adapter must be kept in sync with AgentEvent changes | One adapter to maintain |
| Testing | Protocol parity tests ensure all adapters handle all events | Less test surface |
| Complexity | Multiple route sets, encoder types, and event mappings | One route set, one encoder |
| Runtime coupling | Runtime is protocol-independent – only emits AgentEvent | Runtime could be coupled to the protocol |

Why multi-protocol: The AI agent ecosystem has not converged on a single protocol. AI SDK, AG-UI, and A2A serve different use cases (chat frontends, copilot UIs, agent-to-agent communication). Supporting multiple protocols at the server layer avoids forcing protocol choices on users. The Transcoder trait and stateful encoders keep the runtime decoupled – adding a new protocol means implementing one encoder, not modifying the execution engine.
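
The decoupling can be pictured with a small sketch. The actual `Transcoder` trait and `AgentEvent` enum are not reproduced on this page, so the method signature, the two event variants, and both encoders below are assumptions. The output formats are invented for the example and are not the real AI SDK, AG-UI, A2A, or MCP wire formats.

```rust
/// Reduced, hypothetical event set emitted by the runtime.
#[derive(Debug, Clone)]
enum AgentEvent {
    TextDelta(String),
    RunFinished,
}

/// Assumed shape of the trait: one event in, zero or more wire frames out.
trait Transcoder {
    fn encode(&mut self, event: &AgentEvent) -> Vec<String>;
}

/// Invented streaming format: every event becomes a numbered frame.
struct StreamingEncoder {
    next_id: u64,
}

impl Transcoder for StreamingEncoder {
    fn encode(&mut self, event: &AgentEvent) -> Vec<String> {
        let id = self.next_id;
        self.next_id += 1;
        let data = match event {
            AgentEvent::TextDelta(text) => format!("delta {}", text),
            AgentEvent::RunFinished => "done".to_string(),
        };
        vec![format!("id: {}", id), format!("data: {}", data)]
    }
}

/// Invented buffered format: text is held back until the run finishes.
struct BufferedEncoder {
    buffer: String,
}

impl Transcoder for BufferedEncoder {
    fn encode(&mut self, event: &AgentEvent) -> Vec<String> {
        match event {
            AgentEvent::TextDelta(text) => {
                self.buffer.push_str(text);
                Vec::new()
            }
            AgentEvent::RunFinished => {
                vec![format!("message: {}", std::mem::take(&mut self.buffer))]
            }
        }
    }
}

/// The runtime side only produces events; the encoder decides the wire format.
fn transcode(events: &[AgentEvent], encoder: &mut dyn Transcoder) -> Vec<String> {
    events.iter().flat_map(|event| encoder.encode(event)).collect()
}
```

Both encoders consume the same event stream, and each keeps its own state (a frame counter in one, a text buffer in the other). Under this design a new protocol is a new `Transcoder` implementation, and `transcode` along with everything upstream of it stays unchanged.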

See Also