
Enable Observability

Use this page when you need to trace LLM inference and tool execution with OpenTelemetry-compatible telemetry.

  • A working awaken runtime
  • The observability feature enabled on awaken
  • For OTel export: the otel feature enabled on awaken-ext-observability, and a collector ready to receive spans
[dependencies]
awaken = { git = "https://github.com/AwakenWorks/awaken", features = ["observability"] }
tokio = { version = "1", features = ["full"] }
  1. Start with the in-memory sink (development):
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, InMemorySink};
use awaken::registry_spec::ModelSpec;
use awaken::registry_spec::AgentSpec;
use awaken::{AgentRuntimeBuilder, Plugin};
let sink = InMemorySink::new();
let obs_plugin = ObservabilityPlugin::new(sink.clone());
let mut agent_spec = AgentSpec::new("observed-agent")
    .with_model_id("gpt-4o-mini")
    .with_system_prompt("You are a helpful assistant.")
    .with_hook_filter("observability");
agent_spec.plugin_ids.push("observability".into());
let runtime = AgentRuntimeBuilder::new()
    .with_provider("openai", Arc::new(GenaiExecutor::new()))
    .with_model(ModelSpec::new("gpt-4o-mini", "openai", "gpt-4o-mini"))
    .with_agent_spec(agent_spec)
    .with_plugin("observability", Arc::new(obs_plugin) as Arc<dyn Plugin>)
    .build()
    .expect("failed to build runtime");

plugin_ids loads the observability plugin; with_hook_filter("observability") keeps its phase hooks active when the same agent loads multiple plugins.

After a run finishes, you can read sink.metrics() directly.
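
Why the plugin can take sink.clone() while the original handle is still readable afterwards can be sketched with a shared-state handle. This is a minimal stand-in, not awaken's actual InMemorySink: SketchSink, SketchMetrics, and their fields are hypothetical, and only the clone-then-read pattern mirrors the snippet above.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the metrics a run accumulates.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SketchMetrics {
    pub inference_count: u32,
    pub total_tokens: u64,
}

/// Hypothetical in-memory sink: every clone shares one buffer behind an Arc.
#[derive(Debug, Default, Clone)]
pub struct SketchSink {
    inner: Arc<Mutex<SketchMetrics>>,
}

impl SketchSink {
    pub fn new() -> Self {
        Self::default()
    }

    /// Writer side (in the real runtime this would be the plugin).
    pub fn record_inference(&self, tokens: u64) {
        let mut m = self.inner.lock().expect("metrics lock poisoned");
        m.inference_count += 1;
        m.total_tokens += tokens;
    }

    /// Reader side: returns a snapshot so the caller never holds the lock.
    pub fn metrics(&self) -> SketchMetrics {
        self.inner.lock().expect("metrics lock poisoned").clone()
    }
}

/// Simulates a run: the "plugin" owns a clone, the caller keeps the original.
pub fn simulate_run() -> SketchMetrics {
    let sink = SketchSink::new();
    let plugin_side = sink.clone();
    plugin_side.record_inference(120);
    plugin_side.record_inference(80);
    drop(plugin_side); // dropping the clone does not discard recorded data
    sink.metrics()
}
```

Returning a snapshot rather than a lock guard is a design choice in this sketch; it keeps reads from blocking whatever is still writing.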

  2. Switch to the OTel sink (production):
use std::sync::Arc;
use awaken::engine::GenaiExecutor;
use awaken::ext_observability::{ObservabilityPlugin, OtelMetricsSink};
use awaken::registry_spec::ModelSpec;
use awaken::registry_spec::AgentSpec;
use awaken::{AgentRuntimeBuilder, Plugin};
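use opentelemetry::trace::TracerProvider as _; // trait that provides `tracer()`
use opentelemetry_sdk::trace::SdkTracerProvider;
// No exporter is attached here; configure one on the builder before expecting spans at a collector.
// Keep `provider` alive for as long as the runtime emits spans.
let provider = SdkTracerProvider::builder().build();
let tracer = provider.tracer("awaken");
let obs_plugin = ObservabilityPlugin::new(OtelMetricsSink::new(tracer));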
  3. If needed, implement a custom sink:
use awaken::ext_observability::{MetricsSink, GenAISpan, ToolSpan, AgentMetrics};
struct MySink;
impl MetricsSink for MySink {
    fn on_inference(&self, span: &GenAISpan) {}
    fn on_tool(&self, span: &ToolSpan) {}
    fn on_run_end(&self, metrics: &AgentMetrics) {}
}
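
Because the sink callbacks take &self, a custom sink that aggregates anything needs interior mutability. The following sketch uses atomic counters for that. The trait and span types here are simplified local stand-ins so the block compiles without the awaken crate; the field names input_tokens, output_tokens, and failed are assumptions, not the crate's real definitions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Local stand-ins (assumed shapes, not awaken's actual types).
pub struct GenAISpan {
    pub input_tokens: u64,
    pub output_tokens: u64,
}
pub struct ToolSpan {
    pub failed: bool,
}
pub struct AgentMetrics;

pub trait MetricsSink: Send + Sync {
    fn on_inference(&self, span: &GenAISpan);
    fn on_tool(&self, span: &ToolSpan);
    fn on_run_end(&self, metrics: &AgentMetrics);
}

/// Counts tokens, tool failures, and finished runs.
/// Atomics allow updates through `&self` without a lock.
#[derive(Default)]
pub struct CountingSink {
    tokens: AtomicU64,
    tool_failures: AtomicU64,
    runs: AtomicU64,
}

impl CountingSink {
    pub fn tokens(&self) -> u64 {
        self.tokens.load(Ordering::Relaxed)
    }
    pub fn tool_failures(&self) -> u64 {
        self.tool_failures.load(Ordering::Relaxed)
    }
    pub fn runs(&self) -> u64 {
        self.runs.load(Ordering::Relaxed)
    }
}

impl MetricsSink for CountingSink {
    fn on_inference(&self, span: &GenAISpan) {
        let total = span.input_tokens + span.output_tokens;
        self.tokens.fetch_add(total, Ordering::Relaxed);
    }

    fn on_tool(&self, span: &ToolSpan) {
        // Only failed tool calls are counted.
        if span.failed {
            self.tool_failures.fetch_add(1, Ordering::Relaxed);
        }
    }

    fn on_run_end(&self, _metrics: &AgentMetrics) {
        self.runs.fetch_add(1, Ordering::Relaxed);
    }
}
```

Relaxed ordering is enough for independent counters like these; a sink that must keep several values consistent with each other would more likely use a Mutex.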
  4. The plugin collects data in these phases:
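Phase | What is collected
RunStart | session start time
BeforeInference | model, provider, start time
AfterInference | token usage, finish reason, duration
BeforeToolExecute | tool start time
AfterToolExecute | tool duration and failure status; tool arguments/results when ToolIoCapture is enabled
RunEnd | total session duration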
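
The Before*/After* pairs in the table suggest a simple pattern: store a start time in the Before phase, then compute the elapsed time in the matching After phase. A minimal sketch of that pairing follows. PhaseTimer and the call_id key are hypothetical and do not reflect how awaken's hooks store state internally.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-run timing state; keys identify an in-flight call.
#[derive(Default)]
pub struct PhaseTimer {
    started: HashMap<String, Instant>,
}

impl PhaseTimer {
    /// Before* phase: remember when the call started.
    pub fn before(&mut self, call_id: &str, now: Instant) {
        self.started.insert(call_id.to_string(), now);
    }

    /// After* phase: pair with the stored start and return the elapsed time.
    /// Returns None when no matching Before* was recorded.
    pub fn after(&mut self, call_id: &str, now: Instant) -> Option<Duration> {
        let start = self.started.remove(call_id)?;
        Some(now.saturating_duration_since(start))
    }
}
```

Passing the timestamp in, instead of calling Instant::now() inside the methods, keeps the sketch deterministic to test.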

OTel spans follow the GenAI semantic conventions: the root agent span uses gen_ai.operation.name=invoke_agent; inference spans carry attributes such as gen_ai.provider.name, gen_ai.request.model, gen_ai.conversation.id, and token usage; tool spans use gen_ai.operation.name=execute_tool plus gen_ai.tool.* attributes.
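
To make the attribute layout concrete, here is a sketch that builds the key/value pairs for each span kind as plain tuples. The operation names and the gen_ai.provider.name, gen_ai.request.model, and gen_ai.conversation.id keys come from the description above. The gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, and gen_ai.tool.name keys are taken from the GenAI semantic conventions and are assumed, not confirmed, to be what the plugin emits. InferenceInfo and the three functions are hypothetical helpers.

```rust
/// Hypothetical inputs for one inference span.
pub struct InferenceInfo<'a> {
    pub provider: &'a str,
    pub model: &'a str,
    pub conversation_id: &'a str,
    pub input_tokens: u64,
    pub output_tokens: u64,
}

/// Attributes for the root agent span.
pub fn agent_attributes() -> Vec<(&'static str, String)> {
    vec![("gen_ai.operation.name", "invoke_agent".to_string())]
}

/// Attributes for an inference span.
/// The usage keys are an assumption based on the semantic conventions.
pub fn inference_attributes(info: &InferenceInfo<'_>) -> Vec<(&'static str, String)> {
    vec![
        ("gen_ai.provider.name", info.provider.to_string()),
        ("gen_ai.request.model", info.model.to_string()),
        ("gen_ai.conversation.id", info.conversation_id.to_string()),
        ("gen_ai.usage.input_tokens", info.input_tokens.to_string()),
        ("gen_ai.usage.output_tokens", info.output_tokens.to_string()),
    ]
}

/// Attributes for a tool span; `gen_ai.tool.name` is one of the `gen_ai.tool.*` keys.
pub fn tool_attributes(tool_name: &str) -> Vec<(&'static str, String)> {
    vec![
        ("gen_ai.operation.name", "execute_tool".to_string()),
        ("gen_ai.tool.name", tool_name.to_string()),
    ]
}
```

When checking spans in a collector or Jaeger, filtering on gen_ai.operation.name is a quick way to separate agent spans from tool spans.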

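  1. Run an agent with InMemorySink
  2. Call sink.metrics() after the run finishes
  3. Confirm that inferences is non-empty and the token counts are populated
  4. If you use OTel, check the collector / Jaeger to confirm the spans were reported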
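Error | Cause | Fix
Metrics are all 0 | Plugin not registered | Register ObservabilityPlugin through the builder
OtelMetricsSink not found | Missing otel feature | Enable the otel feature on awaken-ext-observability
No spans in the collector | Exporter not configured, or the tracer provider was dropped too early | Check the exporter and the provider's lifetime
Token counts missing | Provider did not return usage | Make sure the LlmExecutor produces TokenUsage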
  • crates/awaken-ext-observability/tests/
  • crates/awaken-ext-observability/src/lib.rs
  • crates/awaken-ext-observability/src/plugin/plugin.rs
  • crates/awaken-ext-observability/src/plugin/hooks.rs
  • crates/awaken-ext-observability/src/metrics.rs
  • crates/awaken-ext-observability/src/sink.rs
  • crates/awaken-ext-observability/src/otel.rs