docs: fold EKG draft into README/README_EN and remove standalone draft design doc

DBT
2026-03-01 05:38:48 +00:00
parent 6608456fbf
commit 11a0005645
3 changed files with 30 additions and 97 deletions


@@ -67,6 +67,21 @@
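- When no keys are declared explicitly, the system infers resource keys from the task text automatically.
- Conflicting tasks wait in `resource_lock`, retry lock acquisition after 30 seconds by default, and are scheduled with fairness weighting (the longer the wait, the higher the priority).
- Autonomy completion/blocked notifications no longer use `autonomy.notify_channel` / `autonomy.notify_chat_id`; by default the target is derived automatically from the `allow_from` of enabled channels (Telegram preferred).
- Inbound message dedupe: channel-level dedupe based on `message_id` (default TTL: 10 minutes), which avoids duplicate replies caused by platform retries.
### EKG (Execution Knowledge Graph)
ClawGo now has a built-in execution knowledge graph (a lightweight JSONL event stream with no dependency on an external graph database):
- Event store: `memory/ekg-events.jsonl`
- Error signature normalization (path/number/hex denoising)
- Repeated-error suppression for autonomy (`ekg_consecutive_error_threshold`)
- Provider fallback ordered by historical effectiveness (errsig-aware)
- Task audit exposes provider/model for observability
- EKG stats stratified by source/channel (heartbeat separated from workload)
> Why a time window is needed:
> Full-history statistics are diluted by old data and heartbeat noise, which distorts decisions about the current phase. Observing the last 24h by default (switchable to 6h/7d) is recommended so that fallback and alerts track the current state of the system more closely.
## 🏁 Quick Start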


@@ -67,6 +67,21 @@ Autonomy now supports lock scheduling via `resource_keys`. You can explicitly de
- Without explicit keys, the engine derives keys from task text heuristically.
- Conflicting tasks enter `resource_lock` waiting, retry lock acquisition after 30s, and use fairness weighting (longer wait => higher scheduling priority).
- Autonomy completion/blocked notifications no longer use `autonomy.notify_channel` / `autonomy.notify_chat_id`; target is derived from enabled channel `allow_from` (Telegram first).
- Inbound dedupe: channel-level dedupe by `message_id` (default TTL: 10 minutes) to avoid duplicate replies from platform retries.
### EKG (Execution Knowledge Graph)
ClawGo now includes a built-in execution knowledge graph (lightweight JSONL event stream; no external graph DB required):
- Event store: `memory/ekg-events.jsonl`
- Normalized error signatures (path/number/hex denoising)
- Repeated-error suppression for autonomy (`ekg_consecutive_error_threshold`)
- Provider fallback ranking by historical outcomes (errsig-aware)
- Task-audit visibility for provider/model
- Source/channel-stratified EKG stats (heartbeat separated from workload)
> Why time windows matter:
> Full-history stats get diluted by stale data and heartbeat noise, which degrades current decisions. A recent window (e.g., 24h, optionally 6h/7d) keeps fallback and alerts aligned with present runtime behavior.
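
The windowing idea can be sketched in Go as follows. This is a minimal illustration, not ClawGo's actual code: the `ErrorRate` function, the `Window*` constants, the `mustParse` helper, the RFC 3339 timestamp format, and the literal source name `heartbeat` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical window presets mirroring the 24h default and the 6h/7d options.
const (
	Window6h  = 6 * time.Hour
	Window24h = 24 * time.Hour
	Window7d  = 7 * 24 * time.Hour
)

// Event is a pared-down record; RFC 3339 timestamps are an assumption.
type Event struct {
	Time   string
	Source string // "heartbeat" is an assumed label for heartbeat traffic
	Status string // success|error|suppressed
}

// mustParse is a small helper for the example; it panics on bad input.
func mustParse(ts string) time.Time {
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		panic(err)
	}
	return t
}

// ErrorRate returns the share of error events among workload events that fall
// inside the window ending at now. Heartbeat events, stale events, and events
// with unparseable timestamps are skipped.
func ErrorRate(events []Event, now time.Time, window time.Duration) float64 {
	cutoff := now.Add(-window)
	var total, failed int
	for _, e := range events {
		ts, err := time.Parse(time.RFC3339, e.Time)
		if err != nil || ts.Before(cutoff) || e.Source == "heartbeat" {
			continue
		}
		total++
		if e.Status == "error" {
			failed++
		}
	}
	if total == 0 {
		return 0
	}
	return float64(failed) / float64(total)
}

func main() {
	now := mustParse("2026-03-01T12:00:00Z")
	events := []Event{
		{Time: "2026-02-01T00:00:00Z", Source: "autonomy", Status: "success"},  // stale
		{Time: "2026-03-01T11:00:00Z", Source: "heartbeat", Status: "success"}, // noise
		{Time: "2026-03-01T11:30:00Z", Source: "autonomy", Status: "error"},
		{Time: "2026-03-01T11:45:00Z", Source: "autonomy", Status: "success"},
	}
	fmt.Println(ErrorRate(events, now, Window24h)) // 0.5
}
```

Over the full history the same data would report one error in three workload events; the 24h window drops the stale success and reports one in two, which is closer to what the system is doing now.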
## 🏁 Quick Start


@@ -1,97 +0,0 @@
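# EKG Design Draft (Execution Knowledge Graph)
> Goal: give ClawGo an execution knowledge graph that is auditable, replayable, and error-reducing without introducing a heavyweight graph database, with priority on reducing repeated agent errors and autonomy dead loops.
## 1. Scope and Milestones
### M1 (this implementation)
- Record execution result events (success/failure/suppressed) to `memory/ekg-events.jsonl`
- Normalize error text into a signature (errsig)
- Read advice in the autonomy engine: when the same task fails consecutively with the same errsig and reaches the threshold, block further retries directly (avoiding dead loops)
### M2 (follow-up)
- Success-rate advice along the provider/model/tool dimensions (preferred / banned)
- Policy stratification along the channel/source dimensions
### M3 (follow-up)
- WAL + snapshots
- WebUI visualization (errsig hotspots, suppression hit rate)
---
## 2. Data Model (interface sketch)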
```go
// Event is one execution result appended to the EKG event log.
type Event struct {
	Time    string `json:"time"`
	TaskID  string `json:"task_id,omitempty"`
	Session string `json:"session,omitempty"`
	Channel string `json:"channel,omitempty"`
	Source  string `json:"source,omitempty"`
	Status  string `json:"status"` // success|error|suppressed
	ErrSig  string `json:"errsig,omitempty"`
	Log     string `json:"log,omitempty"`
}

// Advice is what the autonomy engine reads back before retrying a task.
type Advice struct {
	ShouldEscalate  bool     `json:"should_escalate"`
	RetryBackoffSec int      `json:"retry_backoff_sec"`
	Reason          []string `json:"reason"`
}

// SignalContext identifies the task and error signature being evaluated.
type SignalContext struct {
	TaskID  string
	ErrSig  string
	Source  string
	Channel string
}
```
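---
## 3. Storage and Performance
- Storage: `memory/ekg-events.jsonl` (append-only)
- Reads: scan only the most recent window (default 2000 lines)
- Complexity: O(N_recent)
- Design trade-off: M1 prioritizes correctness; snapshots and indexes come later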
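
A minimal sketch of the append-only write and the bounded read might look like the following. The function names `AppendEvent` and `ReadRecent` are hypothetical, and the `Event` type is trimmed to a few fields. For simplicity this version streams the whole file and keeps only the newest `limit` events in memory; it is not O(N_recent) on I/O, and a version that honors that bound would have to read backwards from the end of the file.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Event is a trimmed copy of the draft's event type.
type Event struct {
	Time   string `json:"time"`
	TaskID string `json:"task_id,omitempty"`
	Status string `json:"status"`
	ErrSig string `json:"errsig,omitempty"`
}

// AppendEvent writes one event as a single JSON line at the end of path,
// creating the file if it does not exist yet.
func AppendEvent(path string, e Event) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	line, err := json.Marshal(e)
	if err != nil {
		return err
	}
	_, err = f.Write(append(line, '\n'))
	return err
}

// ReadRecent returns at most limit of the newest events, oldest first.
// Malformed lines are skipped rather than treated as fatal.
func ReadRecent(path string, limit int) ([]Event, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var recent []Event
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long log fields
	for sc.Scan() {
		var e Event
		if json.Unmarshal(sc.Bytes(), &e) != nil {
			continue
		}
		recent = append(recent, e)
		if len(recent) > limit {
			recent = recent[1:] // drop the oldest event
		}
	}
	return recent, sc.Err()
}

func main() {
	dir, err := os.MkdirTemp("", "ekg")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "ekg-events.jsonl")
	for i := 1; i <= 5; i++ {
		e := Event{Time: "2026-03-01T05:38:48Z", TaskID: fmt.Sprintf("task-%d", i), Status: "success"}
		if err := AppendEvent(path, e); err != nil {
			panic(err)
		}
	}
	recent, err := ReadRecent(path, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(recent), recent[0].TaskID, recent[1].TaskID) // 2 task-4 task-5
}
```

One JSON object per line keeps each write a single append and lets a reader discard a corrupt line without losing the rest of the log.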
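---
## 4. Rules (M1)
- Error signature normalization:
  - Paths are normalized to `<path>`
  - Numbers are normalized to `<n>`
  - Hex values are normalized to `<hex>`
  - Whitespace is collapsed
- Threshold rule:
  - The same `task_id + errsig` produces `>=3` consecutive errors
  - `ShouldEscalate=true`; the autonomy task enters `blocked:repeated_error_signature`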
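
The normalization rules above could be sketched as below. Only the placeholders `<path>`, `<n>`, and `<hex>` come from the draft; the regular expressions, the order in which they are applied, and the name `NormalizeErrSig` are assumptions for illustration. Paths are replaced first and hex before plain numbers, so digits inside a path or a hex literal are not rewritten piecemeal.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative patterns (assumptions, not ClawGo's actual expressions).
var (
	// Unix or Windows style paths, with an optional drive letter.
	pathRe = regexp.MustCompile(`(?:[A-Za-z]:)?(?:[/\\][\w.\-]+)+`)
	// 0x-prefixed literals, or bare runs of 8+ hex digits. Note that a long
	// all-decimal run also matches the second form and becomes <hex>.
	hexRe   = regexp.MustCompile(`\b(?:0x[0-9a-fA-F]+|[0-9a-fA-F]{8,})\b`)
	numRe   = regexp.MustCompile(`\d+`)
	spaceRe = regexp.MustCompile(`\s+`)
)

// NormalizeErrSig reduces an error message to a signature that stays stable
// across runs that differ only in paths, numbers, hex values, or spacing.
func NormalizeErrSig(msg string) string {
	s := pathRe.ReplaceAllString(msg, "<path>")
	s = hexRe.ReplaceAllString(s, "<hex>")
	s = numRe.ReplaceAllString(s, "<n>")
	s = spaceRe.ReplaceAllString(s, " ")
	return strings.TrimSpace(s)
}

func main() {
	a := NormalizeErrSig("open /var/lib/clawgo/task-42.json:   timeout after 30s (req 0xdeadbeef)")
	b := NormalizeErrSig("open /tmp/x/task-7.json: timeout after 5s (req 0xCAFE)")
	fmt.Println(a)      // open <path>: timeout after <n>s (req <hex>)
	fmt.Println(a == b) // true
}
```

Two failures that differ only in file name, duration, and request ID collapse to the same signature, which is what lets the threshold rule count them as the same error.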
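---
## 5. Integration Points
1) `pkg/agent/loop.go`
   - Write an EKG event synchronously at `appendTaskAuditEvent` (in step with task-audit)
2) `pkg/autonomy/engine.go`
   - Read EKG advice in the branch where the run result is an error
   - When the escalation condition is hit, block retries directly and record the block reason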
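
The check performed in the autonomy error branch could be sketched like this. The `GetAdvice` name and its signature are hypothetical, the types are trimmed copies of the draft's, and the way a streak is counted (events from other tasks are ignored, while any non-error or differently signed event for the same task ends the streak) is an assumption about what "consecutive" means here.

```go
package main

import "fmt"

// Trimmed copies of the draft's types.
type Event struct {
	TaskID string
	Status string // success|error|suppressed
	ErrSig string
}

type Advice struct {
	ShouldEscalate bool
	Reason         []string
}

type SignalContext struct {
	TaskID string
	ErrSig string
}

// GetAdvice walks the recent events newest-first and counts how many times
// in a row the task in ctx has failed with the signature in ctx.
func GetAdvice(recent []Event, ctx SignalContext, threshold int) Advice {
	streak := 0
	for i := len(recent) - 1; i >= 0; i-- {
		e := recent[i]
		if e.TaskID != ctx.TaskID {
			continue // other tasks do not interrupt this task's streak
		}
		if e.Status != "error" || e.ErrSig != ctx.ErrSig {
			break // a different outcome or signature ends the streak
		}
		streak++
	}
	if streak >= threshold {
		return Advice{ShouldEscalate: true, Reason: []string{"repeated_error_signature"}}
	}
	return Advice{}
}

func main() {
	sig := "timeout after <n>s"
	recent := []Event{
		{TaskID: "t1", Status: "error", ErrSig: sig},
		{TaskID: "t2", Status: "success"},
		{TaskID: "t1", Status: "error", ErrSig: sig},
		{TaskID: "t1", Status: "error", ErrSig: sig},
	}
	adv := GetAdvice(recent, SignalContext{TaskID: "t1", ErrSig: sig}, 3)
	if adv.ShouldEscalate {
		fmt.Println("blocked:" + adv.Reason[0]) // stop dispatching this task
	} else {
		fmt.Println("retry")
	}
}
```

Keeping the decision in a pure function over recent events means the engine's rollback path stays simple: skipping the call restores the original retry behavior.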
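---
## 6. Risks and Rollback
- Risk: a threshold that is too low blocks tasks prematurely
- Mitigation: the default threshold is 3, and it triggers only when the same task hits the same errsig
- Rollback: removing the advice check restores the original retry path
---
## 7. Acceptance Criteria (M1)
- `memory/ekg-events.jsonl` can be created and appended to
- After the same task fails 3 times in a row with the same error signature, autonomy stops dispatching it in a loop
- `make test` (Docker compile) passes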