Integrate master agent runtime orchestration updates

2026-04-16 04:41:46 +08:00
parent e0c0ea1814
commit 39be49630f
81 changed files with 9283 additions and 448 deletions
--- a/docs/superpowers/specs/2026-04-14-android-chat-status-row-design.md
+++ b/docs/superpowers/specs/2026-04-14-android-chat-status-row-design.md
@@ -0,0 +1,36 @@
+# Android chat status row spec
+
+## Goal
+Surface `conversationTasks` and `executionWarnings` as a single compact status row under each Android chat bubble so owners can see backend work without stacking multiple cards, while keeping realtime patches minimal.
+
+## Motivation
+- The current Android chat preview renders a separate warning card below every message and maintains a separate, expensive list of `executionWarnings`, which duplicates tap targets and inflates the height of the chat list.
+- The web page already summarizes tasks and warnings inline. Android should match that while keeping realtime patches narrow: only the views of the affected message are rebuilt.
+- Requirements from the user: group warnings per message, deduplicate statuses, update only affected message on realtime patches, no web/backend changes.
+
+## Architecture
+1. Extend `BossUi.buildMessageBubble` (or its wrapper) to accept a reusable compact status row view that can show a grouped warning count and task status.
+2. When building a message view in `ProjectDetailActivity.buildMessageView`, gather all conversation task summaries and execution warnings that match the message ID. Deduplicate them by warning ID or task ID and render them inside the status row.
+3. Create helper methods to compute a `StatusRowView` (a `LinearLayout`) that displays:
+   - If there is an active `conversationTask`, show a row with the backend status label and session/task info.
+   - If there are warnings, show badge summaries with title and summary text, limited to one line each (ellipsized if needed), and collapse duplicates by grouping warnings that share the same title+summary pair.
+4. `buildMessageView` will add the status row view directly under the existing bubble but before any other attachments.
+5. The realtime warning patch logic already replaces the view for a specific `messageId`; ensure it reconstructs the same minimal status row and rerenders only when the status content changes. To do that, we may need to compute a canonical string for the combined tasks and warnings, store it, and compare it on each patch to avoid redundant replacements.
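The grouping and canonical-string idea in steps 2, 3, and 5 can be sketched as follows. This is a minimal illustration in TypeScript rather than the Android Java code; the `ExecutionWarning` and `TaskSummary` shapes and the function names `groupWarnings` and `statusRowKey` are assumptions made for the example, not names from the codebase.

```typescript
// Hypothetical shapes for illustration; the real payload fields may differ.
interface ExecutionWarning {
  id: string;
  title: string;
  summary: string;
}

interface TaskSummary {
  id: string;
  statusLabel: string;
}

interface WarningGroup {
  title: string;
  summary: string;
  count: number;
}

const compareText = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Skip repeated warning IDs, then collapse warnings that share a title+summary pair.
function groupWarnings(warnings: ExecutionWarning[]): WarningGroup[] {
  const seenIds = new Set<string>();
  const groups = new Map<string, WarningGroup>();
  for (const warning of warnings) {
    if (seenIds.has(warning.id)) continue;
    seenIds.add(warning.id);
    const key = JSON.stringify([warning.title, warning.summary]);
    const group = groups.get(key);
    if (group) {
      group.count += 1;
    } else {
      groups.set(key, { title: warning.title, summary: warning.summary, count: 1 });
    }
  }
  return Array.from(groups.values());
}

// Canonical string for one message's status row. Sorting makes the key
// independent of arrival order, so equal keys mean the row can be left alone.
function statusRowKey(tasks: TaskSummary[], warnings: ExecutionWarning[]): string {
  const taskById = new Map<string, string>();
  for (const task of tasks) taskById.set(task.id, task.statusLabel);
  const taskPart = Array.from(taskById.entries()).sort((a, b) => compareText(a[0], b[0]));
  const warningPart = groupWarnings(warnings)
    .map((group) => JSON.stringify([group.title, group.summary, group.count]))
    .sort(compareText);
  return JSON.stringify({ tasks: taskPart, warnings: warningPart });
}
```

A realtime patch handler could keep the last key per `messageId` and rebuild the row only when the new key differs.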
+
+## Implementation details
+- Introduce new helper methods in `ProjectDetailActivity`:
+  - `private LinearLayout buildStatusRow(JSONObject message, List<ConversationTaskSummary> tasks, JSONArray warnings)`
+  - `private boolean hasStatusChangesForMessage(...)`, which lets the realtime patch compare old and new statuses and rerender only when they differ.
+- `BossUi` may need a new method such as `public static LinearLayout buildMessageStatusRow(Context context, String label, String detail, int tintColor)` to standardize the UI.
+- Update `ProjectDetailActivity.appendContent(buildMessageView(message))` loops to clear prior warning cards and add the new status row once.
+- Keep logic around `findExecutionWarningForMessage` but adjust to return grouped data instead of each warning separately.
+- Update `currentRenderedProjectPayload` handling to consider the grouped warning/task text, ensuring `hasSameExecutionWarningForMessage` returns `true` when summaries and counts match.
+
+## Testing approach (TDD)
+1. Add new Android tests, or expand existing ones, that read the `ProjectDetailActivity.java` source to assert the presence of status row construction, grouped warnings, and realtime patch logic (e.g., `tests/android-chat-incremental-realtime-append.test.ts` and `tests/android-chat-local-realtime-patch.test.ts`).
+2. Write a failing test first that expects a new helper like `buildMessageStatusRow` or `getStatusTextForMessage` to exist and to be used under each message.
+3. Update tests referencing warning cards to expect the new inline status row and ensure patch logic still references the helper to rerender targeted messages only when necessary.
+4. Manually run `npm test android-chat-incremental-realtime-append.test.ts` and `npm test android-chat-local-realtime-patch.test.ts` after implementing.
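The source-reading assertion style described in steps 1 and 2 could look roughly like the sketch below. The helper name `missingHelpers` is hypothetical, and it assumes the names passed in are plain Java identifiers (no regex metacharacters).

```typescript
// Returns the helper names that never appear followed by "(" in the source,
// i.e. names that are neither declared nor called anywhere in the file.
// Assumes each name is a plain identifier, so no regex escaping is done.
function missingHelpers(javaSource: string, helperNames: string[]): string[] {
  return helperNames.filter((name) => {
    const pattern = new RegExp("\\b" + name + "\\s*\\(");
    return !pattern.test(javaSource);
  });
}
```

In a real test, the source string would be read from the Android module on disk and the test would fail when the returned list is non-empty, which is what makes the first run fail before the helper exists.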
+
+## Validation
+- After receiving user approval on this spec, proceed with the implementation plan using TDD. Use test file names to focus runs, and keep the Android changes confined to `ProjectDetailActivity.java`, `BossUi.java`, and the corresponding tests.
--- a/docs/superpowers/specs/2026-04-14-hermes-backend-mvp-design.md
+++ b/docs/superpowers/specs/2026-04-14-hermes-backend-mvp-design.md
@@ -0,0 +1,271 @@
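+# Boss `HermesBackendAdapter` minimal integration design
+
+## 1. Background
+
+Boss already has a stable execution foundation:
+
+- `ExecutionBackend`
+- `ExecutionBackendSelector`
+- `PromptAssembler`
+- `PermissionPolicy`
+- `RemoteRuntimeAdapter`
+
+It has also already integrated two external projects:
+
+- `ClawBackendAdapter`, the single-run execution candidate
+- `OmxTeamBackendAdapter`, the orchestration candidate
+
+In its latest upstream form, `hermes-agent` looks more like a "heavyweight agent runner" than a replacement for Boss's product layer. Its value is concentrated in:
+
+- A mature single-run agent loop
+- Multiple entry points: CLI / Gateway / ACP
+- A rich toolset system
+- Built-in skills / memory / session / search / delegation
+
+The conclusion is not to "move Hermes wholesale into Boss", but to plug it into Boss's existing foundation as a new **execution backend candidate**.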
+
+---
+
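+## 2. First-batch goal
+
+The first batch does exactly one thing:
+
+> Without changing the default behavior of Boss's current production main chain, add a `hermes-runtime` for `master-agent` that is off by default and can be explicitly enabled.
+
+Users can select `Hermes Runtime` in Boss's existing master agent chat controls, after which the current `master-agent` reply is produced through the Hermes CLI.
+
+Once this batch is done, Boss gains:
+
+- A new optional execution backend
+- A Hermes adapter point that can keep evolving
+- Zero breakage of the existing `claw-runtime` / API / Master Codex Node main chain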
+
+---
+
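+## 3. Explicit non-goals
+
+This batch explicitly does not:
+
+- Connect Hermes's gateway, Telegram, Discord, Slack, WhatsApp, or Signal to Boss
+- Merge Hermes's Honcho memory directly into Boss's `threadStatusDocuments / threadProgressEvents`
+- Let the Boss front end directly understand the internal structure of Hermes's sessions, slash commands, or toolsets
+- Integrate Hermes's multi-platform message sending and receiving
+- Integrate Hermes's cron / ACP / editor integration
+- Change Boss's group chat orchestration chain
+- Treat Hermes as a new OrchestrationBackend
+
+The positioning of this batch is very clear: **add one execution backend, not a new product subsystem.**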
+
+---
+
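+## 4. Why Hermes is worth integrating
+
+### 4.1 What it means for the master agent
+
+For Boss's master agent, Hermes offers value on three levels:
+
+1. **A mature one-shot agent loop**
+   - `hermes chat -q ... -Q` already provides a scriptable, single-run, non-interactive execution entry point.
+   - This fits the "one request -> one result" contract of Boss's current `ExecutionBackend`.
+
+2. **A more complete agent runtime than Claw**
+   - Hermes is not just a command wrapper; it is a complete system of agent runner, tool registry, toolsets, skills, memory, and delegation.
+   - This means it is better at completing complex tasks by calling tools on its own, which makes it a good enhanced execution candidate for Boss's master agent.
+
+3. **A larger surface for future cross-platform capabilities**
+   - Hermes natively has three entry points: CLI / Gateway / ACP.
+   - This is a useful reference for Boss's future "one agent core, different entry points" work, but the first batch does not go there.
+
+### 4.2 What it means for Boss's overall business flow
+
+Hermes's real value to Boss lies not in the UI but in **swapping the execution layer and running side-by-side comparisons**:
+
+- The same Boss product chain
+- The same set of project, thread, approval, import, and group chat data
+- Different execution backends existing side by side
+
+This lets Boss make "stable product layer" and "replaceable execution layer" a reality.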
+
+---
+
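+## 5. First-batch integration plan
+
+### 5.1 Integration points
+
+In the first batch, Hermes is wired into:
+
+- `src/lib/execution/backends/`
+- `ExecutionBackendSelector`
+- `master-agent` chat controls
+
+Not wired into:
+
+- `OrchestrationBackend`
+- `local-agent` dispatch_execution
+- The standalone Android settings page
+
+### 5.2 Execution model
+
+Boss does not import Hermes's Python code directly, nor does it vendor Hermes into the repository.
+
+Boss invokes Hermes through an external command: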
+
+```text
+<configured command> <configured prefix args> chat -q "<executionPrompt>" -Q --source tool
+```
+
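+Appended as needed:
+
+- `-m <model>`
+- `-t <toolsets>`
+- `-s <skills>`
+
+Reasons for this approach:
+
+- Lowest cost for upstream upgrades
+- Boss and Hermes stay separated by a process boundary
+- Can be configured explicitly via command / workdir / args, just like `claw-runtime`
+- When something goes wrong, it can be rolled back directly without polluting the main chain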
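As a rough illustration of the command shape above, the argument vector could be assembled like this. The `HermesInvocation` type and `buildHermesArgv` name are hypothetical; only the flags (`chat -q ... -Q --source tool`, `-m`, `-t`, `-s`) come from this spec.

```typescript
// Hypothetical input shape for the sketch; not the adapter's real type.
interface HermesInvocation {
  command: string; // the configured command
  prefixArgs: string[]; // the configured prefix args
  prompt: string; // executionPrompt
  model?: string;
  toolsets?: string;
  skills?: string;
}

// Builds [command, ...args] as an array so the prompt is passed as one
// argument and never needs shell quoting.
function buildHermesArgv(invocation: HermesInvocation): string[] {
  const argv = [
    invocation.command,
    ...invocation.prefixArgs,
    "chat",
    "-q",
    invocation.prompt,
    "-Q",
    "--source",
    "tool",
  ];
  if (invocation.model) argv.push("-m", invocation.model);
  if (invocation.toolsets) argv.push("-t", invocation.toolsets);
  if (invocation.skills) argv.push("-s", invocation.skills);
  return argv;
}
```

Keeping the result as an array suits process-spawning APIs that take the executable and its arguments separately, which avoids escaping problems when the prompt contains quotes or newlines.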
+
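+### 5.3 Input/output contract
+
+Boss -> Hermes:
+
+- `executionPrompt`
+- `modelOverride`
+- `reasoningEffortOverride` (in the first batch, only passed through to Boss's own records; not forcibly mapped to a Hermes CLI argument)
+- optional `toolsets`
+- optional `skills`
+
+Hermes -> Boss:
+
+- The main body of `stdout` is used as the final reply
+- The `session_id: ...` at the end of quiet mode output is only parsed, not written back into the conversation body
+
+If Hermes exits non-zero, times out, produces empty output, or produces output whose structure cannot be parsed, all of these are converted into: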
+
+- `ExecutionImmediateFailedResult`
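A minimal sketch of this contract, under stated assumptions: the exact layout of the trailing `session_id: ...` line is assumed to be a single final line of the form `session_id: <token>`, and the `HermesOutcome` union is a stand-in for Boss's real result types (the `ok: false` branch plays the role of `ExecutionImmediateFailedResult`).

```typescript
// Hypothetical process result and outcome types for the sketch.
interface HermesProcessResult {
  exitCode: number | null;
  timedOut: boolean;
  stdout: string;
}

type HermesOutcome =
  | { ok: true; reply: string; sessionId: string | null }
  | { ok: false; reason: string };

function interpretHermesResult(result: HermesProcessResult): HermesOutcome {
  if (result.timedOut) return { ok: false, reason: "timeout" };
  if (result.exitCode !== 0) return { ok: false, reason: "exit code " + String(result.exitCode) };

  const lines = result.stdout.replace(/\r\n/g, "\n").split("\n");
  // Drop trailing blank lines so the last real line can be inspected.
  while (lines.length > 0 && (lines[lines.length - 1] ?? "").trim() === "") lines.pop();

  // Assumed format: a final line such as "session_id: abc123".
  let sessionId: string | null = null;
  const match = /^session_id:\s*(\S+)\s*$/.exec(lines[lines.length - 1] ?? "");
  if (match) {
    sessionId = match[1] ?? null;
    lines.pop(); // parsed, but kept out of the reply body
  }

  const reply = lines.join("\n").trim();
  if (reply === "") return { ok: false, reason: "empty output" };
  return { ok: true, reply, sessionId };
}
```

Funnelling every failure mode through one branch keeps the caller's handling uniform, which matches the single failure type named above.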
+
+---
+
+## 6. Configuration design
+
+New environment variables:
+
+- `BOSS_HERMES_ENABLED`
+- `BOSS_HERMES_COMMAND`
+- `BOSS_HERMES_ARGS`
+- `BOSS_HERMES_WORKDIR`
+- `BOSS_HERMES_TIMEOUT_MS`
+- `BOSS_HERMES_DEFAULT_MODEL`
+- `BOSS_HERMES_TOOLSETS`
+- `BOSS_HERMES_SKILLS`
+
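+Design constraints:
+
+- Off by default
+- When `enabled=true` and `BOSS_HERMES_COMMAND` is not explicitly set, `hermes` is tried by default
+- When the executable entry point, the working directory, or the prefix script does not exist, the front end does not allow selection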
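One way the environment variables above might be read into a config object is sketched below. Several details are assumptions not stated in this spec: accepting `true` or `1` as enabled, splitting `BOSS_HERMES_ARGS` on whitespace, and the 120000 ms timeout default. The `HermesConfig` shape and `readHermesConfig` name are likewise hypothetical.

```typescript
interface HermesConfig {
  enabled: boolean;
  command: string;
  args: string[];
  workdir: string | null;
  timeoutMs: number;
  defaultModel: string | null;
  toolsets: string | null;
  skills: string | null;
}

// Placeholder value: the spec names BOSS_HERMES_TIMEOUT_MS but gives no default.
const ASSUMED_DEFAULT_TIMEOUT_MS = 120000;

function readHermesConfig(env: Record<string, string | undefined>): HermesConfig {
  // Treat unset and blank variables the same way.
  const text = (name: string): string | null => {
    const value = (env[name] ?? "").trim();
    return value === "" ? null : value;
  };
  const enabledText = (text("BOSS_HERMES_ENABLED") ?? "").toLowerCase();
  const argsText = text("BOSS_HERMES_ARGS");
  const timeout = Number(text("BOSS_HERMES_TIMEOUT_MS") ?? "");
  return {
    enabled: enabledText === "true" || enabledText === "1", // assumed truthy spellings
    command: text("BOSS_HERMES_COMMAND") ?? "hermes", // default named in the spec
    args: argsText === null ? [] : argsText.split(/\s+/), // assumed whitespace-separated
    workdir: text("BOSS_HERMES_WORKDIR"),
    timeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : ASSUMED_DEFAULT_TIMEOUT_MS,
    defaultModel: text("BOSS_HERMES_DEFAULT_MODEL"),
    toolsets: text("BOSS_HERMES_TOOLSETS"),
    skills: text("BOSS_HERMES_SKILLS"),
  };
}
```

Taking the environment as a parameter instead of reading a global keeps the function easy to exercise in tests such as `tests/hermes-backend-config.test.ts`.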
+
+---
+
+## 7. Changes inside Boss
+
+### 7.1 New files
+
+- `src/lib/execution/backends/hermes-config.ts`
+- `src/lib/execution/backends/hermes-runner.ts`
+- `src/lib/execution/backends/hermes-backend.ts`
+- `scripts/hermes-runtime-smoke.mjs`
+
+### 7.2 Modified files
+
+- `src/lib/execution/backend-selector.ts`
+- `src/lib/boss-data.ts`
+- `src/lib/boss-master-agent.ts`
+- `src/app/api/v1/projects/[projectId]/agent-controls/route.ts`
+- `src/app/api/v1/projects/[projectId]/prompt-profile/route.ts`
+- `src/app/me/master-agent/page.tsx`
+- `src/components/master-agent-prompt-memory-client.tsx`
+
+### 7.3 Tests
+
+Added or extended:
+
+- `tests/hermes-backend-config.test.ts`
+- `tests/hermes-runner.test.ts`
+- `tests/hermes-backend.test.ts`
+- `tests/execution-backend-selector.test.ts`
+- `tests/master-agent-chat-controls.test.ts`
+- `tests/master-agent-message-queue.test.ts`
+
+---
+
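+## 8. Key trade-offs
+
+### 8.1 No session-level reuse in the first batch
+
+Hermes quiet mode returns a `session_id`, but in the first batch Boss does not bring it into the formal state model.
+
+Reasons:
+
+- Boss's existing `ExecutionImmediateResult` has no responsibility for archiving sessions
+- The current goal is to get "backend switching works" done first
+- Introducing session binding first would drag in `thread/session` association, recovery, and cleanup all at once, and the scope would get out of control
+
+First-batch strategy:
+
+- Parse but do not persist `session_id`
+- Get the stateless one-shot backend working first
+
+### 8.2 No dynamic mapping from Boss PermissionPolicy to Hermes toolsets in the first batch
+
+Hermes's toolsets are powerful, but Boss currently has no ready-made bidirectional mapping layer between tool-level permissions and Hermes CLI arguments.
+
+First-batch strategy:
+
+- Only support `toolsets` / `skills` specified statically through environment variables
+- Boss itself retains top-level product approval and the right to choose the backend
+
+This is not the most thorough approach, but it carries the least risk.
+
+### 8.3 Only open to `master-agent` in the first batch
+
+This is a deliberate narrowing of scope, not a capability gap.
+
+Reasons:
+
+- Boss's backend override UI is currently mature only in the master agent chat controls
+- `dispatch_execution` involves the group chat orchestration and device execution chains; wiring Hermes in there would expand the scope into the second batch
+- Getting master agent chat working first gives the largest benefit with the smallest regression surface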
+
+---
+
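+## 9. Acceptance criteria
+
+The first batch is complete only when all of the following hold:
+
+1. When `BOSS_HERMES_*` is not enabled, Boss behaves exactly as it does today
+2. Once enabled and correctly configured, `Hermes Runtime` can be selected in the `master-agent` chat controls
+3. With `Hermes Runtime` selected, a single master agent reply is returned through the Hermes CLI
+4. When Hermes is unavailable, the save endpoint rejects the selection and returns a clear reason
+5. When `hermes-runtime` was saved previously but is currently unavailable, the runtime automatically falls back to the default backend and gives a human-readable explanation
+6. `npm run build` succeeds
+7. `npm run lint` succeeds (currently after excluding noise from large externally sourced files in the repository)
+8. Relevant tests pass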
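Criteria 4 and 5 describe two different reactions to the same unavailability: reject at save time, fall back at run time. A minimal sketch of that split follows; the `Availability` type, both function names, and the `default-backend` id used in the usage note are hypothetical placeholders, not identifiers from `ExecutionBackendSelector`.

```typescript
type Availability = { available: true } | { available: false; reason: string };

const HERMES_BACKEND_ID = "hermes-runtime";

// Save time (criterion 4): refuse an unavailable Hermes selection with a reason.
function validateBackendSelection(
  requestedId: string,
  hermes: Availability,
): { ok: true } | { ok: false; error: string } {
  if (requestedId === HERMES_BACKEND_ID && !hermes.available) {
    return { ok: false, error: "Hermes Runtime cannot be selected: " + hermes.reason };
  }
  return { ok: true };
}

// Run time (criterion 5): a previously saved Hermes choice falls back to the
// default backend, with a human-readable notice explaining why.
function resolveRuntimeBackend(
  savedId: string,
  defaultId: string,
  hermes: Availability,
): { backendId: string; notice: string | null } {
  if (savedId === HERMES_BACKEND_ID && !hermes.available) {
    return {
      backendId: defaultId,
      notice: "Hermes Runtime is unavailable (" + hermes.reason + "); using " + defaultId + " instead.",
    };
  }
  return { backendId: savedId, notice: null };
}
```

For example, `resolveRuntimeBackend("hermes-runtime", "default-backend", { available: false, reason: "command not found" })` would select `default-backend` and carry a notice, while the same availability passed to `validateBackendSelection` would reject the save.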
+
+---
+
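+## 10. Directions reserved for the second batch
+
+After the first batch is complete, consider for the second batch:
+
+- Hermes session-level reuse and association with Boss project threads
+- Mapping from Boss permission policy to Hermes toolsets
+- Formal wiring of `thread_reply`
+- Showing Hermes availability on the Android master agent settings page
+- Whether `dispatch_execution` should adopt Hermes as a pre-orchestration execution candidate
+
+The success criterion for the first batch is not "wiring up all of Hermes", but:
+
+> Let Boss, with the existing product layer completely unchanged, reliably gain one more heavyweight agent execution backend that is selectable, revertible, and can keep being upgraded.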
--- a/docs/superpowers/specs/2026-04-16-master-agent-fast-path-design.md
+++ b/docs/superpowers/specs/2026-04-16-master-agent-fast-path-design.md
@@ -0,0 +1,101 @@
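+# Master Agent Fast Path design
+
+Last updated: `2026-04-16`
+
+## 1. Background
+
+The master agent currently handles two kinds of requests:
+
+- Slow-path questions that need deep understanding, planning, and coordination
+- Fast-path questions that only need to read local state or quickly apply a configuration change
+
+In the original implementation, only a handful of model-switching commands were answered locally. Many high-frequency questions, such as “你现在是什么大模型” ("What model are you right now?") and “当前后端是什么” ("What is the current backend?"), still fell into the main reasoning chain, which led to:
+
+- Noticeably higher reply latency
+- Unnecessary token consumption
+- Users having to memorize more mechanical command phrasings
+
+## 2. Design goals
+
+- Split "status query / configuration operation / enumeration" questions out of the main reasoning chain
+- Answer high-frequency deterministic questions locally, without entering the async queue
+- Stay tolerant of natural language; do not require users to type fully standardized commands
+- Provide a single entry point for adding more fast-path intents later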
+
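+## 3. Approach
+
+### 3.1 Fast Path Router
+
+Add a master agent fast-path routing layer in `src/lib/boss-master-agent.ts`:
+
+- Always build a `MasterAgentFastIntentContext` first
+- The context aggregates:
+  - `agentControls` within the scope of the currently logged-in user
+  - The visible model list
+  - The available model list
+  - The effective model policy for the current chat intent
+  - The effective model policy for the deep-task intent
+  - The current runtime's primary control source
+- `replyToMasterAgentUserMessage()` first tries `tryHandleMasterAgentFastIntent()` before entering the main chain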
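The router shape described above can be sketched as an ordered list of handlers tried before the slow path. Everything here is illustrative: the context is a reduced stand-in for `MasterAgentFastIntentContext`, the English substring matching is a placeholder for the real intent detection, and the functions are synchronous for brevity.

```typescript
// Reduced, hypothetical stand-in for MasterAgentFastIntentContext.
interface FastIntentContext {
  currentModel: string | null;
  currentBackend: string;
  availableModels: string[];
}

// A handler returns a reply when it recognizes the message, otherwise null.
type FastIntentHandler = (message: string, context: FastIntentContext) => string | null;

// Placeholder matching rules; real intent detection would be more tolerant.
const fastIntentHandlers: FastIntentHandler[] = [
  (message, context) =>
    message.indexOf("which models") !== -1
      ? "Available models: " + context.availableModels.join(", ")
      : null,
  (message, context) =>
    message.indexOf("current backend") !== -1
      ? "Current backend: " + context.currentBackend
      : null,
];

// Tries each handler in order; null means "not a fast-path question".
function tryHandleFastIntent(message: string, context: FastIntentContext): string | null {
  const normalized = message.trim().toLowerCase();
  for (const handler of fastIntentHandlers) {
    const reply = handler(normalized, context);
    if (reply !== null) return reply;
  }
  return null;
}

// The slow path runs only when no fast handler produced a reply.
function replyToUserMessage(
  message: string,
  context: FastIntentContext,
  slowPath: (message: string) => string,
): string {
  const fastReply = tryHandleFastIntent(message, context);
  return fastReply !== null ? fastReply : slowPath(message);
}
```

With this layout, supporting a new intent means appending one more handler to the list, leaving the entry function unchanged.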
+
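+### 3.2 Intents covered in the first batch
+
+- Model list query
+  - Example: “有哪些模型可以用” ("Which models can I use?")
+- Model switching
+  - Example: “把快模型切到 gpt5.4mini” ("Switch the fast model to gpt5.4mini")
+- Current model status query
+  - Example: “你现在是什么大模型” ("What model are you right now?")
+  - Example: “当前主模型是什么” ("What is the current main model?")
+  - Example: “快模型是什么” ("What is the fast model?")
+  - Example: “强模型是什么” ("What is the strong model?")
+- Current backend status query
+  - Example: “当前后端是什么” ("What is the current backend?")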
+
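+### 3.3 Natural-language normalization
+
+Model name parsing gains a lightweight normalization step:
+
+- Ignore `- / _ / space / .`
+- Accept `gpt5.4`, `gpt 5.4`, and `gpt_5_4`
+- Still falls back to mapping against the system's list of currently selectable models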
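A minimal sketch of this normalization, assuming that lowercasing is also applied (the rules above do not say so) and using the hypothetical names `normalizeModelName` and `resolveModelName`:

```typescript
// Strips "-", "_", whitespace, and "." so spelling variants compare equal.
// Lowercasing is an extra assumption, not one of the rules listed above.
function normalizeModelName(name: string): string {
  return name.toLowerCase().replace(/[-_\s.]/g, "");
}

// Maps user input onto the list of currently selectable models, or null if
// nothing in the list matches after normalization.
function resolveModelName(input: string, selectableModels: string[]): string | null {
  const wanted = normalizeModelName(input);
  for (const model of selectableModels) {
    if (normalizeModelName(model) === wanted) return model;
  }
  return null;
}
```

Because the match is always made against the selectable list, a loosely typed name can only resolve to a model the system actually offers. One consequence of dropping separators is that two listed models differing only in punctuation would collide, in which case this sketch returns the first one.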
+
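+## 4. Display rules
+
+The sender name on master agent fast-path replies changes to:
+
+- `主Agent·<current model name>`
+
+The title at the top of the Android master agent conversation page shows the same:
+
+- `主Agent·<current model name>`
+
+If no explicit model is available, fall back to: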
+
+- `主Agent`
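The naming rule above reduces to a small function. This is only a sketch: `masterAgentDisplayName` is a hypothetical name, and treating a blank string the same as a missing model is an assumption.

```typescript
// Builds the sender name / page title; falls back to the bare label when no
// explicit model is known. Blank strings are treated as "no model" (assumed).
function masterAgentDisplayName(currentModel: string | null | undefined): string {
  const model = (currentModel ?? "").trim();
  return model === "" ? "主Agent" : "主Agent·" + model;
}
```

Sharing one such rule between the server-side sender name and the Android title is what keeps the two displays in sync.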
+
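+## 5. Current benefits
+
+- Questions of the "check status / check configuration / change model" kind get an immediate answer
+- The master agent no longer enters the slow path for low-value queries
+- Users do not have to memorize standard phrasings
+- Adding more fast-path intents later only requires appending handlers to the router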
+
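+## 6. Suggested next extensions
+
+Questions that are good candidates for Fast Path in the next batch:
+
+- Currently bound device / online device queries
+- Current session / current thread run-status queries
+- GUI / CLI default execution mode queries
+- Current takeover switch status queries
+- Most recently active project / most recently active thread queries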
+
+## 7. Verification baseline
+
+This rollout requires at least the following checks to pass:
+
+- `npx tsx --test tests/master-agent-message-queue.test.ts tests/master-agent-chat-controls.test.ts`
+- `npm run build`
+- `./gradlew :app:compileDebugJavaWithJavac :app:assembleDebug`
+- Install on a physical device and verify the master agent name and model query behavior