ai-device/intelligent_cabin/archive/docs/current_system_flow.md

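# End-to-End Flow of the Current Project

## 1. Purpose of This Document

This document describes how the project actually runs at the code level. It is not a target-state blueprint; it records how this repository works today.

It covers:

- Service startup and runtime assembly
- API request entry points and response structure
- Session state management
- Input rewriting
- Local routing and multi-stage fusion
- Social chit-chat diversion
- Planner triggering and multi-step workflow generation
- Single-step, multi-step, conditional, and confirmation-gated execution
- Config-driven loading
- The technologies, models, thresholds, and branch conditions currently in use
- Main risks and current limits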

---

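## 2. System Positioning

The project is an execution-oriented Agent service for "intelligent cabin + customer service" scenarios. The backend is built on FastAPI, and its core characteristics are:

- Config-driven: intents, actions, response templates, forms, rules, and workflow templates all come from `config/`
- Local-first: requests go through `rewrite -> keyword/classifier/retrieval -> fusion` first
- The planner is not the default entry point: only complex, multi-intent, conditional, or low-confidence utterances trigger it
- Explicit session state: multi-turn state relies on `SessionState`, not on LLM memory
- Social chit-chat is diverted separately: greetings, thanks, capability questions, and open chit-chat go through `SocialRouter`
- Unified plugin abstraction: every intent ultimately maps to a `plugin_id`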

---

## 3. Overall Architecture Diagram

```mermaid
flowchart TD
    A[HTTP request /api/v1/agent/chat] --> B[AgentService.handle_chat]
    B --> C[SessionStore get_or_create]
    C --> D[Dialog Act update]
    D --> E{stop current task?}
    E -->|yes| F[stop task and return stopped]
    E -->|no| G{social chit-chat?}
    G -->|yes| H[SocialRouter + SocialResponder]
    G -->|no| I[ContextRewriteEngine rewrite]
    I --> J[Router.route]

    J --> J1[keyword matcher]
    J --> J2[classifier matcher]
    J --> J3[retrieval matcher]
    J1 --> J4[fusion decision]
    J2 --> J4
    J3 --> J4

    J4 --> K{planner needed?}
    K -->|no| L{fusion decision result}
    K -->|yes| M[WorkflowPlanner.plan]

    M --> M1[TemplateWorkflowPlanner]
    M --> M2[HeuristicWorkflowPlanner]
    M --> M3[DashScopeWorkflowPlanner optional]
    M1 --> N{planner accepted?}
    M2 --> N
    M3 --> N

    N -->|yes| O[build Workflow and execute]
    N -->|no| L

    L -->|execute| P[slot extraction]
    P --> Q{missing slots?}
    Q -->|yes| R[ask_slot]
    Q -->|no| S[execute single-step plugin]
    L -->|clarify| T[clarify]
    L -->|reject| U[reject]
    L -->|route_to_cloud| V[clarify / fallback / reject]

    O --> W{continue multi-step flow}
    W -->|missing slots| X[ask_slot]
    W -->|needs confirmation| Y[ask_confirmation]
    W -->|condition not met| Z[skip step]
    W -->|executable| AA[call plugin]
    AA --> AB[workflow_summary]
```

---

## 4. Startup and Assembly

### 4.1 Entry Point

The application entry point is `app/main.py`. At import time, the module does the following:

1. Initializes `FastAPI(title=settings.app_name)`
2. Initializes the demo runtime configuration
3. Calls `build_agent_service_with_runtime(...)`
4. Calls `build_intent_registry()`
5. Exposes `/health`, `/demo`, `/api/v1/agent/chat`, `/api/v1/agent/chat-stream`, and `/api/v1/agent/fill-slots`

### 4.2 Runtime Assembly

`build_agent_service_with_runtime()` assembles the full execution chain:

1. `ConfigLoader.load()` reads the configuration files
2. Build the `IntentRegistry`
3. Build the classifier
4. Build the multi-intent detector
5. Build the matcher pipeline and the router
6. Build the session store
7. Build the `ResponsePolicy`
8. Build the `DialogRuleEngine`
9. Build the `DialogActEngine`
10. Build the planner
11. Register the mock plugins
12. Build the `SocialRouter` and the `DashScopeSocialResponder`
13. Finally, instantiate `AgentService`

### 4.3 Startup Assembly Diagram

```mermaid
flowchart TD
    A[build_agent_service_with_runtime] --> B[ConfigLoader.load]
    B --> C[IntentRegistry]
    B --> D[Response templates]
    B --> E[Dialog rules]
    B --> F[Dialog acts]
    B --> G[Workflow templates]

    A --> H[build_classifier]
    H --> H1[MockIntentClassifier]
    H --> H2[BertIntentClassifier]
    H --> H3[RemoteIntentClassifier]

    A --> I[build_multi_intent_detector]
    I --> I1[BertMultiIntentDetector]

    A --> J[build_router]
    J --> J1[build_matcher_pipeline]
    J --> J2[HeuristicSlotExtractor]

    A --> K[build_session_store]
    K --> K1[InMemorySessionStore]
    K --> K2[RedisSessionStore]

    A --> L[build_planner]
    L --> L1[TemplateWorkflowPlanner]
    L --> L2[HeuristicWorkflowPlanner]
    L --> L3[DashScopeWorkflowPlanner]

    A --> M[MockPluginExecutor.register]
    A --> N[SocialRouter]
    A --> O[DashScopeSocialResponder]
    A --> P[AgentService]
```

---

## 5. Config-Driven Design

The runtime currently depends mainly on these configuration files:

- `config/domain.yml`
- `config/actions.yml`
- `config/forms.yml`
- `config/responses.yml`
- `config/rules.yml`
- `config/dialog_acts.yml`
- `config/workflows.yml`

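### 5.1 Responsibilities of Each Config File

`domain.yml`

- Defines the intent catalog
- Defines each `intent_id`
- Defines the `domain` each intent belongs to
- Defines label, keywords, and examples
- Points to an `action_id`

`actions.yml`

- Binds each `action_id` to a `plugin_id`
- Specifies action metadata such as risk level and description

`forms.yml`

- Defines the `required_slots` for each intent
- Defines the `ask_templates` used when a slot is missing

`responses.yml`

- Defines the system-wide response templates
- Includes ask / ack / reject / fallback / confirm templates

`rules.yml`

- Defines stop phrases
- Defines positive and negative confirmation words
- Defines which intents or risk levels require confirmation

`dialog_acts.yml`

- Defines `affirm / deny / cancel / modify / chitchat / request / inform`

`workflows.yml`

- Defines a small set of fixed workflow templates
- Currently contains a sequence template and a conditional template

### 5.2 Loading Flow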

```mermaid
flowchart LR
    A[ConfigLoader] --> B[domain.yml]
    A --> C[actions.yml]
    A --> D[forms.yml]
    A --> E[responses.yml]
    A --> F[rules.yml]
    A --> G[dialog_acts.yml]
    A --> H[workflows.yml]

    B --> I[DomainConfig]
    C --> J[ActionsConfig]
    D --> K[FormsConfig]
    E --> L[ResponsesConfig]
    F --> M[DialogRulesConfig]
    G --> N[DialogActsConfig]
    H --> O[WorkflowTemplatesConfig]

    I --> P[IntentRegistry]
    J --> P
    K --> P
    L --> Q[ResponsePolicy]
    M --> R[DialogRuleEngine]
    N --> S[DialogActEngine]
    O --> T[WorkflowPlanner]
```

---

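## 6. Request Entry Layer

### 6.1 `/api/v1/agent/chat`

This is the main synchronous endpoint:

1. Receives a `ChatRequest`
2. Calls `agent_service.handle_chat(request)` directly
3. Returns a `ChatResponse`

### 6.2 `/api/v1/agent/chat-stream`

This is the streaming endpoint:

1. Runs `agent_service.handle_chat` asynchronously in a thread pool
2. If the result arrives within 1 second, returns `final` directly
3. If the result does not arrive within 1 second:
   - Checks `_should_emit_processing_hint(text)` first
   - If the text hits a tool-type keyword, emits an `ack` first
   - Then waits for the final result
4. Output is sent as an NDJSON stream

Notes:

- The streaming `ack` is still a heuristic based on tokens in the input text
- It does not mean that a real plugin has started executing
- It is a feedback strategy of the HTTP streaming layer, not part of the workflow execution layer inside AgentService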
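
The wait-then-ack behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `stream_chat`, `TOOL_KEYWORDS`, and the event field names are hypothetical; only the 1-second budget, the text-based hint, and the NDJSON framing come from the description.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical keyword list; the real check is _should_emit_processing_hint.
TOOL_KEYWORDS = ("导航", "订单", "空调")


def should_emit_processing_hint(text: str) -> bool:
    """Text-only heuristic: it says nothing about whether a plugin has started."""
    return any(keyword in text for keyword in TOOL_KEYWORDS)


def stream_chat(handler, text: str, budget_seconds: float = 1.0):
    """Yield NDJSON lines: an optional early `ack`, then the `final` event."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(handler, text)
        try:
            result = future.result(timeout=budget_seconds)
        except FutureTimeout:
            # Budget exceeded: optionally acknowledge, then keep waiting.
            if should_emit_processing_hint(text):
                yield json.dumps({"event": "ack"}, ensure_ascii=False) + "\n"
            result = future.result()
        yield json.dumps({"event": "final", "data": result}, ensure_ascii=False) + "\n"
```

Because the `ack` decision looks only at the input text, a slow social reply would produce no `ack`, while a slow task-like request would, regardless of what the service is doing internally.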

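### 6.3 `/api/v1/agent/fill-slots`

This endpoint fills missing slots and resumes after a confirmation:

1. Looks up the existing session by `session_id`
2. If there is no valid session or no `current_intent`, returns fallback
3. If the session is in `waiting_confirmation`, handles the confirmation first
4. If the session holds a multi-step workflow, continues that workflow
5. Otherwise, extracts slots for the current intent and completes it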

---

## 7. Session Design

### 7.1 SessionState Fields

`SessionState` currently contains:

- `session_id`
- `user_id`
- `channel`
- `status`
- `current_intent`
- `pending_slots`
- `slots`
- `workflow`
- `routing_debug`
- `last_user_text`
- `last_agent_text`
- `context_memory`

### 7.2 Status Values

Common status values in the project:

- `idle`
- `understanding`
- `waiting_slot`
- `waiting_confirmation`
- `running`
- `completed`
- `rejected`
- `fallback`
- `stopped`
- `social`

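### 7.3 Session Backend

Two session backends are supported:

- `memory`
- `redis`

`memory`

- Stored in an in-process dictionary
- Lost on restart
- Suitable for local development

`redis`

- Stored as serialized JSON
- Keys carry a prefix and a TTL
- Suitable for cross-process use and long sessions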
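
To make the two backends concrete, here is a minimal sketch. It assumes a reduced `SessionState` (only a few of the fields from 7.1, with guessed types) and a hypothetical key prefix and TTL; the real store classes may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SessionState:
    """Subset of the fields listed in 7.1; the types are assumptions."""
    session_id: str
    status: str = "idle"
    current_intent: Optional[str] = None
    pending_slots: List[str] = field(default_factory=list)
    slots: Dict[str, Any] = field(default_factory=dict)


class InMemorySessionStore:
    """In-process dictionary; everything is lost when the process restarts."""

    def __init__(self) -> None:
        self._sessions: Dict[str, SessionState] = {}

    def get_or_create(self, session_id: str) -> SessionState:
        if session_id not in self._sessions:
            self._sessions[session_id] = SessionState(session_id)
        return self._sessions[session_id]

    def save(self, session: SessionState) -> None:
        self._sessions[session.session_id] = session


def redis_record(session: SessionState, prefix: str = "agent:session:", ttl_seconds: int = 1800):
    """Return the (key, JSON payload, TTL) a Redis-backed store might write.

    The prefix and TTL values here are placeholders, not the project's settings.
    """
    payload = json.dumps(asdict(session), ensure_ascii=False)
    return f"{prefix}{session.session_id}", payload, ttl_seconds
```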

### 7.4 Session State Diagram

```mermaid
stateDiagram-v2
    [*] --> idle
    idle --> understanding: regular task received
    idle --> social: social chit-chat received
    understanding --> waiting_slot: missing slots
    understanding --> ready_to_execute: all slots filled
    ready_to_execute --> completed: single-step plugin finished
    ready_to_execute --> running: multi-step workflow starts
    running --> waiting_slot: a step is missing slots
    running --> waiting_confirmation: a step needs confirmation
    waiting_slot --> running: user fills slots and flow continues
    waiting_confirmation --> running: user confirms and flow continues
    waiting_confirmation --> completed: user cancels the step and flow continues or ends
    understanding --> rejected: reject
    understanding --> fallback: fallback
    running --> stopped: user stops
    completed --> idle: new task next turn
```

---

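## 8. AgentService Main Flow

`AgentService.handle_chat()` is the core orchestrator of the whole service.

### 8.1 Main Chain Order

1. Read or create the session
2. Update the dialog act
3. Check whether this is a stop request
4. Check whether this is social chit-chat
5. Run rewrite
6. Route and match
7. Decide whether to invoke the planner
8. If the planner accepts, go straight to the multi-step workflow
9. If the planner does not accept, handle the request according to the fusion decision
10. If the decision is execute, extract slots and execute
11. Record the turn
12. Save the session
13. Fill in the latency breakdown

### 8.2 Main Flow Diagram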

```mermaid
flowchart TD
    A[handle_chat] --> B[get_or_create session]
    B --> C[update dialog act]
    C --> D{stop request?}
    D -->|yes| E[reset active task + return stopped]
    D -->|no| F{social turn?}
    F -->|yes| G[build social response]
    F -->|no| H[rewrite]
    H --> I[router.route]
    I --> J{should use planner?}
    J -->|yes| K[planner.plan]
    K --> L{accepted and has steps?}
    L -->|yes| M[start planned workflow]
    L -->|no| N[handle route decision]
    J -->|no| N
    N -->|has response| O[clarify / reject / route_to_cloud]
    N -->|none| P{intent found?}
    P -->|no| Q[fallback]
    P -->|yes| R[extract slots]
    R --> S[update session + context memory]
    S --> T[build response from session]
    T --> U[record turn + save session + finalize]
```

### 8.3 Timing Statistics

`ChatResponse` currently carries a breakdown of processing time. Typical fields include:

- `session_get_or_create_ms`
- `dialog_act_ms`
- `stop_check_ms`
- `social_route_ms`
- `rewrite_ms`
- `route_ms`
- `planner_ms`
- `decision_response_ms`
- `slot_extract_ms`
- `response_build_ms`
- `record_turn_ms`
- `session_save_ms`
- `match_pipeline_ms`
- `first_response_latency_ms`
- `total_latency_ms`

---

## 9. Dialog Act and the Stop Branch

### 9.1 Dialog Act

At the very start of a request the system calls `DialogActEngine.detect(text)`. The current acts are:

- `affirm`
- `deny`
- `cancel`
- `modify`
- `chitchat`
- `request`
- `inform`
- `unknown`

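The dialog act is currently used mainly to:

- Store conversational understanding context
- Provide an auxiliary signal for later confirmation, modification, and chit-chat handling

### 9.2 Stop Branch

Stop requests are detected by `DialogRuleEngine.is_stop_request()`. Typical phrases include:

- 不用了 ("no need")
- 算了 ("never mind")
- 先这样吧 ("leave it at that for now")
- 停一下 ("hold on")
- 停止 ("stop")
- 结束这次操作 ("end this operation")

A task is actually stopped only when both conditions hold:

1. The input hits a stop phrase
2. The session currently has an active task

When both hold, the service:

- Resets the active task
- Clears `pending_slots`
- Sets `workflow` to empty
- Returns `status=stopped`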
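
The two-condition stop check can be sketched as below. The phrase list is taken from the examples above; which statuses count as an "active task", substring matching, and clearing `current_intent` on reset are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

STOP_PHRASES = ("不用了", "算了", "先这样吧", "停一下", "停止", "结束这次操作")
# Assumption: these statuses are what "has an active task" means.
ACTIVE_STATUSES = {"waiting_slot", "waiting_confirmation", "running"}


@dataclass
class Session:
    status: str = "idle"
    current_intent: Optional[str] = None
    pending_slots: List[str] = field(default_factory=list)
    workflow: Optional[Dict[str, Any]] = None


def is_stop_request(text: str) -> bool:
    """Substring match against the stop phrases (matching strategy assumed)."""
    return any(phrase in text for phrase in STOP_PHRASES)


def try_stop(session: Session, text: str) -> bool:
    """Stop only if the text is a stop phrase AND a task is in progress."""
    if not is_stop_request(text) or session.status not in ACTIVE_STATUSES:
        return False
    session.current_intent = None   # reset active task (assumed detail)
    session.pending_slots.clear()   # clear pending_slots
    session.workflow = None         # empty the workflow
    session.status = "stopped"
    return True
```

Requiring an active task keeps a stray "算了" in an idle session from being answered with a stop message.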

---

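## 10. Social Chit-Chat Branch

### 10.1 Why Social Diversion Comes First

Social chit-chat and task requests have different goals:

- A task request must resolve to an intent / slot / workflow / plugin
- Social chit-chat only needs a natural reply and must not trigger task execution by mistake

### 10.2 SocialRouter Routing Logic

`SocialRouter.route(text, session)` works as follows:

1. Normalize the text
2. If it looks like a task request, return `category=none` immediately
3. If it hits a short social pattern, return `open_social`
4. If it hits the capability regex, return `open_social`
5. If it hits an open chit-chat phrase or regex, return `open_social`
6. If the previous turn was `open_social` and the current text is very short, treat it as a chit-chat continuation
7. Otherwise it is not social

### 10.3 Two Layers of the Social Branch

Layer 1: `SocialRouter`

- Only decides whether to take the social path

Layer 2: `DashScopeSocialResponder`

- Generates the actual natural-language reply
- Uses DashScope's OpenAI-compatible `/chat/completions`
- `temperature=0.6`
- `max_tokens=120`
- The system prompt requires replies to:
  - Be short
  - Be conversational
  - Not invent actions that were never executed
  - Not output JSON

### 10.4 Supplementary Logic for Social Replies

If the current session still has an unfinished task:

- `ResponsePolicy.pending_task_hint()` appends a reminder
- For example, that the system is still waiting for a confirmation, still waiting for a slot, or still running

A social reply is therefore not always pure chit-chat; it may carry a hint to resume the task.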

### 10.5 Social Flow Diagram

```mermaid
flowchart TD
    A[user input] --> B[SocialRouter.route]
    B --> C{looks_like_task?}
    C -->|yes| D[back to main task chain]
    C -->|no| E{social pattern hit?}
    E -->|no| D
    E -->|yes| F{social responder configured?}
    F -->|no| G[open_social_fallback]
    F -->|yes| H[DashScopeSocialResponder.reply]
    H --> I{empty reply or error?}
    I -->|yes| G
    I -->|no| J[natural reply]
    G --> K[append pending task hint]
    J --> K
    K --> L[return social/text response]
```

---

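## 11. Rewrite Flow

### 11.1 Role of Rewrite

`ContextRewriteEngine` is not a general-purpose LLM rewriter. It is a lightweight, local, rule-based restorer for short follow-up utterances.

Goals:

- Expand short utterances such as "再低一点" ("a bit lower"), "下一首" ("next song"), and "不要高速" ("avoid highways") into complete commands
- Let the next turn still hit the local fast path
- Avoid going to the planner or the cloud on every turn

### 11.2 Currently Supported Rewrite Scenarios

Air-conditioning follow-up

- The current intent is AC-related:
  - `cabin_set_ac`
  - `cabin_ac_on`
  - `cabin_ac_off`
  - `cabin_fan_up`
  - `cabin_fan_down`
- The input matches "高一点 / 低一点 / 调高一点 / 调低一点" ("a bit higher / a bit lower / turn it up a bit / turn it down a bit")
- The base value comes from `context_memory.last_temperature` or `session.slots.temperature`
- An explicit sentence is generated:
  - `把空调调到 21 度` ("set the AC to 21 degrees")

Music follow-up

- The current intent is `cabin_play_music`
- The input matches:
  - 再来一首 ("one more")
  - 换一首 ("change the song")
  - 下一首 ("next song")
- It is rewritten to:
  - `播放下一首歌` ("play the next song")

Navigation follow-up

- The current intent is `cabin_nav_to`
- The input contains "不要高速" ("avoid highways")
- The previous destination can be read from the session
- It is rewritten to:
  - `导航去 xxx，不要高速` ("navigate to xxx, avoid highways")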
### 11.3 Rewrite Branch Diagram

```mermaid
flowchart TD
    A[raw input] --> B{current_intent}
    B -->|AC-related| C[_rewrite_ac_adjustment]
    B -->|play music| D[_rewrite_music_followup]
    B -->|navigation| E[_rewrite_navigation_followup]
    B -->|other| F[no rewrite]
    C --> G{rule hit?}
    D --> G
    E --> G
    G -->|yes| H[output rewritten_text + metadata]
    G -->|no| F
```

---

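## 12. Router and Local Multi-Stage Matching

### 12.1 Router Composition

The current Router is `RuleBasedRouter`, which contains:

- `matcher`: `MultiStageIntentMatcher`
- `slot_extractor`: `HeuristicSlotExtractor`

### 12.2 Matcher Pipeline

Configurable matching stages:

- `keyword`
- `classifier`
- `retrieval`
- `llm` (placeholder, not implemented yet)

By default the project is meant to use:

- `keyword,classifier,retrieval`

### 12.3 Responsibilities of Each Stage

`KeywordIntentMatcher`

- Iterates over the keywords of every intent
- A hit scores 1.0
- The candidate reason is `keyword:xxx`

`ClassifierIntentMatcher`

- Calls `IntentClassifier.predict`
- Possible backends:
  - `mock`
  - `bert`
  - `remote`
- Outputs top-k candidates
- Also returns raw_label, raw_candidates, top_margin, and fallback_reason

`RetrievalIntentMatcher`

- Splits the input into tokens / n-grams
- Scores overlap against each intent's keywords + examples
- Score components:
  - overlap
  - coverage
  - keyword bonus

`LlmIntentMatcher`

- Currently a placeholder implementation
- Always returns not implemented

### 12.4 Fusion Scoring

Fusion aggregates the candidates from all stages:

- keyword weight: `1.15`
- classifier weight: `1.0`
- retrieval weight: `0.75`
- llm weight: `1.1`

Additional rules:

- If a stage is accepted and its selected_intent matches the candidate, an accepted bonus is added
- If the classifier went through fallback, the score is reduced
- The BERT classifier uses a different normalization rule
- Fusion finally produces ranked intents

### 12.5 Fusion Decision Thresholds

The core thresholds come from environment variables:

- `local_execute_threshold = 1.65`
- `local_route_to_cloud_threshold = 0.75`
- `local_clarify_margin_threshold = 0.12`
- `local_classifier_execute_score_threshold = 0.55`
- `local_classifier_execute_margin_threshold = 0.18`

### 12.6 Fusion Decision Logic

`execute`

Any one of:

- `top_score >= execute_threshold`
- At least two accepted stages support the intent, or there is strong symbolic support
- A pure BERT path whose score and margin are both high enough

`clarify`

All of:

- top_score lies between `route_to_cloud_threshold` and `execute_threshold`
- top_margin is below the clarify margin
- Multiple candidates are competing

`route_to_cloud`

All of:

- Local signals indicate the request "looks like a known intent"
- But they are not stable enough to execute

`reject`

All of:

- There is not enough local signal
- And the request is not considered to be within known capabilities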
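
A simplified sketch of weighted fusion and the threshold-based decision follows. It uses the weights and thresholds listed above but deliberately omits the accepted bonus, the fallback penalty, BERT-specific normalization, and the multi-stage support rule, so it approximates the idea rather than reproducing the project's logic. The function names and the input shape are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

STAGE_WEIGHTS = {"keyword": 1.15, "classifier": 1.0, "retrieval": 0.75, "llm": 1.1}
EXECUTE_THRESHOLD = 1.65
ROUTE_TO_CLOUD_THRESHOLD = 0.75
CLARIFY_MARGIN_THRESHOLD = 0.12

Candidates = Dict[str, List[Tuple[str, float]]]  # stage -> [(intent_id, score)]


def fuse(stage_candidates: Candidates) -> List[Tuple[str, float]]:
    """Sum weighted stage scores per intent and rank them, best first."""
    totals: Dict[str, float] = defaultdict(float)
    for stage, candidates in stage_candidates.items():
        for intent_id, score in candidates:
            totals[intent_id] += STAGE_WEIGHTS[stage] * score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def decide(ranked: List[Tuple[str, float]]) -> str:
    """Map ranked intents to execute / clarify / route_to_cloud / reject."""
    if not ranked:
        return "reject"
    top_score = ranked[0][1]
    margin = top_score - ranked[1][1] if len(ranked) > 1 else top_score
    if top_score >= EXECUTE_THRESHOLD:
        return "execute"
    if top_score >= ROUTE_TO_CLOUD_THRESHOLD:
        # Mid-confidence band: ask when two candidates are too close to call.
        if len(ranked) > 1 and margin < CLARIFY_MARGIN_THRESHOLD:
            return "clarify"
        return "route_to_cloud"
    return "reject"
```

Under these weights a keyword hit alone (1.15) stays below the execute threshold, so in this simplified model execution needs agreement from at least one more stage.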

### 12.7 Router Overall Flow Diagram

```mermaid
flowchart TD
    A[rewritten text] --> B[keyword]
    A --> C[classifier]
    A --> D[retrieval]
    B --> E[fusion]
    C --> E
    D --> E

    E --> F{top_score / margin / support}
    F -->|high confidence| G[decision=execute]
    F -->|close but ambiguous| H[decision=clarify]
    F -->|looks known but unstable| I[decision=route_to_cloud]
    F -->|unknown or too low| J[decision=reject]
```

### 12.8 Routing Debug

Every route call builds a `RoutingDebug` that includes:

- `selected_intent`
- `matched_stage`
- `decision`
- `decision_reason`
- `confidence_grade`
- `unknown_detected`
- `stages`
- `total_match_latency_ms`

Each stage in turn contains:

- accepted
- selected_intent
- score
- reason
- model_name
- backend
- fallback_used
- raw_label
- error_message
- metadata
- candidates

This is why the demo panel can show the matching process in detail.

---

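## 13. Classifier Technical Details

### 13.1 Supported Classifier Backends

`mock`

- A local rule-based token-overlap stub
- Used mainly for development or as a fallback

`bert`

- Uses `transformers.pipeline("text-classification")`
- The model directory comes from `AGENT_CLASSIFIER_MODEL_PATH`
- Supports a label map
- Supports warmup

`remote`

- Sends a request to a remote classification service
- Passes text, top_k, and labels
- Returns intent_id / score / raw candidates

### 13.2 BERT Classifier Behaviour

The local BERT classifier:

- Does not necessarily load the model at initialization, but `build_classifier()` triggers warmup according to the configuration
- Default warmup text: `打开车窗` ("open the window")
- Outputs top-k candidates
- Can fall back to mock when the score is below the threshold

### 13.3 Current Model-Related Settings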

- `classifier_backend`
- `classifier_model_path`
- `classifier_label_map_path`
- `classifier_top_k`
- `classifier_bert_threshold`
- `classifier_warmup_enabled`
- `classifier_warmup_text`

---

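## 14. Multi-Intent Detector Technical Details

### 14.1 Role of the Detector

The multi-label detector is not the planner's main entry point. It provides a prior signal inside the planner's second stage.

Its responsibilities are:

- Propose, at whole-sentence level, the intents that may co-occur
- Produce `detector_prior`
- Strengthen multi-intent parsing when fused with the clause classifier

### 14.2 Current Implementation

`BertMultiIntentDetector`

- Loads a separate multi-label model directory
- Runs `AutoTokenizer` + `AutoModelForSequenceClassification`
- Applies sigmoid to the logits
- Filters by threshold
- Masks out:
  - `__social__`
  - `__out_of_scope__`
- Outputs at most `max_labels` labels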
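
The post-processing steps (sigmoid, threshold, masking, cap) can be illustrated without loading a model. The sketch below works on plain logits; the function name and default values are assumptions, and the real detector obtains its logits from the loaded `AutoModelForSequenceClassification`.

```python
import math
from typing import List, Sequence, Tuple

BLOCKED_LABELS = {"__social__", "__out_of_scope__"}


def select_labels(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,   # placeholder default
    max_labels: int = 3,      # placeholder default
) -> List[Tuple[str, float]]:
    """Independent sigmoid per label, then threshold, mask, sort, and cap."""
    scored = [
        (label, 1.0 / (1.0 + math.exp(-logit)))
        for label, logit in zip(labels, logits)
    ]
    kept = [
        (label, prob)
        for label, prob in scored
        if prob >= threshold and label not in BLOCKED_LABELS
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_labels]
```

Sigmoid (rather than softmax) is what makes this multi-label: each intent gets its own probability, so several can pass the threshold for one sentence.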

### 14.3 Detector Settings

- `planner_multi_intent_detector_enabled`
- `planner_multi_intent_detector_model_path`
- `planner_multi_intent_detector_threshold`
- `planner_multi_intent_detector_top_k`
- `planner_multi_intent_detector_max_labels`

---

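## 15. Planner Trigger Mechanism

### 15.1 When the Planner Is Called

Current rules in `AgentService._should_use_planner()`:

The planner is invoked if any of the following holds:

1. The text contains a complex connective:
   - 然后 ("then")
   - 并且 ("and also")
   - 同时 ("at the same time")
   - 如果 ("if")
   - 若 ("if", formal)
   - 先 ("first")
   - 后 ("after")
2. The text contains an obvious separator:
   - `，`
   - `,`
   - `；`
   - `;`
3. The fusion stage did not end up accepted

This means:

- The planner is not called for every request
- A single, clear command usually does not reach the planner
- Complex, ambiguous, and low-confidence utterances do reach the planner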
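
The trigger rule reduces to three checks, sketched below. The function signature and the use of plain substring matching are assumptions; the connectives and separators are the ones listed above.

```python
COMPLEX_CONNECTIVES = ("然后", "并且", "同时", "如果", "若", "先", "后")
SEPARATORS = ("，", ",", "；", ";")


def should_use_planner(text: str, fusion_accepted: bool) -> bool:
    """Planner is used for connectives, separators, or an unaccepted fusion result."""
    if any(word in text for word in COMPLEX_CONNECTIVES):
        return True
    if any(mark in text for mark in SEPARATORS):
        return True
    return not fusion_accepted
```

If the real check is also a plain substring test, single-character connectives such as 后 would match inside ordinary words (for example 后视镜, "rear-view mirror"), which would send some simple commands to the planner. Whether this happens depends on the actual implementation.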

### 15.2 Planner Composition

`build_planner()` currently assembles the following:

Local layer:

- `TemplateWorkflowPlanner`
- `HeuristicWorkflowPlanner`

Cloud layer:

- `DashScopeWorkflowPlanner`

Final composition:

- If `planner_backend=heuristic`
  - Returns `Composite(local_template, local_heuristic)`
- If `planner_backend=dashscope`
  - Returns `Composite(local_first, dashscope)`

Key points:

- The order is always local-first
- The cloud planner is only a supplementary layer at the end
- When the cloud call fails, the result falls back to the local heuristic planner
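
The local-first composition can be sketched as a chain that returns the first accepted plan. `PlanResult`, `composite`, and the stub planners below are hypothetical stand-ins, not the project's classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanResult:
    accepted: bool
    backend: str
    steps: List[str] = field(default_factory=list)


Planner = Callable[[str], PlanResult]


def composite(*planners: Planner) -> Planner:
    """Try planners in order; return the first accepted, non-empty plan."""

    def plan(text: str) -> PlanResult:
        last = PlanResult(accepted=False, backend="none")
        for planner in planners:
            last = planner(text)
            if last.accepted and last.steps:
                return last
        return last  # nobody accepted: surface the last planner's result

    return plan
```

Placing the cloud planner last means it is only paid for when both local planners decline, which matches the local-first principle stated above.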

### 15.3 Planner Chain Diagram

```mermaid
flowchart TD
    A[planner.plan] --> B[TemplateWorkflowPlanner]
    B --> C{accepted?}
    C -->|yes| D[return template workflow]
    C -->|no| E[HeuristicWorkflowPlanner]
    E --> F{accepted?}
    F -->|yes| G[return local heuristic workflow]
    F -->|no| H{planner_backend=dashscope?}
    H -->|no| I[return not accepted]
    H -->|yes| J[DashScopeWorkflowPlanner]
    J --> K{cloud call succeeded?}
    K -->|yes| L[return cloud workflow]
    K -->|no| M[fall back to local heuristic result]
```

---

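## 16. TemplateWorkflowPlanner Details

### 16.1 Purpose

It gives priority to template hits for typical fixed combinations that have already been defined.

`workflows.yml` currently contains:

- `window_then_ac_sequence`
- `query_then_cancel_if_pending`

### 16.2 How It Works

1. Call `_analyze_multi_intent()`
2. Obtain the clause analysis result
3. If fewer than 2 intents are matched, do not accept
4. Compare against the workflow templates one by one
5. If a template hits, generate `PlannedStep` objects

### 16.3 Template Hit Conditions

Both must hold:

- `matched_ids[:len(intent_sequence)] == template.intent_sequence`
- If the template defines `trigger_keywords`, all of them must appear in the text

This is strict template matching, not fuzzy similarity matching.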
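
The strict match can be written as a short predicate. The function name and argument shapes are assumptions, and the intent sequence used in the usage note is a guess at what `window_then_ac_sequence` might contain.

```python
from typing import Sequence


def template_matches(
    matched_ids: Sequence[str],
    text: str,
    intent_sequence: Sequence[str],
    trigger_keywords: Sequence[str] = (),
) -> bool:
    """Exact prefix match on intent order, plus all trigger keywords present."""
    if len(matched_ids) < 2:
        return False  # templates only apply to multi-intent input
    prefix = list(matched_ids[: len(intent_sequence)])
    if prefix != list(intent_sequence):
        return False
    return all(keyword in text for keyword in trigger_keywords)
```

For example, with a hypothetical sequence `["cabin_window_open", "cabin_set_ac"]`, an utterance parsed in the reverse order does not hit the template, because the comparison is order-sensitive.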

---

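## 17. HeuristicWorkflowPlanner Details

### 17.1 Input Analysis

`_analyze_multi_intent()` does the following:

1. If a multi-label detector is configured, run multi-intent detection on the whole sentence first
2. Produce `detector_prior`
3. Split into clauses by connectives and punctuation
4. Fuse heuristic + classifier + detector signals for each clause
5. Aggregate the result into a `MultiIntentParseResult`

### 17.2 Clause Split Rules

The current split tokens are:

- 然后 ("then")
- 并且 ("and also")
- 同时 ("at the same time")
- 再 ("then / again")
- 顺便 ("while you're at it")
- 接着 ("next")
- 并 ("and")
- `,`
- `，`
- `；`
- `;`

### 17.3 Clause Scoring Signals

The heuristic score of each clause against each intent comes from:

- keyword hit
- example hit
- action hit
- object hit
- qualifier hit
- shared context hit
- explicit temperature
- explicit order_id

### 17.4 Clause Fusion

Final clause score = heuristic score + `model_score * classifier_weight` + detector bonus + consistency bonus

Additional rules:

- If the heuristics find nothing but the model score is high enough, `bert_bootstrap` is allowed
- If a clause looks like a coordinated compound clause, the planner also tries to extract multiple parallel candidates

### 17.5 Workflow Type Inference

- If the text contains a conditional pattern:
  - `如果` ("if")
  - `若` ("if", formal)
  - `还没` ("not yet")
  - `未发货` ("not shipped")
  - `没发货` ("hasn't shipped")
  - the type is inferred as `conditional`
- Otherwise, if there are multiple clauses, the type is usually inferred as `sequence`
- If fewer than 2 intents are matched, the type falls back to `single`

### 17.6 Conditional Flow Correction

`_apply_conditional_hints()` currently handles one specific case:

- `cs_query_order -> cs_cancel_order`
- And the text contains "还没发货 / 未发货 / 没发货" ("not shipped yet")

In that case it automatically adds:

- `depends_on=[query_step]`
- `condition.field=order_status`
- `condition.operator=equals`
- `condition.value=pending_shipment`
- `requires_confirmation=true`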

---

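## 18. DashScopeWorkflowPlanner Details

### 18.1 Invocation

The cloud planner uses DashScope's OpenAI-compatible `POST /chat/completions`.

Settings:

- `planner_base_url`
- `planner_api_key`
- `planner_model_name`
- `planner_timeout_seconds`

### 18.2 Prompt Design Goals

The cloud planner's system prompt requires the model to:

- Return strict JSON only
- Use only `intent_id` values that exist in the catalog
- Return one step for a single command
- Return sequence steps for multiple commands
- Return conditional steps for conditional commands
- Extract slots explicitly
- Mark high-risk actions with `requires_confirmation=true`

### 18.3 Result Normalization

After the cloud response arrives, the planner:

1. Extracts the content
2. Strips any markdown code fence
3. Parses the JSON
4. Validates that each intent is in the catalog
5. Merges cloud slots + clause slots + full_text slots
6. Normalizes depends_on and condition
7. If the text itself is a conditional sentence, adds the conditional hints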
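### 18.4 Cloud Planner Failure Branch

If any of the following occurs:

- Not configured
- HTTP request failure
- Timeout
- Empty content
- Invalid JSON

a fallback result is returned:

- The backend is recorded as `dashscope-fallback`
- The actual steps come from the local heuristic planner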

---

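## 19. Single-Step Execution Flow

When the planner has not taken over and the fusion decision is `execute`, the system enters single-step execution.

### 19.1 Single-Step Execution Order

1. If the recognized intent differs from the session's previous intent:
   - Clear `pending_slots`
   - Clear `slots`
   - Clear `workflow`
2. Update `session.current_intent`
3. Set `status=understanding`
4. Extract slots
5. Update `session.slots`
6. Update `context_memory`
7. Call `_build_response_from_session()`

### 19.2 Branches of `_build_response_from_session()`

If slots are missing:

- `status=waiting_slot`
- Return `ask_slot`
- Generate a single-step workflow that shows the missing fields

If all slots are filled:

- Call the plugin directly
- `status=completed`
- Return `workflow_result`

### 19.3 Single-Step Execution Diagram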

```mermaid
flowchart TD
    A[decision=execute] --> B[extract slots]
    B --> C[update session.slots]
    C --> D{pending slots?}
    D -->|yes| E[ask_slot + waiting_slot]
    D -->|no| F[plugins.execute]
    F --> G[workflow_result]
    G --> H[record turn + save session]
```

---

## 20. Multi-Step Workflow Execution Flow

### 20.1 Start

If the planner returns:

- `accepted=True`
- and non-empty `steps`

then `AgentService._start_planned_workflow()` will:

1. Call `_build_planned_workflow()`
2. Write the workflow into the session
3. Write routing_debug into the session
4. Call `_continue_planned_workflow()` immediately

### 20.2 WorkflowStep Structure

Each step contains:

- `step`
- `step_id`
- `intent_id`
- `plugin_id`
- `action`
- `status`
- `depends_on`
- `slots`
- `condition`
- `requires_confirmation`

### 20.3 Core Logic of `_continue_planned_workflow()`

The system iterates over the steps in order:

1. If the step is already completed or skipped, move on
2. If its dependencies are not completed, move on
3. Merge session slots with step slots
4. Apply intent-level slot normalization
5. Check whether the step is missing required slots
6. If slots are missing, pause the whole workflow and return `ask_slot`
7. If the step has a condition, evaluate it first
8. If the condition is not met, mark the step `skipped`
9. If the step needs confirmation and has not been confirmed yet, return `ask_confirmation`
10. Otherwise execute the plugin
11. Record the step result
12. Update session slots and context memory
13. Collect the message of each step
14. When all steps are done, generate a natural-language workflow summary

### 20.4 Multi-Step Execution Diagram

```mermaid
flowchart TD
    A[workflow ready] --> B[for step in steps]
    B --> C{completed or skipped?}
    C -->|yes| B
    C -->|no| D{depends_on satisfied?}
    D -->|no| B
    D -->|yes| E[merge slots]
    E --> F{missing slots?}
    F -->|yes| G[ask_slot + pause workflow]
    F -->|no| H{condition exists?}
    H -->|yes| I[evaluate condition]
    I -->|false| J[skip step]
    I -->|true/none| K{requires confirmation?}
    H -->|no| K
    K -->|yes| L[ask_confirmation + pause workflow]
    K -->|no| M[plugins.execute]
    M --> N[record step_results]
    N --> B
    B --> O[all steps done]
    O --> P[workflow_summary]
```

---

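## 21. Conditional Flow and Confirmation Flow

### 21.1 Conditional Flow

Conditional execution is currently implemented in the workflow layer, not in the plugin layer.

The mechanism is:

1. A step carries a `condition`
2. The condition describes:
   - `source_step`
   - `field`
   - `operator`
   - `value`
3. After the dependency step has run, `data` is read from `workflow.meta.step_results[source_step]`
4. The corresponding field is read
5. An equals / not_equals / in comparison is performed
6. If the condition is not met:
   - step.status = skipped
   - A skip message is appended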
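
A minimal sketch of the condition check follows. It assumes `step_results` is a dict keyed by step id whose values follow the plugin return shape (`success`, `message`, `data`); the function name is hypothetical.

```python
from typing import Any, Dict


def evaluate_condition(condition: Dict[str, Any], step_results: Dict[str, Dict[str, Any]]) -> bool:
    """Compare a field from the source step's `data` against the expected value."""
    source = step_results.get(condition["source_step"], {})
    actual = source.get("data", {}).get(condition["field"])
    operator = condition["operator"]
    expected = condition["value"]
    if operator == "equals":
        return actual == expected
    if operator == "not_equals":
        return actual != expected
    if operator == "in":
        return actual in expected
    raise ValueError(f"unsupported operator: {operator}")
```

For the query-then-cancel case, the cancel step would run only when the query step's `data.order_status` equals `pending_shipment`; any other status leaves the step skipped.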

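### 21.2 Current Confirmation Mechanism

Confirmation can be triggered from two sources:

1. The workflow step explicitly sets `requires_confirmation`
2. `DialogRuleEngine.requires_confirmation(intent_id, risk_level)` returns true

Actions that typically require confirmation today:

- `cs_cancel_order`
- Any action whose risk level is `high`

### 21.3 Confirmation Handling Flow

When the workflow is blocked at a confirmation step:

1. `status=waiting_confirmation`
2. `pending_slots=["confirmation"]`
3. `pending_confirmation` is written into session.workflow.meta

The user then replies through `fill-slots` or a follow-up chat turn:

- If the reply is a clear positive confirmation:
  - Add the step_id to `confirmed_steps`
  - Continue running the workflow
- If the reply is a clear negative confirmation:
  - Mark the step `skipped`
  - Continue with the rest of the workflow
- If the reply cannot be classified:
  - Return `confirm_retry`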
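
The three-way handling above might look like the sketch below. The word lists are placeholders (the real ones live in `rules.yml`), and the helper names and return values are assumptions made for illustration.

```python
from typing import Any, Dict

# Placeholder vocabularies; the project loads its own from rules.yml.
POSITIVE_WORDS = ("确认", "好的", "可以", "是的")
NEGATIVE_WORDS = ("取消", "不", "算了")


def parse_confirmation_decision(text: str) -> str:
    """Return 'affirm', 'deny', or 'unknown'. Negatives win so that 不可以 is a denial."""
    normalized = text.strip()
    if any(word in normalized for word in NEGATIVE_WORDS):
        return "deny"
    if any(word in normalized for word in POSITIVE_WORDS):
        return "affirm"
    return "unknown"


def apply_confirmation(workflow_meta: Dict[str, Any], step: Dict[str, Any], text: str) -> str:
    """Update workflow state and report whether to continue or ask again."""
    decision = parse_confirmation_decision(text)
    if decision == "affirm":
        workflow_meta.setdefault("confirmed_steps", []).append(step["step_id"])
        return "continue"
    if decision == "deny":
        step["status"] = "skipped"
        return "continue"
    return "confirm_retry"
```

Both a confirmation and a denial let the workflow move on; only an unclassifiable reply keeps the session in `waiting_confirmation`.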

### 21.4 Confirmation Sequence Diagram

```mermaid
sequenceDiagram
    participant U as User
    participant A as AgentService
    participant W as Workflow
    participant P as Plugin

    A->>W: current step requires_confirmation
    W-->>U: ask_confirmation
    U->>A: confirm / cancel / ambiguous reply
    A->>A: parse_confirmation_decision
    alt clear confirmation
        A->>W: add to confirmed_steps
        A->>P: execute the step
        P-->>A: success
        A-->>U: workflow_result
    else clear cancellation
        A->>W: mark step skipped
        A-->>U: summary of remaining flow
    else cannot be classified
        A-->>U: confirm_retry
    end
```

---

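## 22. Fill-Slots Resume Flow

`handle_fill_slots()` is an important entry point for state recovery and resumption.

### 22.1 Main Branches

1. Read the session
2. If there is no session or no current_intent, return fallback
3. Update the dialog act
4. Check for a stop request
5. If the session is in `waiting_confirmation`, try confirmation handling first
6. If this is a social turn, the social path may be taken first
7. If the session is still in `waiting_confirmation`, continue the confirmation flow
8. If the session already holds a non-single workflow:
   - Extract only the slots supplied in this turn
   - Continue with `_continue_planned_workflow()`
9. Otherwise:
   - Extract slots for the current single intent and continue with `_build_response_from_session()`

### 22.2 Why fill-slots Matters

It gives the system these capabilities:

- Resuming after a missing slot
- Resuming after a conditional confirmation
- Continuing a multi-step workflow after an interruption
- Avoiding a full re-run of planner recognition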

---

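## 23. Plugin Execution Layer

### 23.1 Abstraction

The plugin layer is managed uniformly by `PluginRegistry`:

- `register(plugin_id, handler)`
- `execute(plugin_id, slots)`

### 23.2 Current Plugin Implementation

What is actually wired in today is `MockPluginExecutor`, which registers a set of mock handlers, including:

Customer service: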

- `plugin.order.query`
- `plugin.logistics.query`
- `plugin.order.cancel`
- `plugin.service.transfer_human`

Cabin:

- `plugin.cabin.navigation`
- `plugin.cabin.navigation.cancel`
- `plugin.cabin.ac.on`
- `plugin.cabin.ac.off`
- `plugin.cabin.ac_control`
- `plugin.cabin.fan.up`
- `plugin.cabin.fan.down`
- `plugin.cabin.defog.front_on`
- `plugin.cabin.defog.rear_on`
- `plugin.cabin.window.open`
- `plugin.cabin.window.close`
- `plugin.cabin.sunroof.open`
- `plugin.cabin.sunroof.close`
- `plugin.cabin.doors.lock`
- `plugin.cabin.doors.unlock`
- `plugin.cabin.music_play`
- `plugin.cabin.music.pause`
- `plugin.cabin.music.next`
- `plugin.cabin.music.previous`
- `plugin.cabin.volume.up`
- `plugin.cabin.volume.down`
- `plugin.cabin.volume.mute`
- `plugin.cabin.lights.on`
- `plugin.cabin.lights.off`
- `plugin.cabin.seat_heat.on`
- `plugin.cabin.seat_heat.off`
- `plugin.cabin.mirror.fold`
- `plugin.cabin.mirror.unfold`
- `plugin.cabin.wiper.on`
- `plugin.cabin.wiper.off`

### 23.3 Plugin Return Structure

Every plugin returns a uniform dict:

- `success`
- `message`
- `data`

As a result:

- ResponsePolicy can generate natural replies in a uniform way
- Workflow conditions can read `data.field` in a uniform way
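
The registry abstraction and the uniform return dict can be sketched as follows. Only the `register` / `execute` method names and the three return keys come from the document; the handler body and the behaviour for an unregistered `plugin_id` are assumptions.

```python
from typing import Any, Callable, Dict

PluginResult = Dict[str, Any]
Handler = Callable[[Dict[str, Any]], PluginResult]


class PluginRegistry:
    """Maps plugin_id to a handler that takes slots and returns a uniform dict."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, plugin_id: str, handler: Handler) -> None:
        self._handlers[plugin_id] = handler

    def execute(self, plugin_id: str, slots: Dict[str, Any]) -> PluginResult:
        handler = self._handlers.get(plugin_id)
        if handler is None:
            # Assumed behaviour: report failure in the same shape instead of raising.
            return {"success": False, "message": f"unknown plugin: {plugin_id}", "data": {}}
        return handler(slots)


def mock_order_query(slots: Dict[str, Any]) -> PluginResult:
    """Mock handler: returns canned data, as the current mock executor would."""
    return {
        "success": True,
        "message": "order found",
        "data": {"order_id": slots.get("order_id"), "order_status": "pending_shipment"},
    }
```

Registering `mock_order_query` under `plugin.order.query` and calling `execute("plugin.order.query", {"order_id": "A123"})` yields a dict whose `data.order_status` a later conditional step can read.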

---

## 24. ResponsePolicy and Final Wording

### 24.1 ResponsePolicy Responsibilities

`ResponsePolicy` is the unified wording-strategy layer. It is responsible for:

- `ask_for_slot`
- `workflow_result`
- `workflow_summary`
- `ask_for_confirmation`
- `confirm_retry`
- `confirm_cancelled`
- `step_skipped`
- `ack`
- `reject`
- `short_social`
- `open_social_fallback`
- `with_pending_hint`
- `pending_task_hint`
- `task_stopped`
- `clarify`
- `fallback`

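### 24.2 What Is Special About workflow_summary

Multi-step results are not simply concatenated. They are aggregated into conversational, in-car style wording, for example:

- `车窗已经打开了` ("the window is open")
- `空调也调到 20 度了` ("the AC is also set to 20 degrees")
- `也开始播放 民谣 了` ("folk music has started playing too")

The final form looks like:

- `好，车窗已经打开了，空调也调到 20 度了。` ("OK, the window is open and the AC is set to 20 degrees.")

### 24.3 Difference Between clarify / reject / fallback

`clarify`

- The request is within known capabilities
- But the candidates are ambiguous

`reject`

- The request is currently judged to be beyond the capability boundary
- Or the planner explicitly reports out of scope

`fallback`

- There is no stable understanding
- And there are not enough candidates to ask a clarifying question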

---

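## 25. Debug and Observability

The project keeps detailed debug information at each stage:

- Every route stage keeps detailed debug data
- Rewrite carries applied / reason / metadata
- The planner appends a planner stage to routing_debug
- clause_analysis is placed in the planner metadata
- The raw top scores of multi_intent_detector are kept
- The processing breakdown carries per-stage timing

A single response can therefore usually be replayed to show:

1. The raw input
2. Whether it was rewritten
3. Which stages the route went through
4. Which candidates each stage produced
5. Why fusion chose execute / clarify / reject / route_to_cloud
6. Whether the planner was triggered
7. Why the planner accepted or rejected
8. What the workflow looks like
9. The execution result of each step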

---

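## 26. Core Capabilities Currently Supported

### 26.1 Customer Service Domain

- Query an order
- Query logistics
- Cancel an order
- Transfer to a human agent

### 26.2 Cabin Domain

- Navigate to a destination
- End navigation
- Turn the AC on/off
- Set the AC temperature
- Increase/decrease fan speed
- Front windshield defog
- Rear windshield defog
- Open/close windows
- Open/close the sunroof
- Lock/unlock doors
- Play/pause music
- Previous/next track
- Volume up/down
- Mute
- Turn lights on/off
- Turn seat heating on/off
- Fold/unfold mirrors
- Turn wipers on/off

### 26.3 Dialog Support Capabilities

- Greetings
- Thanks
- Goodbyes
- Capability questions
- Open chit-chat
- Multi-turn short-utterance rewriting
- Conditional execution
- High-risk confirmation
- Stopping the current task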

---

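## 27. Branch Matrix

### 27.1 Main Branches from Input to Output

| Input type | Pre-check | Path | Final output type |
| --- | --- | --- | --- |
| Stop phrase + active task | stop rule | reset task | `text/stopped` |
| Greeting / thanks / chit-chat | social route | social responder | `text/social` |
| Single command, high confidence | fusion execute | single-step execution | `workflow_result` |
| Single command, missing slots | execute + missing slots | ask slot | `ask_slot` |
| Multiple commands / conditional | planner accepted | workflow | `workflow_result / ask_slot / ask_confirmation` |
| Local ambiguity | fusion clarify | clarify | `clarify` |
| Looks like a known intent locally but unstable | route_to_cloud | clarify / fallback / reject | `clarify/fallback/reject` |
| Completely unknown | reject | reject | `reject` |
| fill-slots with a lost session or no current intent | invalid continuation | fallback | `fallback` |

### 27.2 Three Final Outcomes of route_to_cloud

When the fusion decision is `route_to_cloud`, the request is not necessarily executed in the cloud and returned. The current outcome depends on the planner stage and the available candidates:

1. The planner explicitly reports out-of-scope
   - Return `reject`
2. Usable candidate intents remain
   - Return `clarify`
3. No candidates and no explicit rejection
   - Return `fallback`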

---

## 28. Current Tech Stack and Model List

### 28.1 Base Framework

- Python
- FastAPI
- Pydantic Settings
- YAML / JSON configuration files

### 28.2 Local Understanding Layer

- keyword matcher
- heuristic retrieval matcher
- mock classifier
- local BERT classifier
- local multi-label BERT detector

### 28.3 Cloud Capability Layer

- DashScope OpenAI-compatible chat completion
- Cloud workflow planner
- Cloud social responder

### 28.4 State and Data Layer

- In-memory session
- Redis session

### 28.5 Execution Layer

- PluginRegistry
- MockPluginExecutor

### 28.6 Frontend and Debugging

- demo.html
- Runtime backend switching
- Routing debug panel
- Workflow JSON display

---

## 29. Sequence Diagrams for Real Request Samples

### 29.1 Single Command: 打开车窗 ("open the window")

```mermaid
sequenceDiagram
    participant U as User
    participant API as FastAPI
    participant S as AgentService
    participant R as Router
    participant P as Plugin

    U->>API: /chat 打开车窗
    API->>S: handle_chat
    S->>S: get/create session
    S->>S: rewrite(no change)
    S->>R: route
    R-->>S: decision=execute,intent=cabin_window_open
    S->>S: extract slots(empty)
    S->>P: plugin.cabin.window.open
    P-->>S: success + message
    S-->>API: workflow_result
    API-->>U: 好的，已打开车窗
```

### 29.2 Conditional Sentence: Query the Order and Cancel It If Not Shipped

```mermaid
sequenceDiagram
    participant U as User
    participant S as AgentService
    participant PL as Planner
    participant W as Workflow
    participant P as Plugin

    U->>S: 查订单A123，如果没发货就取消
    S->>PL: planner.plan
    PL-->>S: conditional workflow
    S->>W: build workflow
    W->>P: query_order
    P-->>W: data.order_status=pending_shipment
    W-->>U: ask_confirmation
    U->>S: 确认
    S->>W: continue workflow
    W->>P: cancel_order
    P-->>W: success
    W-->>U: workflow_summary
```

### 29.3 Multi-Turn Slot Filling: Where to Navigate

```mermaid
sequenceDiagram
    participant U as User
    participant S as AgentService
    participant P as Plugin

    U->>S: 导航
    S-->>U: ask_slot(请告诉我要去哪里)
    U->>S: 去公司
    S->>S: fill-slots
    S->>P: plugin.cabin.navigation(destination=公司)
    P-->>S: success
    S-->>U: workflow_result
```

---

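## 30. Current Boundaries and Limitations

### 30.1 Not Yet a Full Voice Pipeline

The project is currently a text-only Agent service:

- No built-in ASR
- No built-in TTS
- No real in-vehicle device control chain

### 30.2 Plugins Are Still Mocks

The plugin structure is in place, but execution results still come from mock handlers, not from real business systems.

### 30.3 LLM Matcher Not Implemented

`matcher_pipeline` accepts the stage name `llm`, but the implementation is still a placeholder.

### 30.4 Streaming ack Is Not Tied to Real Execution

The `ack` sent by `chat-stream` after 1 second still depends on tokens in the input text, not on a real plugin start event.

### 30.5 The Multi-Intent Detector Is Trained Separately but Still Needs More Data to Generalize

The multi-label detector is now a genuinely trained multi-label model, but it still needs more training data for colloquial long-tail scenarios.

### 30.6 NER / Token Classification Not Integrated

Action, object, and slot boundaries currently depend mainly on:

- heuristic slot extraction
- clause heuristic
- fusion of classifier and detector

There is no real token classification layer yet.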

---

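## 31. The Current Flow in One Summary

The way the project actually runs can be summarized as:

```text
FastAPI receives the request
-> AgentService reads the session
-> stop / social are handled first
-> rewrite completes short utterances
-> keyword/classifier/retrieval score in parallel across stages
-> fusion decides execute / clarify / reject / route_to_cloud
-> complex utterances trigger the local-first planner
-> a single/sequence/conditional workflow is generated
-> missing slots lead to ask_slot
-> high risk leads to ask_confirmation
-> PluginRegistry executes
-> ResponsePolicy generates the final natural reply
-> the session is persisted and full debug and latency metrics are returned
```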

---

## 32. Suggested Further Reading

To dig deeper, read the code in this order:

1. `app/main.py`
2. `app/core/bootstrap.py`
3. `app/services/agent_service.py`
4. `app/services/router.py`
5. `app/services/planner.py`
6. `app/services/classifier.py`
7. `app/services/multi_intent_detector.py`
8. `app/services/rewrite_engine.py`
9. `app/services/social.py`
10. `app/services/session_store.py`
11. `config/domain.yml`
12. `config/workflows.yml`